llama.cpp.git (branch master): log of commits touching CMakeLists.txt.
Each entry lists the commit date, commit message, and author.
2023-07-19  flake : update flake.nix (#2270)  (wzy)
2023-07-19  cmake : install targets (#2256)  (wzy)
2023-07-12  FP16 is supported in CM=6.0 (#2177)  (Howard Su)
2023-07-10  mpi : add support for distributed inference via MPI (#2099)  (Evan Miller)
2023-07-09  ggml : fix buidling with Intel MKL but ask for "cblas.h" issue (#2104) (#2115)  (clyang)
2023-07-05  Quantized dot products for CUDA mul mat vec (#2067)  (Johannes Gäßler)
2023-07-04  Simple webchat for server (#1998)  (Tobias Lütke)
2023-07-01  cmake : don't force -mcpu=native on aarch64 (#2063)  (Daniel Drake)
2023-06-26  k-quants : support for super-block size of 64 (#2001)  (Kawrakow)
2023-06-21  cmake: revert CUDA arch default to 52, 61 if f16 (#1959)  (Johannes Gäßler)
2023-06-19  cmake : fix trailing whitespaces  (Georgi Gerganov)
2023-06-19  cmake : fix build shared ggml when CUDA is enabled (#1929)  (Howard Su)
2023-06-19  Convert vector to f16 for dequantize mul mat vec (#1913)  (Johannes Gäßler)
2023-06-18  cmake : add CUDA_ARCHITECTURES to new target ggml_static (#1917)  (Howard Su)
2023-06-17  Allow cmake to build ggml as a library (#1896)  (Kerfuffle)
2023-06-16  cmake : add auto detection of BLAS_INCLUDE_DIRS (#1886)  (Zenix)
2023-06-16  CUDA : faster k-quant dot kernels (#1862)  (Kawrakow)
2023-06-15  cmake : remove whitespaces  (Georgi Gerganov)
2023-06-15  cmake : set include path for OpenBlas (#1830)  (Igor Okulist)
2023-06-10  cmake : fix Metal build (close #1791)  (Georgi Gerganov)
2023-06-10  metal : fix issue with ggml-metal.metal path. Closes #1769 (#1782)  (Andrei)
2023-06-08  k-quants : add missing compile definition to CMakeLists (#1748)  (johnson442)
2023-06-07  k-quants : allow to optionally disable at compile time (#1734)  (Georgi Gerganov)
2023-06-05  ggml : add SOTA 2,3,4,5,6 bit k-quantizations (#1684)  (Kawrakow)
2023-06-04  llama : Metal inference (#1642)  (Georgi Gerganov)
2023-05-27  [CI] Fix openblas (#1613)  (Henri Vasserman)
2023-05-26  cuda : performance optimizations (#1530)  (Johannes Gäßler)
2023-05-23  OpenCL Token Generation Acceleration (#1459)  (0cc4m)
2023-05-21  examples : add server example with REST API (#1443)  (Steward Garcia)
2023-05-20  feature : support blis and other blas implementation (#1536)  (Zenix)
2023-05-20  Revert "feature : add blis and other BLAS implementation support (#1502)"  (Georgi Gerganov)
2023-05-20  feature : add blis and other BLAS implementation support (#1502)  (Zenix)
2023-05-03  fix build-info.h for git submodules (#1289)  (kuvaus)
2023-05-02  ggml : fix ppc64le build error and make cmake detect Power processors (#1284)  (Marvin Gießing)
2023-05-01  Add git-based build information for better issue tracking (#1232)  (DannyDaemonic)
2023-04-30  build: add armv{6,7,8} support to cmake (#1251)  (Pavol Rusnak)
2023-04-29  build : fix reference to old llama_util.h  (Georgi Gerganov)
2023-04-28  ggml : add CLBlast support (#1164)  (0cc4m)
2023-04-22  ggml : fix Q4_3 cuBLAS  (Georgi Gerganov)
2023-04-22  cmake : fix build under Windows when enable BUILD_SHARED_LIBS (#1100)  (Howard Su)
2023-04-21  cmake : link threads publicly to ggml (#1042)  (源文雨)
2023-04-20  Improve cuBLAS performance by dequantizing on the GPU (#1065)  (slaren)
2023-04-19  ggml : Q4 cleanup - remove 4-bit dot product code (#1061)  (Stephan Walter)
2023-04-19  Add NVIDIA cuBLAS support (#1044)  (slaren)
2023-04-18  Adding a simple program to measure speed of dot products (#1041)  (Kawrakow)
2023-04-17  Speedup the AVX-512 implementation of ggml_vec_dot_q4_0() (#933)  (Ivan Komarov)
2023-04-15  cmake : add finding the OpenBLAS header file (#992)  (katsu560)
2023-04-13  llama : merge llama_internal.h into llama.h  (Georgi Gerganov)
2023-04-13  cmake : add explicit F16C option (x86) (#576)  (anzz1)
2023-04-10  Rewrite loading code to try to satisfy everyone:  (comex)
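
Many of the commits above add or toggle optional acceleration backends in CMakeLists.txt. As a sketch of how those options were enabled at configure time, assuming the flag names as they existed in this mid-2023 tree (later versions of llama.cpp renamed several of them, so verify against the CMakeLists.txt you actually have):

```shell
# Configure llama.cpp with one of the optional backends added in the
# commits above. Flag names reflect the mid-2023 CMakeLists.txt and
# are an assumption for current trees.

# OpenBLAS-backed BLAS build (cmake : set include path for OpenBlas, #1830)
cmake -B build -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS

# NVIDIA cuBLAS backend (Add NVIDIA cuBLAS support, #1044)
cmake -B build -DLLAMA_CUBLAS=ON

# Apple Metal backend (llama : Metal inference, #1642)
cmake -B build -DLLAMA_METAL=ON

# OpenCL via CLBlast (ggml : add CLBlast support, #1164)
cmake -B build -DLLAMA_CLBLAST=ON

# Distributed inference over MPI (mpi : add support for distributed
# inference via MPI, #2099)
cmake -B build -DLLAMA_MPI=ON

# Then build:
cmake --build build --config Release
```

Each `-DLLAMA_*` option maps to an `option(...)` declaration in CMakeLists.txt; the backends are mutually independent configure-time choices, which is why so many of the log entries touch only this one file.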