llama.cpp.git - llama.cpp

Age	Commit message	Author
2023-04-08	Add quantize-stats command for testing quantization (#728)	unbounded
2023-04-05	ggml : multi-thread ggml_rope() (~3-4 times faster on M1) (#781)	Georgi Gerganov
2023-04-05	ggml, llama : avoid heavy V transpose + improvements (#775)	Georgi Gerganov
2023-04-03	10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (#654)	SebastianApel
2023-04-02	ggml : change ne to int64_t (#626)	Marian Cepok
2023-03-31	Enable -std= for cmake builds, fix warnings (#598)	Stephan Walter
2023-03-31	Optimize AVX2 ggml_vec_dot_q4_0 (#642)	slaren
2023-03-31	Add AVX acceleration (#617)	perserk
2023-03-30	Ensure --mlock works properly with mmap() support	Justine Tunney
2023-03-30	Add mmap support for model files	Slaren
2023-03-30	Remove unused variable (#607)	Casey Primozic
2023-03-30	ggml : fix NEON signs (close #620, #622)	Georgi Gerganov
2023-03-30	Fix GGML_F32Cx8_STORE in AVX without F16C path (#619)	slaren
2023-03-29	ggml : init time on first ggml_init() call	Georgi Gerganov
2023-03-29	ggml : add ARM_NEON dequantize_row_q4_1()	Georgi Gerganov
2023-03-29	ggml : add ARM_NEON quantize_row_q4_1()	Georgi Gerganov
2023-03-29	ggml : add ARM_NEON ggml_vec_dot_q4_1()	Georgi Gerganov
2023-03-29	Fix GCC warning about binary literal (#595)	anzz1
2023-03-28	Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375)	anzz1
2023-03-28	ggml : add AVX2 implementation of quantize_row_q4_1 (#515)	slaren
2023-03-28	ggml : refactor quantized processing functions (#509)	Stephan Walter
2023-03-28	all : be more strict about converting float to double (#458)	Stephan Walter
2023-03-28	ggml : introduce structs for the q4 data blocks (#356)	Stephan Walter
2023-03-28	Fix usage of F16C intrinsics in AVX code (#563)	slaren
2023-03-26	Fix undefined variables in debug build, remove unused variables (#531)	Stephan Walter
2023-03-25	Add AVX2 implementation of dequantize_row_q4_1 (#505)	slaren
2023-03-25	Overhaul the examples structure	Georgi Gerganov
2023-03-25	Retire the ggml_mul_mat() branch for transposed src0 (#500)	Georgi Gerganov
2023-03-25	Add AVX2 implementation of dequantize_row_q4_0 (#467)	slaren
2023-03-25	Remove obsolete assert and fix compiler warning	Georgi Gerganov
2023-03-25	Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS	Georgi Gerganov
2023-03-24	Disable BLAS altogether - the bug is not just for qunatized mat mul	Georgi Gerganov
2023-03-24	Disable BLAS branch in mul_mat - seems there is a bug	Georgi Gerganov
2023-03-24	Reduce memory usage and allocate enough memory for largest context (#473)	Georgi Gerganov
2023-03-24	additional optimizations for POWER9 (#454)	Cameron Kaiser
2023-03-24	Support calling mlock() on loaded model data on Linux and macOS (#453)	comex
2023-03-22	Deduplicate q4 quantization functions (#383)	Stephan Walter
2023-03-22	fix: add POSIX functionality for Linux compilation (#51)	Valentyn Bezshapkin
2023-03-22	Introduce C-style API (#370)	Georgi Gerganov
2023-03-21	Add OpenBSD support (#314)	Kevin Lo
2023-03-21	Add initial AVX512 support for dot product on Linux (#320)	Casey Primozic
2023-03-19	Change RMSNorm eps to 1e-6 (#173)	Georgi Gerganov
2023-03-17	Don't tell users to use a bad number of threads (#243)	Stephan Walter
2023-03-17	Q4_1 quantization (#193)	Matvey Soloviev
2023-03-15	Fix RMS norm in GGML (#191)	Nebula
2023-03-16	Add RMS norm and use it (#187)	hoangmit
2023-03-15	inline -> static inline for "bytesFromNibbles" (#161)	hoangmit
2023-03-14	Don't use vdotq_s32 if it's not available (#139)	Ronsor
2023-03-13	Add NetBSD support. (#90)	Thomas Klausner
2023-03-13	Use vdotq_s32 to improve performance (#67)	Georgi Gerganov