path: root/ggml.c
Age  Commit message  Author
2023-04-05  ggml, llama : avoid heavy V transpose + improvements (#775)  Georgi Gerganov
2023-04-03  10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (#654)  SebastianApel
2023-04-02  ggml : change ne to int64_t (#626)  Marian Cepok
2023-03-31  Enable -std= for cmake builds, fix warnings (#598)  Stephan Walter
2023-03-31  Optimize AVX2 ggml_vec_dot_q4_0 (#642)  slaren
2023-03-31  Add AVX acceleration (#617)  perserk
2023-03-30  Ensure --mlock works properly with mmap() support  Justine Tunney
2023-03-30  Add mmap support for model files  Slaren
2023-03-30  Remove unused variable (#607)  Casey Primozic
2023-03-30  ggml : fix NEON signs (close #620, #622)  Georgi Gerganov
2023-03-30  Fix GGML_F32Cx8_STORE in AVX without F16C path (#619)  slaren
2023-03-29  ggml : init time on first ggml_init() call  Georgi Gerganov
2023-03-29  ggml : add ARM_NEON dequantize_row_q4_1()  Georgi Gerganov
2023-03-29  ggml : add ARM_NEON quantize_row_q4_1()  Georgi Gerganov
2023-03-29  ggml : add ARM_NEON ggml_vec_dot_q4_1()  Georgi Gerganov
2023-03-29  Fix GCC warning about binary literal (#595)  anzz1
2023-03-28  Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375)  anzz1
2023-03-28  ggml : add AVX2 implementation of quantize_row_q4_1 (#515)  slaren
2023-03-28  ggml : refactor quantized processing functions (#509)  Stephan Walter
2023-03-28  all : be more strict about converting float to double (#458)  Stephan Walter
2023-03-28  ggml : introduce structs for the q4 data blocks (#356)  Stephan Walter
2023-03-28  Fix usage of F16C intrinsics in AVX code (#563)  slaren
2023-03-26  Fix undefined variables in debug build, remove unused variables (#531)  Stephan Walter
2023-03-25  Add AVX2 implementation of dequantize_row_q4_1 (#505)  slaren
2023-03-25  Overhaul the examples structure  Georgi Gerganov
2023-03-25  Retire the ggml_mul_mat() branch for transposed src0 (#500)  Georgi Gerganov
2023-03-25  Add AVX2 implementation of dequantize_row_q4_0 (#467)  slaren
2023-03-25  Remove obsolete assert and fix compiler warning  Georgi Gerganov
2023-03-25  Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS  Georgi Gerganov
2023-03-24  Disable BLAS altogether - the bug is not just for qunatized mat mul  Georgi Gerganov
2023-03-24  Disable BLAS branch in mul_mat - seems there is a bug  Georgi Gerganov
2023-03-24  Reduce memory usage and allocate enough memory for largest context (#473)  Georgi Gerganov
2023-03-24  additional optimizations for POWER9 (#454)  Cameron Kaiser
2023-03-24  Support calling mlock() on loaded model data on Linux and macOS (#453)  comex
2023-03-22  Deduplicate q4 quantization functions (#383)  Stephan Walter
2023-03-22  fix: add POSIX functionality for Linux compilation (#51)  Valentyn Bezshapkin
2023-03-22  Introduce C-style API (#370)  Georgi Gerganov
2023-03-21  Add OpenBSD support (#314)  Kevin Lo
2023-03-21  Add initial AVX512 support for dot product on Linux (#320)  Casey Primozic
2023-03-19  Change RMSNorm eps to 1e-6 (#173)  Georgi Gerganov
2023-03-17  Don't tell users to use a bad number of threads (#243)  Stephan Walter
2023-03-17  Q4_1 quantization (#193)  Matvey Soloviev
2023-03-15  Fix RMS norm in GGML (#191)  Nebula
2023-03-16  Add RMS norm and use it (#187)  hoangmit
2023-03-15  inline -> static inline for "bytesFromNibbles" (#161)  hoangmit
2023-03-14  Don't use vdotq_s32 if it's not available (#139)  Ronsor
2023-03-13  Add NetBSD support. (#90)  Thomas Klausner
2023-03-13  Use vdotq_s32 to improve performance (#67)  Georgi Gerganov
2023-03-13  Revert "10% performance boost on ARM"  Georgi Gerganov
2023-03-13  Check for vdotq_s32 availability  Georgi Gerganov