path: root/ggml.c
Age | Commit message | Author
2023-04-19 | Add NVIDIA cuBLAS support (#1044) | slaren
2023-04-19 | Multi-threaded ggml_cpy (#1035) | slaren
2023-04-18 | ggml : add new Q4_2 quantization (ARM only) (#1046) | Georgi Gerganov
2023-04-18 | ggml : scratch that - vmlaq_n_f32 is always better | Georgi Gerganov
2023-04-18 | ggml : optimize ggml_vec_dot_q4_0_q8_0() using vectorized accumulators | Georgi Gerganov
2023-04-17 | Add LoRA support (#820) | slaren
2023-04-17 | ggml : avoid using ggml_fp16_to_fp32() and ggml_fp32_to_fp16() in ggml.c | Georgi Gerganov
2023-04-17 | Speedup the AVX-512 implementation of ggml_vec_dot_q4_0() (#933) | Ivan Komarov
2023-04-15 | Fix potential int8 overflow in non-SIMD vec_dot (#986) | Stephan Walter
2023-04-15 | Refactor ggml.c for future tensor types (#1001) | Stephan Walter
2023-04-15 | ggml : add Q8_0 quantization for intermediate results (#951) | Georgi Gerganov
2023-04-15 | ggml : use posix_memalign on non-Windows env | Georgi Gerganov
2023-04-14 | Expose type name from ggml (#970) | Pavol Rusnak
2023-04-14 | ggml : add unary and binary map operations (#874) | Kerfuffle
2023-04-14 | ggml : minor | Georgi Gerganov
2023-04-14 | ggml : always allocate buffers with size multiple of GGML_MEM_ALIGN | Georgi Gerganov
2023-04-14 | ggml : fix q4_1 dot product types | Georgi Gerganov
2023-04-14 | ggml : optimize rope function to avoid call powf in the tight loop (#807) | Howard Su
2023-04-13 | ggml : add GGML_DEFAULT_N_THREADS | Georgi Gerganov
2023-04-13 | ggml : speed-up ggml_vec_dot_q4_1() ARM_NEON + 32-bit ARM support (#900) | Georgi Gerganov
2023-04-13 | ggml : optimize non-SIMD Q4_0 vector dot product (#703) | Stephan Walter
2023-04-13 | ggml : introduce GGML_ALIGNED_MALLOC/GGML_ALIGNED_FREE macros (#884) | Pavol Rusnak
2023-04-13 | ggml : update cblas_sgemm columns var to be more reasonable (#838) | Vladimir
2023-04-11 | Fix whitespace, add .editorconfig, add GitHub workflow (#883) | Pavol Rusnak
2023-04-11 | Add enum llama_ftype, sync ggml_type to model files (#709) | Stephan Walter
2023-04-11 | Windows fixes (#890) | comex
2023-04-10 | ggml : fix WASM build | Georgi Gerganov
2023-04-10 | ggml : add ggml_cont() + optimize ggml_cpy() for contiguous dst | Georgi Gerganov
2023-04-10 | ggml : remove trailing whitespaces | Georgi Gerganov
2023-04-10 | Simplify to include lower-case windows.h always, fix compile on mingw32 (#747) | Marco Matthies
2023-04-10 | ggml : fix quantize_row_q4_1() ARM_NEON (close #876) | Georgi Gerganov
2023-04-10 | Rewrite loading code to try to satisfy everyone: | comex
2023-04-08 | Add quantize-stats command for testing quantization (#728) | unbounded
2023-04-05 | ggml : multi-thread ggml_rope() (~3-4 times faster on M1) (#781) | Georgi Gerganov
2023-04-05 | ggml, llama : avoid heavy V transpose + improvements (#775) | Georgi Gerganov
2023-04-03 | 10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (#654) | SebastianApel
2023-04-02 | ggml : change ne to int64_t (#626) | Marian Cepok
2023-03-31 | Enable -std= for cmake builds, fix warnings (#598) | Stephan Walter
2023-03-31 | Optimize AVX2 ggml_vec_dot_q4_0 (#642) | slaren
2023-03-31 | Add AVX acceleration (#617) | perserk
2023-03-30 | Ensure --mlock works properly with mmap() support | Justine Tunney
2023-03-30 | Add mmap support for model files | Slaren
2023-03-30 | Remove unused variable (#607) | Casey Primozic
2023-03-30 | ggml : fix NEON signs (close #620, #622) | Georgi Gerganov
2023-03-30 | Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) | slaren
2023-03-29 | ggml : init time on first ggml_init() call | Georgi Gerganov
2023-03-29 | ggml : add ARM_NEON dequantize_row_q4_1() | Georgi Gerganov
2023-03-29 | ggml : add ARM_NEON quantize_row_q4_1() | Georgi Gerganov
2023-03-29 | ggml : add ARM_NEON ggml_vec_dot_q4_1() | Georgi Gerganov
2023-03-29 | Fix GCC warning about binary literal (#595) | anzz1