Age | Commit message | Author |
2023-04-23 | A better `packNibbles` and `mul_sum_i8_pairs_float` implementation using AVX5... | Yishuo Wang |
2023-04-22 | ggml : fix Q4_3 cuBLAS | Georgi Gerganov |
2023-04-22 | ci : trigger CI for drafts, but not most PR actions (#1125) | Stephan Walter |
2023-04-22 | Fix CI: ARM NEON, quantization unit tests, editorconfig (#1122) | Stephan Walter |
2023-04-22 | ggml : unit test for quantization functions (#953) | unbounded |
2023-04-22 | llama : print timings on ctrl+c exit (#1021) | wbpxre150 |
2023-04-22 | llama : have n_batch default to 512 (#1091) | eiery |
2023-04-22 | cmake : fix build under Windows when enable BUILD_SHARED_LIBS (#1100) | Howard Su |
2023-04-22 | ggml : fix AVX build + update to new Q8_0 format | Georgi Gerganov |
2023-04-22 | ggml : alternative Q4_3 implementation using modified Q8_0 (#1109) | Georgi Gerganov |
2023-04-22 | ggml : AVX2 optimization for vec_dot_q4_3_q8_0 and refactoring (#1099) | Stephan Walter |
2023-04-22 | examples : Improve Alpaca Default Repeat Penalty: Better Match Alpaca.cpp Exp... | Clint Herron |
2023-04-22 | llama : add api for getting/setting the complete state: rng, logits, embeddin... | xaedes |
2023-04-21 | Improve cuBLAS performance by using a memory pool (#1094) | slaren |
2023-04-21 | llama : fixed rlimit error message (#888) | apaz |
2023-04-21 | cmake : link threads publicly to ggml (#1042) | 源文雨 |
2023-04-21 | main : evaluate tokens in batches after swapping context (#1014) | Alex Klinkhamer |
2023-04-21 | llama : remember and restore kv cache data pointers (#1104) | xaedes |
2023-04-21 | ggml : a faster version for Q4_1 x Q8_0 dot products (#1083) | Kawrakow |
2023-04-21 | Show perplexity ETA in hours and minutes (#1096) | slaren |
2023-04-21 | llama : fix comment for "output.weight" tensor | Georgi Gerganov |
2023-04-20 | Add ggml-model-*.bin checksums for 7B, 13B, 30B, 65B (#1088) | Stephan Walter |
2023-04-20 | ggml : sync ggml (add GPT-NeoX RoPE implementation) | Georgi Gerganov |
2023-04-20 | ggml : fix bug in ggml_compute_forward_dup_f32() | Georgi Gerganov |
2023-04-20 | Add Q4_3 support to cuBLAS (#1086) | slaren |
2023-04-20 | ggml : do not break cuBLAS build (Q4_3 is not yet implemented) | Georgi Gerganov |
2023-04-20 | ggml : fix Q4_3 quantization | Georgi Gerganov |
2023-04-20 | llama : multi-threaded quantization (#1075) | Kawrakow |
2023-04-20 | ggml : add Q4_3 quantization (#1082) | Georgi Gerganov |
2023-04-20 | ci : remove the LLAMA_ACCELERATE matrix dimension from Ubuntu builds in the C... | Ivan Komarov |
2023-04-20 | fix: LLAMA_CUBLAS=1 undefined reference 'shm_open' (#1080) | 源文雨 |
2023-04-20 | AVX2 optimization for vec_dot_q4_2_q8_0 (#1068) | Stephan Walter |
2023-04-20 | Improve cuBLAS performance by dequantizing on the GPU (#1065) | slaren |
2023-04-19 | Minor: Readme fixed grammar, spelling, and misc updates (#1071) | CRD716 |
2023-04-19 | Q4_2 quantization with rmse-optimized scale and quants (#1062) | Kawrakow |
2023-04-19 | ggml : use 8-bit precision for Q4_1 intermediate results (#1047) | Georgi Gerganov |
2023-04-19 | readme : add warning about Q4_2 and Q4_3 | Georgi Gerganov |
2023-04-19 | ggml : Q4 cleanup - remove 4-bit dot product code (#1061) | Stephan Walter |
2023-04-19 | Add NVIDIA cuBLAS support (#1044) | slaren |
2023-04-19 | Multi-threaded ggml_cpy (#1035) | slaren |
2023-04-18 | ggml : add new Q4_2 quantization (ARM only) (#1046) | Georgi Gerganov |
2023-04-18 | ggml : scratch that - vmlaq_n_f32 is always better | Georgi Gerganov |
2023-04-18 | gitignore : vdot | Georgi Gerganov |
2023-04-18 | ggml : optimize ggml_vec_dot_q4_0_q8_0() using vectorized accumulators | Georgi Gerganov |
2023-04-18 | Adding a simple program to measure speed of dot products (#1041) | Kawrakow |
2023-04-18 | readme : update hot topics about new LoRA functionality | Georgi Gerganov |
2023-04-18 | ci : do not run on drafts | Georgi Gerganov |
2023-04-18 | Do not close file after mmap (Windows version) (#1034) | Ivan Komarov |
2023-04-17 | readme : add Ruby bindings (#1029) | Atsushi Tatsuma |
2023-04-17 | add 4_0 to default outfile namestr dict (#1031) | Cameron |