Age | Commit message | Author
2023-06-17 | metal : add norm, cpy f16->f16, alibi kernels (#1823) | Aaron Miller
2023-06-17 | exposed modules so that they can be invoked by nix run github:ggerganov/llama... | Faez Shakil
2023-06-17 | Server Example Refactor and Improvements (#1570) | Randall Fitzgerald
2023-06-17 | hooks : setting up flake8 and pre-commit hooks (#1681) | Jiří Podivín
2023-06-17 | readme : alternative way to build for Android with CLBlast. (#1828) | Gustavo Rocha Dias
2023-06-17 | Allow cmake to build ggml as a library (#1896) | Kerfuffle
2023-06-17 | train : get raw text instead of page with html (#1905) | David Yang
2023-06-16 | opencl : support k-quants (#1836) | 0cc4m
2023-06-16 | examples : add "simple" (#1840) | SuperUserNameMan
2023-06-16 | cmake : add auto detection of BLAS_INCLUDE_DIRS (#1886) | Zenix
2023-06-16 | llama : fix embd when offloading non-repeating layers (#1891) | Johannes Gäßler
2023-06-16 | Fixed possible macro redefinition (#1892) | FrankHB
2023-06-16 | build : fix and ignore MSVC warnings (#1889) | Borislav Stanimirov
2023-06-16 | CUDA : faster k-quant dot kernels (#1862) | Kawrakow
2023-06-16 | gitignore : add several entries specific to Visual Studio (#1888) | Borislav Stanimirov
2023-06-15 | Fixed CUDA runtime version check (#1879) | Johannes Gäßler
2023-06-15 | cmake : remove whitespaces | Georgi Gerganov
2023-06-15 | examples : add chat-vicuna.sh (#1854) | yangli2
2023-06-15 | cmake : set include path for OpenBlas (#1830) | Igor Okulist
2023-06-15 | swift : Package compile breaks due to ggml-metal.metal (#1831) | Frederik Vogel
2023-06-15 | make : add train-text-from-scratch (#1850) | daboe01
2023-06-15 | readme : server compile flag (#1874) | Srinivas Billa
2023-06-15 | make : clean *.so files (#1857) | sandyiscool
2023-06-15 | Fix the validation of main device (#1872) | Howard Su
2023-06-15 | metal : parallel command buffer encoding (#1860) | Georgi Gerganov
2023-06-15 | Better error when using both LoRA + GPU layers (#1861) | Johannes Gäßler
2023-06-14 | CUDA full GPU acceleration, KV cache in VRAM (#1827) | Johannes Gäßler
2023-06-13 | baby-llama : fix operator!= (#1821) | 0xspringtime
2023-06-13 | train : improved training-from-scratch example (#1652) | xaedes
2023-06-13 | llama : do a warm-up eval at start for better timings (#1824) | Georgi Gerganov
2023-06-13 | Allow "quantizing" to f16 and f32 (#1787) | Kerfuffle
2023-06-12 | Metal implementation for all k_quants (#1807) | Kawrakow
2023-06-12 | ci : run when changing only the CUDA sources (#1800) | slaren
2023-06-12 | Leverage mmap for offloading tensors to GPU (#1597) | Howard Su
2023-06-12 | metal : fix failure to load model (#1817) | Kawrakow
2023-06-11 | Fix issue where interactive mode crashes when input exceeds ctx size (#1789) | Kerfuffle
2023-06-11 | Fixed WSL cuda's OOM error (#1594) | Kyle Liang
2023-06-11 | Update SHA256SUMS with current hashes for models quantized using q4_0 (#1798) | Ryan Landay
2023-06-10 | cmake : fix Metal build (close #1791) | Georgi Gerganov
2023-06-10 | k-quants : GCC12 compilation fix (#1792) | Artyom Lebedev
2023-06-10 | metal : fix issue with ggml-metal.metal path. Closes #1769 (#1782) | Andrei
2023-06-10 | doc : fix wrong address of BLIS.md (#1772) | Aisuko
2023-06-10 | ggml : force no_alloc == false when creating opt tensors (close #1699) | Georgi Gerganov
2023-06-10 | metal : add Q4_1 implementation (#1785) | Kawrakow
2023-06-10 | llama : support requantizing models instead of only allowing quantization fro... | Kerfuffle
2023-06-10 | ggml : workaround for missing _mm256_setr_m128i in GCC < 8 (#1638) | Xingchen Song(宋星辰)
2023-06-10 | make : add SSSE3 compilation use case (#1659) | rankaiyx
2023-06-09 | OpenCL: Add release memory (#1741) | Robert Sung-wook Shin
2023-06-09 | Windows nvcc workaround (#1753) | Johannes Gäßler
2023-06-09 | metal : fix build "tanhf" -> "tanh" | Georgi Gerganov