Age | Commit message | Author
2023-06-20 | readme : add link to p1 | Georgi Gerganov
2023-06-20 | Fix typo (#1949) | Xiake Sun
2023-06-20 | llama : fix params struct slignment (#1936) | Ettore Di Giacinto
2023-06-20 | [Fix] Reenable server embedding endpoint (#1937) | Henri Vasserman
2023-06-19 | ggml : fix bug in LBFGS optimizer (found by ggml tests) | Georgi Gerganov
2023-06-19 | llama : use aligned memory during ggml_init call from loading saved sessions ... | l3utterfly
2023-06-19 | cmake : fix trailing whitespaces | Georgi Gerganov
2023-06-19 | llama : only use Q6_K for output weights if tensor size is multiple of 256 (#... | Kawrakow
2023-06-19 | cuda : faster k-quants on older GPUs (#1930) | Kawrakow
2023-06-19 | ggml : sync latest ggml repo (#1924) | Georgi Gerganov
2023-06-19 | cmake : fix build shared ggml when CUDA is enabled (#1929) | Howard Su
2023-06-19 | Convert vector to f16 for dequantize mul mat vec (#1913) | Johannes Gäßler
2023-06-18 | Added tokens per second to info prints (#1928) | Johannes Gäßler
2023-06-18 | Fixed incorrectly applying RMS norm twice (#1925) | Johannes Gäßler
2023-06-18 | ggml : fix bug in ggml_compute_forward_add_q_f32 (#1918) | l3utterfly
2023-06-18 | readme : update Android build instructions (#1922) | Mike
2023-06-18 | llama : prevent usage of k-quants when tensor size is not a multiple of 256 (... | Kawrakow
2023-06-18 | examples : fix examples/metal (#1920) | Kawrakow
2023-06-18 | metal : handle buffers larger than device's maxBufferLength (#1826) | Georgi Gerganov
2023-06-18 | cmake : add CUDA_ARCHITECTURES to new target ggml_static (#1917) | Howard Su
2023-06-17 | make : do not print help for simple example | Georgi Gerganov
2023-06-17 | minor : warning fixes | Georgi Gerganov
2023-06-17 | Only one CUDA stream per device for async compute (#1898) | Johannes Gäßler
2023-06-17 | llama : fix kv_cache `n` init (close #1903) | Georgi Gerganov
2023-06-17 | make : update for latest Arch (#1701) | DaniAndTheWeb
2023-06-17 | ggml : fix warnings under MSVC (#1908) | Howard Su
2023-06-17 | metal : add norm, cpy f16->f16, alibi kernels (#1823) | Aaron Miller
2023-06-17 | exposed modules so that they can be invoked by nix run github:ggerganov/llama... | Faez Shakil
2023-06-17 | Server Example Refactor and Improvements (#1570) | Randall Fitzgerald
2023-06-17 | hooks : setting up flake8 and pre-commit hooks (#1681) | Jiří Podivín
2023-06-17 | readme : alternative way to build for Android with CLBlast. (#1828) | Gustavo Rocha Dias
2023-06-17 | Allow cmake to build ggml as a library (#1896) | Kerfuffle
2023-06-17 | train : get raw text instead of page with html (#1905) | David Yang
2023-06-16 | opencl : support k-quants (#1836) | 0cc4m
2023-06-16 | examples : add "simple" (#1840) | SuperUserNameMan
2023-06-16 | cmake : add auto detection of BLAS_INCLUDE_DIRS (#1886) | Zenix
2023-06-16 | llama : fix embd when offloading non-repeating layers (#1891) | Johannes Gäßler
2023-06-16 | Fixed possible macro redefinition (#1892) | FrankHB
2023-06-16 | build : fix and ignore MSVC warnings (#1889) | Borislav Stanimirov
2023-06-16 | CUDA : faster k-quant dot kernels (#1862) | Kawrakow
2023-06-16 | gitignore : add several entries specific to Visual Studio (#1888) | Borislav Stanimirov
2023-06-15 | Fixed CUDA runtime version check (#1879) | Johannes Gäßler
2023-06-15 | cmake : remove whitespaces | Georgi Gerganov
2023-06-15 | examples : add chat-vicuna.sh (#1854) | yangli2
2023-06-15 | cmake : set include path for OpenBlas (#1830) | Igor Okulist
2023-06-15 | swift : Package compile breaks due to ggml-metal.metal (#1831) | Frederik Vogel
2023-06-15 | make : add train-text-from-scratch (#1850) | daboe01
2023-06-15 | readme : server compile flag (#1874) | Srinivas Billa
2023-06-15 | make : clean *.so files (#1857) | sandyiscool
2023-06-15 | Fix the validation of main device (#1872) | Howard Su