path: root/llama.cpp
Age         Commit message                                                                Author
2023-03-29  llama : fix compile warnings when reading the vocab                           Georgi Gerganov
2023-03-29  llama : use the same threshold for OpenBLAS and ggml thread limiting (#577)   Maël Kerbiriou
2023-03-28  py : add temporary script to convert old ggml files to newer version (#539)   thement
2023-03-28  all : be more strict about converting float to double (#458)                  Stephan Walter
2023-03-28  ggml : introduce structs for the q4 data blocks (#356)                        Stephan Walter
2023-03-25  Cleanup STL headers + fix embedding examples + minor stuff                    Georgi Gerganov
2023-03-25  Don't interfere with BLAS for large prompts by running only 1 thread          Georgi Gerganov
2023-03-25  Add timings for the prompt evaluation (#478)                                  slaren
2023-03-25  Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS         Georgi Gerganov
2023-03-25  Add support for file load progress reporting callbacks (#434)                 Jed Fox
2023-03-25  Fix crash for 65B model with pre-allocated memory (#485)                      Chris Kuehl
2023-03-24  Reduce memory usage and allocate enough memory for largest context (#473)     Georgi Gerganov
2023-03-24  Temporarily bump the memory buffer size - hopefully fix issues from 483bab2e  Georgi Gerganov
2023-03-24  Properly free llama_context on failure                                        Georgi Gerganov
2023-03-24  Support calling mlock() on loaded model data on Linux and macOS (#453)        comex
2023-03-24  Add embedding mode with arg flag. Currently working (#282)                    Luciano
2023-03-24  Revert "Fix memory allocation issues and seg faults"                          Georgi Gerganov
2023-03-24  Fix memory allocation issues and seg faults                                   Georgi Gerganov
2023-03-23  Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439)   Georgi Gerganov
2023-03-22  Add missing header for memcpy (#386)                                          Yusuf Kağan Hanoğlu
2023-03-22  Init llama_context_params properly from CLI (#370)                            Georgi Gerganov
2023-03-22  Introduce C-style API (#370)                                                  Georgi Gerganov