index : llama.cpp.git (master)
path: root/llama.cpp
Age
Commit message
Author
2023-04-11
Windows fixes (#890)
comex
2023-04-10
Print model version.
comex
2023-04-10
Rewrite loading code to try to satisfy everyone:
comex
2023-04-08
Add quantize-stats command for testing quantization (#728)
unbounded
2023-04-07
llama : always sort logits before nucleus sampling (#812)
Ivan Stepanov
2023-04-05
ggml, llama : avoid heavy V transpose + improvements (#775)
Georgi Gerganov
2023-04-05
llama : define non-positive top_k; top_k range check (#779)
Ivan Stepanov
2023-04-03
Define non-positive temperature behavior (#720)
Ivan Stepanov
2023-04-02
Added api for getting/setting the kv_cache (#685)
Christian Falch
2023-04-02
ggml : change ne to int64_t (#626)
Marian Cepok
2023-04-02
llama : do not allocate KV cache for "vocab_only == true" (#682)
Stephan Walter
2023-03-30
Introduce GGML migration tool for new file format
Justine Tunney
2023-03-30
Ensure --mlock works properly with mmap() support
Justine Tunney
2023-03-30
Make loading weights 10-100x faster
Justine Tunney
2023-03-30
Initial windows support (untested)
Slaren
2023-03-30
Always initialize mm_addr and mm_length in llama_model
Slaren
2023-03-30
Unmap the file in llama_free
Slaren
2023-03-30
Make mmap_file static
Slaren
2023-03-30
Fix ggml_init_params in quantize
Slaren
2023-03-30
Add mmap support for model files
Slaren
2023-03-29
llama : fix compile warnings when reading the vocab
Georgi Gerganov
2023-03-29
llama : use the same threshold for OpenBLAS and ggml thread limiting (#577)
Maël Kerbiriou
2023-03-28
py : add temporary script to convert old ggml files to newer version (#539)
thement
2023-03-28
all : be more strict about converting float to double (#458)
Stephan Walter
2023-03-28
ggml : introduce structs for the q4 data blocks (#356)
Stephan Walter
2023-03-25
Cleanup STL headers + fix embedding examples + minor stuff
Georgi Gerganov
2023-03-25
Don't interefe with BLAS for large prompts by running only 1 thread
Georgi Gerganov
2023-03-25
Add timings for the prompt evaluation (#478)
slaren
2023-03-25
Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS
Georgi Gerganov
2023-03-25
Add support for file load progress reporting callbacks (#434)
Jed Fox
2023-03-25
Fix crash for 65B model with pre-allocated memory (#485)
Chris Kuehl
2023-03-24
Reduce memory usage and allocate enough memory for largest context (#473)
Georgi Gerganov
2023-03-24
Temporary bump the memory buffer size - hopefully fix issues from 483bab2e
Georgi Gerganov
2023-03-24
Properly free llama_context on failure
Georgi Gerganov
2023-03-24
Support calling mlock() on loaded model data on Linux and macOS (#453)
comex
2023-03-24
Add embedding mode with arg flag. Currently working (#282)
Luciano
2023-03-24
Revert "Fix memory allocation issues and seg faults"
Georgi Gerganov
2023-03-24
Fix memory allocation issues and seg faults
Georgi Gerganov
2023-03-23
Avoid the transposed X branch in the Z = X * Y matrix multiplication (#439)
Georgi Gerganov
2023-03-22
Add missing header for memcpy (#386)
Yusuf Kağan Hanoğlu
2023-03-22
Init llama_context_params properly from CLI (#370)
Georgi Gerganov
2023-03-22
Introduce C-style API (#370)
Georgi Gerganov