llama.cpp.git, branch master: commit log for path /Makefile
| Age | Commit message | Author |
|---|---|---|
| 2023-08-09 | CUDA: tuned mul_mat_q kernels (#2546) | Johannes Gäßler |
| 2023-08-08 | Allow passing grammar to completion endpoint (#2532) | Martin Krasser |
| 2023-08-07 | [Makefile] Move ARM CFLAGS before compilation (#2536) | GiviMAD |
| 2023-08-04 | Add --simple-io option for subprocesses and break out console.h and cpp (#1558) | DannyDaemonic |
| 2023-08-02 | tests : Fix compilation warnings (Linux/GCC) (#2451) | Eve |
| 2023-07-31 | CUDA: fixed LLAMA_FAST compilation option (#2473) | Johannes Gäßler |
| 2023-07-31 | CUDA: mmq CLI option, fixed mmq build issues (#2453) | Johannes Gäßler |
| 2023-07-30 | ggml : add graph tensor allocator (#2411) | slaren |
| 2023-07-29 | CUDA: Quantized matrix matrix multiplication (#2160) | Johannes Gäßler |
| 2023-07-26 | make : build with -Wmissing-prototypes (#2394) | Cebtenzzre |
| 2023-07-24 | Chat UI extras (#2366) | Aarni Koskela |
| 2023-07-23 | llama : add grammar-based sampling (#1773) | Evan Jones |
| 2023-07-23 | make : fix CLBLAST compile support in FreeBSD (#2331) | Jose Maldonado |
| 2023-07-21 | gitignore : changes for Poetry users + chat examples (#2284) | Jose Maldonado |
| 2023-07-21 | make : fix indentation | Georgi Gerganov |
| 2023-07-21 | make : support customized LLAMA_CUDA_NVCC and LLAMA_CUDA_CCBIN (#2275) | Sky Yan |
| 2023-07-21 | make : add new target for test binaries (#2244) | Jiří Podivín |
| 2023-07-21 | make : fix embdinput library and server examples building on MSYS2 (#2235) | Przemysław Pawełczyk |
| 2023-07-14 | make : use pkg-config for OpenBLAS (#2222) | wzy |
| 2023-07-14 | make : fix combination of LLAMA_METAL and LLAMA_MPI (#2208) | James Reynolds |
| 2023-07-10 | mpi : add support for distributed inference via MPI (#2099) | Evan Miller |
| 2023-07-07 | docker : add support for CUDA in docker (#1461) | dylan |
| 2023-07-05 | Quantized dot products for CUDA mul mat vec (#2067) | Johannes Gäßler |
| 2023-07-04 | Allow old Make to build server. (#2098) | Henri Vasserman |
| 2023-07-04 | Update Makefile: clean simple (#2097) | ZhouYuChen |
| 2023-06-28 | llama : support input embeddings directly (#1910) | ningshanwutuobang |
| 2023-06-26 | k-quants : support for super-block size of 64 (#2001) | Kawrakow |
| 2023-06-19 | Convert vector to f16 for dequantize mul mat vec (#1913) | Johannes Gäßler |
| 2023-06-18 | metal : handle buffers larger than device's maxBufferLength (#1826) | Georgi Gerganov |
| 2023-06-17 | make : do not print help for simple example | Georgi Gerganov |
| 2023-06-17 | make : update for latest Arch (#1701) | DaniAndTheWeb |
| 2023-06-17 | Server Example Refactor and Improvements (#1570) | Randall Fitzgerald |
| 2023-06-16 | examples : add "simple" (#1840) | SuperUserNameMan |
| 2023-06-16 | CUDA : faster k-quant dot kernels (#1862) | Kawrakow |
| 2023-06-15 | make : add train-text-from-scratch (#1850) | daboe01 |
| 2023-06-15 | make : clean *.so files (#1857) | sandyiscool |
| 2023-06-13 | Allow "quantizing" to f16 and f32 (#1787) | Kerfuffle |
| 2023-06-10 | make : add SSSE3 compilation use case (#1659) | rankaiyx |
| 2023-06-07 | k-quants : allow to optionally disable at compile time (#1734) | Georgi Gerganov |
| 2023-06-06 | ggml : fix builds, add ggml-quants-k.o (close #1712, close #1710) | Georgi Gerganov |
| 2023-06-05 | ggml : add SOTA 2,3,4,5,6 bit k-quantizations (#1684) | Kawrakow |
| 2023-06-04 | llama : Metal inference (#1642) | Georgi Gerganov |
| 2023-05-28 | LLAMA_DEBUG adds debug symbols (#1617) | Johannes Gäßler |
| 2023-05-27 | Include server in releases + other build system cleanups (#1610) | Kerfuffle |
| 2023-05-26 | cuda : performance optimizations (#1530) | Johannes Gäßler |
| 2023-05-23 | OpenCL Token Generation Acceleration (#1459) | 0cc4m |
| 2023-05-21 | make : .PHONY clean (#1553) | Stefan Sydow |
| 2023-05-20 | feature : support blis and other blas implementation (#1536) | Zenix |
| 2023-05-20 | Revert "feature : add blis and other BLAS implementation support (#1502)" | Georgi Gerganov |
| 2023-05-20 | feature : add blis and other BLAS implementation support (#1502) | Zenix |