index : llama.cpp.git (branch: master)
path: root/ggml.h
Age
Commit message
Author
2023-06-05
ggml : add SOTA 2,3,4,5,6 bit k-quantizations (#1684)
Kawrakow
2023-06-04
llama : Metal inference (#1642)
Georgi Gerganov
2023-05-29
ggml : sync cgraph import / export API
Georgi Gerganov
2023-05-27
ggml : add ggml_tensor_overhead()
Georgi Gerganov
2023-05-27
ggml : sync ggml core (minor additions, e.g. ggml_get_tensor_by_name())
Georgi Gerganov
2023-05-23
OpenCL Token Generation Acceleration (#1459)
0cc4m
2023-05-20
ggml : add ggml_clamp() (#1539)
Georgi Gerganov
2023-05-19
ggml : use F16 instead of F32 in Q4_0, Q4_1, Q8_0 (#1508)
Georgi Gerganov
2023-05-14
ggml : various fixes (#1450)
Georgi Gerganov
2023-05-14
ggml : add GGML_QNT_VERSION to track quantization format changes
Georgi Gerganov
2023-05-13
ggml : GPU-accelerated token generation (#1412)
Johannes Gäßler
2023-05-13
ggml : implement backward pass for llama + small training-llama-from-scratch ...
xaedes
2023-05-12
ggml : remove bit shuffling (#1405)
Georgi Gerganov
2023-05-02
ggml: add names to tensors (#1268)
slaren
2023-05-01
cuBLAS: refactor and optimize f16 mat mul performance (#1259)
slaren
2023-04-30
ggml : add Q5 WASM SIMD + GGML_FTYPE
Georgi Gerganov
2023-04-29
ggml : fix visibility and unused warnings
Georgi Gerganov
2023-04-28
Remove Q4_3 which is no better than Q5 (#1218)
Stephan Walter
2023-04-28
ggml : sync ggml (ggml_alibi)
Georgi Gerganov
2023-04-28
ggml : add CLBlast support (#1164)
0cc4m
2023-04-26
ggml : add Q5_0 and Q5_1 quantization (#1187)
Georgi Gerganov
2023-04-25
ggml : add Q8_0 quantization format (rename the old one to Q8_1) (ARM NEON) (...
Georgi Gerganov
2023-04-24
ggml : export symbols (#1155)
Georgi Gerganov
2023-04-20
ggml : sync ggml (add GPT-NeoX RoPE implementation)
Georgi Gerganov
2023-04-20
llama : multi-threaded quantization (#1075)
Kawrakow
2023-04-20
ggml : add Q4_3 quantization (#1082)
Georgi Gerganov
2023-04-19
Add NVIDIA cuBLAS support (#1044)
slaren
2023-04-18
ggml : add new Q4_2 quantization (ARM only) (#1046)
Georgi Gerganov
2023-04-17
Add LoRA support (#820)
slaren
2023-04-17
Speedup the AVX-512 implementation of ggml_vec_dot_q4_0() (#933)
Ivan Komarov
2023-04-15
ggml : add Q8_0 quantization for intermediate results (#951)
Georgi Gerganov
2023-04-14
Expose type name from ggml (#970)
Pavol Rusnak
2023-04-14
ggml : add unary and binary map operations (#874)
Kerfuffle
2023-04-13
ggml : add GGML_DEFAULT_N_THREADS
Georgi Gerganov
2023-04-11
Add enum llama_ftype, sync ggml_type to model files (#709)
Stephan Walter
2023-04-10
ggml : add ggml_cont() + optimize ggml_cpy() for contiguous dst
Georgi Gerganov
2023-04-10
Rewrite loading code to try to satisfy everyone:
comex
2023-04-08
Add quantize-stats command for testing quantization (#728)
unbounded
2023-04-05
ggml, llama : avoid heavy V transpose + improvements (#775)
Georgi Gerganov
2023-04-02
ggml : change ne to int64_t (#626)
Marian Cepok
2023-03-30
Ensure --mlock works properly with mmap() support
Justine Tunney
2023-03-30
Add mmap support for model files
Slaren
2023-03-28
ggml : introduce structs for the q4 data blocks (#356)
Stephan Walter
2023-03-24
Support calling mlock() on loaded model data on Linux and macOS (#453)
comex
2023-03-22
Deduplicate q4 quantization functions (#383)
Stephan Walter
2023-03-22
Introduce C-style API (#370)
Georgi Gerganov
2023-03-16
Add RMS norm and use it (#187)
hoangmit
2023-03-10
Initial release
Georgi Gerganov