author    Pavol Rusnak <pavol@rusnak.io>    2023-05-05 16:43:36 +0200
committer GitHub <noreply@github.com>       2023-05-05 16:43:36 +0200
commit    921dcee00a55d9aba3b3026d0509d31ac8386e2a (patch)
tree      66dd198f7f2d7efc1bbe6aa004fd14ed4a50aaa8
parent    2d13786e91ec9fd28ddf737053822042a824da78 (diff)
readme: add missing info (#1324)
-rw-r--r--  README.md  6
1 file changed, 4 insertions, 2 deletions
diff --git a/README.md b/README.md
index f1fa635..233c5c5 100644
--- a/README.md
+++ b/README.md
@@ -18,10 +18,12 @@ The main goal of `llama.cpp` is to run the LLaMA model using 4-bit integer quant
 - Plain C/C++ implementation without dependencies
 - Apple silicon first-class citizen - optimized via ARM NEON and Accelerate framework
-- AVX2 support for x86 architectures
+- AVX, AVX2 and AVX512 support for x86 architectures
 - Mixed F16 / F32 precision
-- 4-bit integer quantization support
+- 4-bit, 5-bit and 8-bit integer quantization support
 - Runs on the CPU
+- OpenBLAS support
+- cuBLAS and CLBlast support

 The original implementation of `llama.cpp` was [hacked in an evening](https://github.com/ggerganov/llama.cpp/issues/33#issuecomment-1465108022).
 Since then, the project has improved significantly thanks to many contributions. This project is for educational purposes and serves
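The OpenBLAS, cuBLAS and CLBlast backends added to the feature list above are opt-in at build time rather than enabled by default. A minimal sketch of the corresponding `make` invocations, assuming the `LLAMA_OPENBLAS`, `LLAMA_CUBLAS` and `LLAMA_CLBLAST` flags from the llama.cpp Makefile of this period (not part of this diff; see the Build section of the README for the exact options in your checkout):

```bash
# CPU-only build (default)
make

# Build with OpenBLAS-accelerated prompt processing
make LLAMA_OPENBLAS=1

# Build with cuBLAS (NVIDIA GPU) support
make LLAMA_CUBLAS=1

# Build with CLBlast (OpenCL) support
make LLAMA_CLBLAST=1
```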