From 574406dc7e350ddbffaeca33bf0392b7bfeb1436 Mon Sep 17 00:00:00 2001
From: Georgi Gerganov
Date: Wed, 26 Apr 2023 23:14:13 +0300
Subject: ggml : add Q5_0 and Q5_1 quantization (#1187)

* ggml : add Q5_0 quantization (cuBLAS only)

* ggml : fix Q5_0 qh -> uint32_t

* ggml : fix q5_0 histogram stats

* ggml : q5_0 scalar dot product

* ggml : q5_0 ARM NEON dot

* ggml : q5_0 more efficient ARM NEON using uint64_t masks

* ggml : rename Q5_0 -> Q5_1

* ggml : adding Q5_0 mode

* quantize : add Q5_0 and Q5_1 to map

* ggml : AVX2 optimizations for Q5_0, Q5_1 (#1195)

---------

Co-authored-by: Stephan Walter
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

(limited to '.gitignore')

diff --git a/.gitignore b/.gitignore
index e52d479..c7573bb 100644
--- a/.gitignore
+++ b/.gitignore
@@ -15,6 +15,7 @@ build-em/
 build-debug/
 build-release/
 build-static/
+build-cublas/
 build-no-accel/
 build-sanitize-addr/
 build-sanitize-thread/
-- 
cgit v1.2.3