author | Kawrakow <48489457+ikawrakow@users.noreply.github.com> | 2023-04-19 20:20:14 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-04-19 20:20:14 +0200 |
commit | f7d05095b404b5500b4a702ea16f67fc22446e49 (patch) | |
tree | 5524438ba47ddd2a11a625d0ceb50c6595136190 /README.md | |
parent | 884e7d7a2bfd7325b107442d6758983f5886ed3d (diff) |
Q4_2 quantization with rmse-optimized scale and quants (#1062)
* Q4_2 quantization with rmse-optimized scale and quants
For quantize-stats we get
q4_2: rmse 0.00159301, maxerr 0.17480469, 95pct<0.0030, median<0.0012
For 7B perplexity with BLAS enabled we get 6.2038 after 655 chunks.
Quantization is slow (~90 seconds on my Mac for 7B) because it is not
multi-threaded as in PR #896.
* ggml : satisfy the sanitizer builds
Not sure why this makes them fail
* Better follow ggml conventions for function names
* Fixed type as per reviewer comment
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
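The commit message names an "rmse-optimized scale and quants" but this page shows none of the code. As a rough illustration only, and not the commit's actual implementation, the general idea can be sketched in Python: try several candidate scales for a block, refine each by least squares for the resulting quants, and keep the pair with the smallest squared error. The signed 4-bit range [-8, 7], the 16-value block used in the usage note, the candidate grid, and all function names here are assumptions made for the sketch.

```python
import math

# Assumed signed 4-bit quant range; not taken from the commit itself.
QMIN, QMAX = -8, 7


def quantize_block(x, scale):
    """Round each value to the nearest level for a given scale, clamped to 4 bits."""
    if scale == 0.0:
        return [0] * len(x)
    return [max(QMIN, min(QMAX, round(v / scale))) for v in x]


def sq_error(x, q, scale):
    """Sum of squared reconstruction errors for quants q and the given scale."""
    return sum((v - scale * qi) ** 2 for v, qi in zip(x, q))


def rmse_optimized_block(x, n_candidates=16):
    """Return (scale, quants) with the lowest squared error among the candidates."""
    amax = max(abs(v) for v in x)
    if amax == 0.0:
        return 0.0, [0] * len(x)

    # Baseline: map the largest magnitude onto level 7.
    best_scale = amax / QMAX
    best_q = quantize_block(x, best_scale)
    best_err = sq_error(x, best_q, best_scale)

    for i in range(n_candidates):
        # Hypothetical candidate grid: scales from amax/5 down to amax/9.
        scale = amax / (5.0 + 4.0 * i / (n_candidates - 1))
        q = quantize_block(x, scale)
        sum_qq = sum(qi * qi for qi in q)
        if sum_qq == 0:
            continue
        # For fixed quants, the least-squares optimal scale is sum(x*q) / sum(q*q).
        scale = sum(v * qi for v, qi in zip(x, q)) / sum_qq
        err = sq_error(x, q, scale)
        if err < best_err:
            best_scale, best_q, best_err = scale, q, err

    return best_scale, best_q


def error_stats(x, q, scale):
    """Return (rmse, maxerr) of the reconstruction, the two headline statistics above."""
    errs = [abs(v - scale * qi) for v, qi in zip(x, q)]
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    return rmse, max(errs)
```

Calling `rmse_optimized_block` on a list of 16 floats returns a scale and 16 integer quants; passing them to `error_stats` gives an rmse that is never worse than the plain `amax / 7` baseline, because the baseline is itself one of the candidates. A real implementation would also have to store the scale in reduced precision, which this sketch ignores.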
Diffstat (limited to 'README.md')
0 files changed, 0 insertions, 0 deletions