author | unbounded <haakon@likedan.net> | 2023-04-25 19:20:46 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-04-25 20:20:46 +0300 |
commit | dd0eabc049fb1efc631cab8eb0a646808d704e18 | |
tree | 23a35354481ec346c4501937b95612a19fff9d21 /tests | |
parent | 54bb60e26858be251a0eb3cb70f80322aff804a0 | |
ggml : use full range for Q4_0 and Q4_2 quantization (#729)
* Use full range for q4_0 quantization
By keeping the sign of the highest-magnitude value, we can make sure that
value maps to -8, which is currently unused.
This is a bit of a freebie since it is fully backwards compatible with
the current format.
* Update quantize_row_q4_0 for AVX/AVX2
* Update quantize_row_q4_0 for WASM (untested)
* Update quantize_row_q4_0 for Arm NEON
* Update quantize_row_q4_0 for PowerPC (untested)
* Use full range for q4_2 quantization
Diffstat (limited to 'tests')
0 files changed, 0 insertions, 0 deletions