llama.cpp.git - llama.cpp


author	unbounded <haakon@likedan.net>	2023-04-25 19:20:46 +0200
committer	GitHub <noreply@github.com>	2023-04-25 20:20:46 +0300
commit	dd0eabc049fb1efc631cab8eb0a646808d704e18 (patch)
tree	23a35354481ec346c4501937b95612a19fff9d21 /llama.h
parent	54bb60e26858be251a0eb3cb70f80322aff804a0 (diff)

ggml : use full range for Q4_0 and Q4_2 quantization (#729)

* Use full range for q4_0 quantization

  By keeping the sign of the highest magnitude, we can make sure the
  highest value maps to -8, which is currently unused.
  This is a bit of a freebie since it is fully backwards compatible with
  the current format.

* Update quantize_row_q4_0 for AVX/AVX2

* Update quantize_row_q4_0 for WASM

  Untested

* Update quantize_row_q4_0 for Arm NEON

* Update quantize_row_q4_0 for PowerPC

  Untested

* Use full range for q4_2 quantization

Diffstat (limited to 'llama.h')

0 files changed, 0 insertions, 0 deletions

