path: root/convert-lora-to-ggml.py
author Georgi Gerganov <ggerganov@gmail.com> 2023-04-19 20:10:08 +0300
committer GitHub <noreply@github.com> 2023-04-19 20:10:08 +0300
commit 884e7d7a2bfd7325b107442d6758983f5886ed3d
tree 9b3bcda080b127f069092cfc04db151421746754 /convert-lora-to-ggml.py
parent 7cd5c4a3e9106151d48f328bb3c94c298a211f18
ggml : use 8-bit precision for Q4_1 intermediate results (#1047)
* ggml : use 8-bit precision for Q4_1 intermediate results (ARM)

* ggml : optimize ggml_vec_dot_q4_1_q8_0() via vmalq_n_f32

56 ms/token with Q4_1 !

* ggml : AVX2 implementation of ggml_vec_dot_q4_1_q8_0 (#1051)

* gitignore : ignore ppl-*.txt files

---------

Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
Diffstat (limited to 'convert-lora-to-ggml.py')
0 files changed, 0 insertions, 0 deletions