author: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com> 2023-06-10 01:59:17 -0600
committer: GitHub <noreply@github.com> 2023-06-10 10:59:17 +0300
commit: 4f0154b0bad775ac4651bf73b5c216eb43c45cdc
tree: 33a6036c589fd494af7de0cd786e395d4fd3f699 /ggml-metal.metal
parent: ef3171d16241c18581d4d08374f0b9e396ade6b7
llama : support requantizing models instead of only allowing quantization from 16/32bit (#1691)
* Add support for quantizing already quantized models
* Threaded dequantizing and f16 to f32 conversion
* Clean up thread blocks with spares calculation a bit
* Use std::runtime_error exceptions.
Diffstat (limited to 'ggml-metal.metal')
0 files changed, 0 insertions, 0 deletions