author: Georgi Gerganov <ggerganov@gmail.com> 2023-04-05 22:07:33 +0300
committer: GitHub <noreply@github.com> 2023-04-05 22:07:33 +0300
commit: 986b6ce9f99503c51ec5afd8a10baa32359434c6
tree: f4655b45b130b908729eb1407ca9e016c05f21a4 /SHA256SUMS
parent: 34162989297fdfe3ab7305451ce55bc87e3f4c9c
ggml, llama : avoid heavy V transpose + improvements (#775)

ggml :
- added ggml_view_3d()
- ggml_view_tensor() now inherits the stride too
- reimplement ggml_cpy() to account for dst stride
- no longer require tensor->data to be memory aligned

llama :
- compute RoPE on 32-bit tensors (should be more accurate)
- store RoPE-ed K in the KV cache
- store transposed V in the KV cache (significant speed-up)
- avoid unnecessary Q copy
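
The commit message mentions a copy that accounts for the destination stride and a V that is stored transposed in the KV cache. The following is a minimal sketch of that idea in plain C, not the ggml implementation: `strided_copy_2d` and `store_v_transposed` are hypothetical helpers, and the cache layout (one row of `n_ctx` slots per embedding dimension, new tokens written starting at column `n_past`) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a ggml function): copy an ne0 x ne1 block of
 * floats where source and destination each have their own byte strides.
 * nb0 is the stride between consecutive elements along dimension 0,
 * nb1 the stride along dimension 1. */
static void strided_copy_2d(
        void       *dst, size_t dst_nb0, size_t dst_nb1,
        const void *src, size_t src_nb0, size_t src_nb1,
        size_t ne0, size_t ne1) {
    assert(dst != NULL && src != NULL);
    for (size_t i1 = 0; i1 < ne1; i1++) {
        for (size_t i0 = 0; i0 < ne0; i0++) {
            const char *s = (const char *) src + i0*src_nb0 + i1*src_nb1;
            char       *d = (char *)       dst + i0*dst_nb0 + i1*dst_nb1;
            /* memcpy avoids any alignment requirement on either pointer */
            memcpy(d, s, sizeof(float));
        }
    }
}

/* Hypothetical helper: write V for n_tokens new tokens into a transposed
 * cache. Assumed layouts:
 *   v     : contiguous, v[t*n_embd + e]
 *   cache : transposed, cache[e*n_ctx + (n_past + t)] */
static void store_v_transposed(
        float *cache, size_t n_ctx, size_t n_past,
        const float *v, size_t n_embd, size_t n_tokens) {
    assert(n_past + n_tokens <= n_ctx);
    strided_copy_2d(
        cache + n_past, n_ctx*sizeof(float), sizeof(float),   /* dst strides */
        v,              sizeof(float),       n_embd*sizeof(float), /* src strides */
        n_embd, n_tokens);
}
```

Under these assumptions the transpose is paid once, when the new tokens are written; later reads see each embedding dimension as a contiguous run over token positions, so no transpose of the whole cached V is needed at attention time.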
Diffstat (limited to 'SHA256SUMS')
0 files changed, 0 insertions, 0 deletions