author | slaren <2141330+slaren@users.noreply.github.com> | 2023-04-29 02:04:18 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-04-29 02:04:18 +0200 |
commit | 7fc50c051ae8a78e9643fdf172d12e20f2dd9b6c | |
tree | cc017db2f3443a39221ad319ab51df0925012e84 /examples/main/README.md | |
parent | b1ee8f59b4101b46999a0995d9a34506f7285466 |
cuBLAS: use host pinned memory and dequantize while copying (#1207)

* cuBLAS: dequantize simultaneously while copying memory
* cuBLAS: use host pinned memory
* cuBLAS: improve ggml_compute_forward_mul_mat_f16_f32 with pinned memory
* cuBLAS: also pin kv cache
* fix rebase
Diffstat (limited to 'examples/main/README.md')
0 files changed, 0 insertions, 0 deletions