author | Matteo Boschini <12133566+mbosc@users.noreply.github.com> | 2023-08-01 09:43:12 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-08-01 10:43:12 +0300 |
commit | 1873ff586bd8499a18f763632711bf15d253585e | |
tree | f5c52d81b59d9044b2cd2b3b584e05be268ec278 /examples/server | |
parent | 49e7cb5bb1f75c91dd5db7d2d88cbc11bd9ee0c5 | |
metal : add gqa8 kernel to allow llama-2-70B on metal (#2459)
* Added gqa8 kernel to allow llama-2-70B on metal
* Update ggml-metal.m
Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>
* Extend kernel_mul_mat_f16_f32 to handle gqa broadcast
* Added ne03==ne13 assertion
---------
Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>
Diffstat (limited to 'examples/server')
0 files changed, 0 insertions, 0 deletions