author | Johannes Gäßler <johannesg@5d6.de> | 2023-05-25 23:07:29 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-05-26 00:07:29 +0300 |
commit | 1fcdcc28b119a6608774d52de905931bd5f8a43d (patch) | |
tree | a28504b1f2b0ed7d4b550316c37a9b7e25de889c /examples/baby-llama | |
parent | ac7876ac20124a15a44fd6317721ff1aa2538806 (diff) |
cuda : performance optimizations (#1530)
* xor hack
* block y dim
* loop unrolling
* Fixed cmake LLAMA_CUDA_BY option
* Removed hipblas compatibility code
* Define GGML_CUDA_DMMV_BLOCK_Y if not defined
* Fewer iters, more ops per iter
* Renamed DMMV X/Y compilation options
Diffstat (limited to 'examples/baby-llama')
0 files changed, 0 insertions, 0 deletions