author | slaren <2141330+slaren@users.noreply.github.com> | 2023-05-01 18:11:07 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-05-01 18:11:07 +0200 |
commit | 58b367c2d757c0ea12aec672382462b42204c724 (patch) | |
tree | b2fa89daf71c08788c44e3fb9abf1747ec8ee65d /examples/save-load-state | |
parent | ea3a0ad6b6b5ca4693b94acd4cb32e2803f66fae (diff) |
cuBLAS: refactor and optimize f16 mat mul performance (#1259)
* cuBLAS: refactor, convert fp16 to fp32 on device
* cuBLAS: use multiple streams, choose smartly between mul_mat_q and mul_mat_f16
* fix build
* cuBLAS: update block_q5_1
Diffstat (limited to 'examples/save-load-state')
0 files changed, 0 insertions, 0 deletions