author: Kawrakow <48489457+ikawrakow@users.noreply.github.com> 2023-07-24 00:19:47 +0300
committer: GitHub <noreply@github.com> 2023-07-24 00:19:47 +0300
commit: 2f9cf974a066ac0e03fbb235d834b01b0164d743
tree: 1c0c1b42ef5d1f8013d9641d778225e98b59d134 /grammars/json.gbnf
parent: 4f06592cc6b83979e4b442e8cb97b3948c857188
Some more Q4_K and Q5_K speedup on CUDA (#2346)
* Faster Q5_K on CUDA
* Small Q5_K improvement on older GPUs
* Speed up Q4_K on CUDA
  GTX1660: 29.5 ms/t -> 25.6 ms/t
  RTX4080: 8.40 ms/t -> 8.25 ms/t
* Speed up Q4_K on CUDA
  GTX1660: 36.7 ms/t -> 35.6 ms/t
  RTX4080: 9.8 ms/t -> 9.5 ms/t
* Address PR comments
* Add some comments to satisfy PR reviewer

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Diffstat (limited to 'grammars/json.gbnf')
0 files changed, 0 insertions, 0 deletions