Add initial AVX512 support for dot product on Linux (#320) - llama.cpp.git


author	Casey Primozic <casey@cprimozic.net>	2023-03-21 07:35:42 -0700
committer	GitHub <noreply@github.com>	2023-03-21 15:35:42 +0100
commit	2e664f1ff413995506c9a54f3a8d5b8c64e37a91 (patch)
tree	0162d9c81e72e85d21a5806b35dbefc6587c105d /prompts
parent	8cf9f34eddc124d4ab28f4d2fe8e99d574510bde (diff)

Add initial AVX512 support for dot product on Linux (#320)

* Update Makefile to detect AVX512 support and add compiler flags if it's available
* Based on existing AVX2 implementation, dot product on one 32-value block of 4-bit quantized ints at a time
* Perform 8 bit -> 16 bit sign extension and multiply+add on 32 values at a time instead of 16
* Use built-in AVX512 horizontal reduce add to get sum at the end
* Manual unrolling on inner dot product loop to reduce loop counter overhead
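
The steps listed above can be sketched roughly as follows. This is an illustrative sketch, not the code from this commit: the `block_q4` layout (one float scale plus 16 bytes of packed nibbles, each stored with an offset of 8), the function names, and the nibble ordering are all assumptions made for the example. The AVX512 path is only compiled when the compiler defines `__AVX512F__` and `__AVX512BW__`; otherwise the scalar reference is used. Manual unrolling is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__AVX512F__) && defined(__AVX512BW__)
#include <immintrin.h>
#endif

#define QK 32 /* values per block, as stated in the commit message */

/* Hypothetical block layout (an assumption for this sketch). */
typedef struct {
    float   d;          /* per-block scale */
    uint8_t qs[QK / 2]; /* 32 packed 4-bit values, stored as value + 8 */
} block_q4;

/* Scalar reference: sum over blocks of d_x * d_y * sum(x_i * y_i). */
float dot_q4_scalar(const block_q4 *x, const block_q4 *y, int nb) {
    assert(nb >= 0);
    float sum = 0.0f;
    for (int i = 0; i < nb; i++) {
        int32_t acc = 0;
        for (int j = 0; j < QK / 2; j++) {
            int x0 = (x[i].qs[j] & 0x0F) - 8, x1 = (x[i].qs[j] >> 4) - 8;
            int y0 = (y[i].qs[j] & 0x0F) - 8, y1 = (y[i].qs[j] >> 4) - 8;
            acc += x0 * y0 + x1 * y1;
        }
        sum += x[i].d * y[i].d * (float)acc;
    }
    return sum;
}

#if defined(__AVX512F__) && defined(__AVX512BW__)
/* Unpack 16 bytes into 32 signed bytes in the range [-8, 7]. */
static __m256i unpack_nibbles(const uint8_t *qs) {
    const __m128i packed = _mm_loadu_si128((const __m128i *)qs);
    const __m128i high   = _mm_srli_epi16(packed, 4);
    __m256i v = _mm256_inserti128_si256(_mm256_castsi128_si256(packed), high, 1);
    v = _mm256_and_si256(v, _mm256_set1_epi8(0x0F)); /* keep one nibble per byte */
    return _mm256_sub_epi8(v, _mm256_set1_epi8(8));  /* remove the +8 offset */
}

float dot_q4_avx512(const block_q4 *x, const block_q4 *y, int nb) {
    __m512 acc = _mm512_setzero_ps();
    for (int i = 0; i < nb; i++) {
        /* 8 bit -> 16 bit sign extension of all 32 values at once */
        const __m512i x16 = _mm512_cvtepi8_epi16(unpack_nibbles(x[i].qs));
        const __m512i y16 = _mm512_cvtepi8_epi16(unpack_nibbles(y[i].qs));
        /* multiply + pairwise add: 32 int16 products -> 16 int32 sums */
        const __m512i prod = _mm512_madd_epi16(x16, y16);
        const __m512 scale = _mm512_set1_ps(x[i].d * y[i].d);
        acc = _mm512_fmadd_ps(scale, _mm512_cvtepi32_ps(prod), acc);
    }
    return _mm512_reduce_add_ps(acc); /* horizontal reduce add at the end */
}
#endif

/* Use the AVX512 path when it was compiled in, else the scalar one. */
float dot_q4(const block_q4 *x, const block_q4 *y, int nb) {
#if defined(__AVX512F__) && defined(__AVX512BW__)
    return dot_q4_avx512(x, y, nb);
#else
    return dot_q4_scalar(x, y, nb);
#endif
}
```

Because both operands are unpacked with the same helper, the order in which nibbles land in the vector does not affect the result; only the pairing of corresponding elements matters.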

Diffstat (limited to 'prompts')

0 files changed, 0 insertions, 0 deletions

