aboutsummaryrefslogtreecommitdiff
path: root/pocs/vdot/vdot.cpp
AgeCommit message (Collapse)Author
2023-07-05ggml : generalize `quantize_fns` for simpler FP16 handling (#1237)Stephan Walter
* Generalize quantize_fns for simpler FP16 handling * Remove call to ggml_cuda_mul_mat_get_wsize * ci : disable FMA for mac os actions --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-06-16build : fix and ignore MSVC warnings (#1889)Borislav Stanimirov
2023-04-18Adding a simple program to measure speed of dot products (#1041)Kawrakow
On my Mac, the direct Q4_1 product is marginally slower (~69 vs ~55 us for Q4_0). The SIMD-ified ggml version is now almost 2X slower (~121 us). On a Ryzen 7950X CPU, the direct product for Q4_1 quantization is faster than the AVX2 implementation (~60 vs ~62 us). --------- Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>