path: root/.clang-tidy
Age         Commit message                                            Author
2023-06-08  clang-tidy : restore dot file from accidental deletion    Georgi Gerganov
2023-06-08  metal : add Q4_K implementation (#1733)                   Kawrakow

    * Metal implementation for Q4_K

      Very slow for now: 42 ms / token, Q4_0 runs in 28 ms/token on my
      30-core M2 Max GPU.

    * Optimizing Q4_K on metal

      The first token always takes longer, I guess because the metal
      kernel is being jit-compiled. So, using n = 128 to measure time.
      At this point Q4_K takes 29.5 ms / token compared to
      27.2 ms / token for Q4_0. Quite a bit better than the initial
      attempt, but still not good enough.

    * Optimizing q4_K metal dot some more

      For n = 256 it is now 28.1 ms/token compared to 27 ms/token for
      q4_0.

    * Fix after merge with master

    ---------

    Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2023-05-12  Add clang-tidy reviews to CI (#1407)                      slaren