llama.cpp.git — commit log for ggml-metal.metal (branch: master)
Age         Commit message                                                  Author
2023-08-01  metal : add gqa8 kernel to allow llama-2-70B on metal (#2459)   Matteo Boschini
2023-07-25  Another speed gain for Q4_0 and Q4_1 on Metal (#2375)           Kawrakow
2023-07-23  metal : support bcast add & dup & cont op (#2323)               Jiahao Li
2023-07-21  Faster Q3_K implementation on Metal (#2307)                     Kawrakow
2023-07-21  Faster Q2_K on Metal (#2297)                                    Kawrakow
2023-07-20  Faster Q5_K and Q6_K on Metal (#2294)                           Kawrakow
2023-07-20  Faster Q4_K on Metal (#2290)                                    Kawrakow
2023-07-20  metal: minor q4 optimization and reduce code size (#2248)       Shouzheng Liu
2023-07-15  llama : add custom RoPE (#2054)                                 Xiao-Yong Jin
2023-07-14  Metal: faster Q4_0 and Q4_1 matrix x vector kernels (#2212)     Kawrakow
2023-07-12  metal : new q4_0 matrix-vector kernel (#2188)                   Shouzheng Liu
2023-06-26  k-quants : support for super-block size of 64 (#2001)           Kawrakow
2023-06-17  metal : add norm, cpy f16->f16, alibi kernels (#1823)           Aaron Miller
2023-06-12  Metal implementation for all k_quants (#1807)                   Kawrakow
2023-06-10  metal : add Q4_1 implementation (#1785)                         Kawrakow
2023-06-09  metal : fix build "tanhf" -> "tanh"                             Georgi Gerganov
2023-06-09  metal : add GELU implementation (#1770)                         AT
2023-06-09  metal : faster q4_0 (#1775)                                     Kawrakow
2023-06-08  metal : add Q2_K implementation (#1762)                         Kawrakow
2023-06-08  metal : Q6_K implementation (#1752)                             Kawrakow
2023-06-08  metal : add Q4_K implementation (#1733)                         Kawrakow
2023-06-06  metal : add f16 support                                         Georgi Gerganov
2023-06-04  llama : Metal inference (#1642)                                 Georgi Gerganov