llama.cpp.git (branch: master), log for path: ggml-metal.m
Age | Commit message | Author
2023-07-14 | Metal: faster Q4_0 and Q4_1 matrix x vector kernels (#2212) | Kawrakow
2023-07-12 | metal : new q4_0 matrix-vector kernel (#2188) | Shouzheng Liu
2023-07-11 | ggml : remove src0 and src1 from ggml_tensor and rename opt to src (#2178) | Spencer Sutton
2023-07-10 | mpi : add support for distributed inference via MPI (#2099) | Evan Miller
2023-07-07 | ggml : change ggml_graph_compute() API to not require context (#1999) | Qingyou Meng
2023-07-01 | metal : release buffers when freeing metal context (#2062) | Aaron Miller
2023-06-26 | k-quants : support for super-block size of 64 (#2001) | Kawrakow
2023-06-18 | metal : handle buffers larger than device's maxBufferLength (#1826) | Georgi Gerganov
2023-06-17 | minor : warning fixes | Georgi Gerganov
2023-06-17 | metal : add norm, cpy f16->f16, alibi kernels (#1823) | Aaron Miller
2023-06-15 | metal : parallel command buffer encoding (#1860) | Georgi Gerganov
2023-06-12 | Metal implementation for all k_quants (#1807) | Kawrakow
2023-06-12 | metal : fix failure to load model (#1817) | Kawrakow
2023-06-10 | metal : fix issue with ggml-metal.metal path. Closes #1769 (#1782) | Andrei
2023-06-10 | metal : add Q4_1 implementation (#1785) | Kawrakow
2023-06-09 | metal : add GELU implementation (#1770) | AT
2023-06-09 | metal : faster q4_0 (#1775) | Kawrakow
2023-06-08 | metal : add Q2_K implementation (#1762) | Kawrakow
2023-06-08 | metal : Q6_K implementation (#1752) | Kawrakow
2023-06-08 | metal : add Q4_K implementation (#1733) | Kawrakow
2023-06-06 | metal : add f16 support | Georgi Gerganov
2023-06-06 | metal : add checks for buffer size (#1706) | Spencer Sutton
2023-06-05 | metal : use shared buffers between CPU and GPU (#1696) | kiltyj
2023-06-04 | llama : Metal inference (#1642) | Georgi Gerganov