path: root/ggml.c
Age | Commit message | Author
2023-05-23 | OpenCL Token Generation Acceleration (#1459) | 0cc4m
2023-05-21 | ggml : output 3d sizes in ggml_graph_dump_dot() | Georgi Gerganov
2023-05-20 | ggml : update WASM SIMD | Georgi Gerganov
2023-05-20 | ggml : add ggml_clamp() (#1539) | Georgi Gerganov
2023-05-20 | cuda : loading models directly into VRAM, norm calculation on GPU, broadcasti... | Johannes Gäßler
2023-05-20 | llama : fix name shadowing and C4146 (#1526) | Maxime
2023-05-20 | ggml : fix scalar implementation of Q4_1 dot | Georgi Gerganov
2023-05-19 | ggml : use F16 instead of F32 in Q4_0, Q4_1, Q8_0 (#1508) | Georgi Gerganov
2023-05-16 | ~7% faster Q5_1 AVX2 code (#1477) | Ilya Kurdyukov
2023-05-14 | ggml : alternative fix for race condition bug in non-inplace ggml_compute_for... | xaedes
2023-05-14 | ggml : various fixes (#1450) | Georgi Gerganov
2023-05-14 | ggml : add AVX support based on AVX2 code (#1430) | katsu560
2023-05-13 | ggml : multi-thread mul and diag_mask ops (#1428) | Georgi Gerganov
2023-05-13 | ggml : GPU-accelerated token generation (#1412) | Johannes Gäßler
2023-05-13 | ggml : implement backward pass for llama + small training-llama-from-scratch ... | xaedes
2023-05-13 | ggml : sync alibi fix from ggml repo | Georgi Gerganov
2023-05-13 | Adding SSE instructions to ggml_vec_dot_q4_0_q8_0 (#1413) | 3ooabkhxtn
2023-05-12 | ggml : remove bit shuffling (#1405) | Georgi Gerganov
2023-05-09 | use pause asm insn in busyloop to run the CPU (13600K) 10 °C cooler (#1314) | Sami Farin
2023-05-06 | ggml : Allow usage of CLBlast alongside Accelerate.framework (#1336) | swittk
2023-05-04 | ggml : change immintrin.h to intrin.h for compatibility (#1307) | Ron Jailall
2023-05-03 | ggml : vectorize Q8_0 quantization | Georgi Gerganov
2023-05-02 | ggml : fix 32-bit ARM | Georgi Gerganov
2023-05-02 | ggml : fix ppc64le build error and make cmake detect Power processors (#1284) | Marvin Gießing
2023-05-02 | ggml: add names to tensors (#1268) | slaren
2023-05-01 | cuBLAS: refactor and optimize f16 mat mul performance (#1259) | slaren
2023-05-01 | ggml : fix ggml_used_mem() (#1264) | Kerfuffle
2023-04-30 | ggml : fix UB (int << 31) | Georgi Gerganov
2023-04-30 | ggml : add Q5 WASM SIMD + GGML_FTYPE | Georgi Gerganov
2023-04-30 | ggml : fix labels for GGML_OP_ALIBI | Georgi Gerganov
2023-04-29 | ggml : fix 32-bit ARM NEON | Georgi Gerganov
2023-04-29 | ggml : use vzip instead of vuzp for consistency | Georgi Gerganov
2023-04-29 | ggml : fix visibility and unused warnings | Georgi Gerganov
2023-04-29 | ggml : fix #if for f32_f32 mul_mat (CLBlast) (#1229) | Georgi Gerganov
2023-04-29 | ggml : adjust mul_mat_f16 work memory (#1226) | Georgi Gerganov
2023-04-29 | cuBLAS: use host pinned memory and dequantize while copying (#1207) | slaren
2023-04-29 | cuBLAS: non-contiguous tensor support (#1215) | Henri Vasserman
2023-04-28 | Remove Q4_3 which is no better than Q5 (#1218) | Stephan Walter
2023-04-28 | ggml : sync ggml (ggml_alibi) | Georgi Gerganov
2023-04-28 | ggml : add helper debug printf in soft_max | Georgi Gerganov
2023-04-28 | ggml : add CLBlast support (#1164) | 0cc4m
2023-04-28 | add avx2 for dot_q8_0_q8_0, 2x faster than scalar (#1211) | Yann Follet
2023-04-26 | ggml : slightly faster AVX2 implementation for Q5 (#1197) | Stephan Walter
2023-04-26 | ggml : add Q5_0 and Q5_1 quantization (#1187) | Georgi Gerganov
2023-04-25 | ggml : add Q8_0 quantization format (rename the old one to Q8_1) (ARM NEON) (... | Georgi Gerganov
2023-04-25 | ggml : use full range for Q4_0 and Q4_2 quantization (#729) | unbounded
2023-04-24 | ggml : fix bug in ggml_compute_forward_sum_f32 (#1162) | xaedes
2023-04-24 | Fix build for gcc 8 and test in CI (#1154) | Stephan Walter
2023-04-23 | ggml : do not print perf ops that have not been used at all | Georgi Gerganov
2023-04-23 | ggml : better PERF prints + support "LLAMA_PERF=1 make" | Georgi Gerganov