Age | Commit message | Author
2023-03-28 | deploy : add a Package.swift for SwiftPM support (#393) | Jed Fox
2023-03-28 | ggml : introduce structs for the q4 data blocks (#356) | Stephan Walter
2023-03-28 | gitignore : add "embedding" | Georgi Gerganov
2023-03-28 | Check the existence of f16_model_path_base in quantize.py (#574) | dotpy314
2023-03-28 | Fix usage of F16C intrinsics in AVX code (#563) | slaren
2023-03-28 | main.cpp fixes, refactoring (#571) | anzz1
2023-03-28 | Add embedding example to Makefile (#540) | RJ Adriaansen
2023-03-27 | Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542) | Marco Matthies
2023-03-26 | ci: add debug build to sanitizer build matrix (#527) | Erik Scholz
2023-03-26 | Fix undefined variables in debug build, remove unused variables (#531) | Stephan Walter
2023-03-26 | Add support for linux/arm64 platform during Docker Builds (#514) | Juan Calderon-Perez
2023-03-26 | Update README and comments for standalone perplexity tool (#525) | Stephan Walter
2023-03-26 | [main] fix infinite generation (-n == -1) (#523) | anzz1
2023-03-26 | Add logo to README.md | Georgi Gerganov
2023-03-26 | Exit from interactive mode if input stream is bad (#491) | Harald Fernengel
2023-03-26 | CI: Run other sanitizer builds even if one fails (#511) | anzz1
2023-03-25 | Clarify console output in convert-pth-to-ggml.py (#512) | jp-x-g
2023-03-25 | CMake / CI additions (#497) | anzz1
2023-03-25 | (Windows) Set console to UTF-8 on init (#420) | anzz1
2023-03-25 | Fix colors enabling on WIN32 | Georgi Gerganov
2023-03-25 | If n_predict == -1, generate forever | Georgi Gerganov
2023-03-25 | Inifinite generation via context swapping (#71) | Georgi Gerganov
2023-03-25 | Cleanup STL headers + fix embedding examples + minor stuff | Georgi Gerganov
2023-03-25 | Move chat scripts into "./examples" | Georgi Gerganov
2023-03-25 | Add AVX2 implementation of dequantize_row_q4_1 (#505) | slaren
2023-03-25 | Overhaul the examples structure | Georgi Gerganov
2023-03-25 | Retire the ggml_mul_mat() branch for transposed src0 (#500) | Georgi Gerganov
2023-03-25 | Disable prompt verbosity by default and add option to enable (#480) | Georgi Gerganov
2023-03-25 | Add AVX2 implementation of dequantize_row_q4_0 (#467) | slaren
2023-03-25 | Don't interefe with BLAS for large prompts by running only 1 thread | Georgi Gerganov
2023-03-25 | Add longer DAN prompt for testing big batch numbers | Georgi Gerganov
2023-03-25 | Add timings for the prompt evaluation (#478) | slaren
2023-03-25 | Remove obsolete information from README | Georgi Gerganov
2023-03-25 | Remove obsolete assert and fix compiler warning | Georgi Gerganov
2023-03-25 | Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS | Georgi Gerganov
2023-03-25 | bounds checking for input prefix (#492) | anzz1
2023-03-25 | feat: '--in-prefix STRING' option (#426) | anzz1
2023-03-25 | Add support for file load progress reporting callbacks (#434) | Jed Fox
2023-03-25 | Add missing struct annotation (#483) | Doomsdayrs
2023-03-25 | Fix crash for 65B model with pre-allocated memory (#485) | Chris Kuehl
2023-03-24 | Disable BLAS altogether - the bug is not just for qunatized mat mul | Georgi Gerganov
2023-03-24 | Disable BLAS branch in mul_mat - seems there is a bug | Georgi Gerganov
2023-03-24 | Immediately start processing the prompt before user input has been provided (... | Georgi Gerganov
2023-03-24 | Reduce memory usage and allocate enough memory for largest context (#473) | Georgi Gerganov
2023-03-24 | Temporary bump the memory buffer size - hopefully fix issues from 483bab2e | Georgi Gerganov
2023-03-24 | Update README.md (#444) | Gary Mulder
2023-03-24 | fix instruct mode (#445) | rabidcopy
2023-03-24 | Properly free llama_context on failure | Georgi Gerganov
2023-03-24 | additional optimizations for POWER9 (#454) | Cameron Kaiser
2023-03-24 | Support calling mlock() on loaded model data on Linux and macOS (#453) | comex