Age | Commit message | Author
2023-03-19Change RMSNorm eps to 1e-6 (#173)Georgi Gerganov
I think this is what is used in the Python code
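For context, a minimal sketch of RMS normalization with the epsilon this commit refers to. The function name `rms_norm` and the use of `std::vector` are illustrative assumptions, not the ggml implementation; only the value 1e-6 comes from the commit title.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: scales x by 1 / sqrt(mean(x^2) + eps).
// eps defaults to 1e-6, the value named in the commit title.
std::vector<float> rms_norm(const std::vector<float> & x, float eps = 1e-6f) {
    assert(!x.empty());

    double sum_sq = 0.0;
    for (float v : x) {
        sum_sq += static_cast<double>(v) * v;
    }
    const float mean_sq = static_cast<float>(sum_sq / x.size());
    const float scale   = 1.0f / std::sqrt(mean_sq + eps);

    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = x[i] * scale;
    }
    return y;
}
```

The epsilon only matters when the mean square is close to zero, where it keeps the division finite.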
2023-03-18Warn user if a context size greater than 2048 tokens is specified (#274)Ronsor
LLaMA doesn't support more than 2048 token context sizes, and going above that produces terrible results.
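A check of this kind could look roughly like the sketch below. The names `kMaxTrainedContext` and `warn_if_context_too_large` and the message wording are assumptions made for illustration; only the 2048-token limit comes from the commit message.

```cpp
#include <cassert>
#include <cstdio>

// Assumed constant: the 2048-token limit stated in the commit message.
constexpr int kMaxTrainedContext = 2048;

// Hypothetical helper: prints a warning to stderr and returns true
// when the requested context size exceeds the supported limit.
bool warn_if_context_too_large(int n_ctx) {
    assert(n_ctx > 0);
    if (n_ctx <= kMaxTrainedContext) {
        return false;
    }
    std::fprintf(stderr,
                 "warning: context size %d exceeds %d tokens; expect poor results\n",
                 n_ctx, kMaxTrainedContext);
    return true;
}
```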
2023-03-18Fix typo in readmePavol Rusnak
2023-03-18Add note about Python 3.11 to readmePavol Rusnak
2023-03-18Add memory/disk requirements to readmePavol Rusnak
2023-03-18Remove unused code since n_vocab is model.hparams.n_vocab (#262)Alex Nguyen
2023-03-18fixed warning with std::ignore about unused function result (#151)Justin Suess
2023-03-18Fix n^2 loop in tokenization (#254)Gary Linscott
This causes long prompts to parse very slowly.
2023-03-18CI Improvements (#230)anzz1
* CI Improvements Manual build feature, autoreleases for Windows * better CI naming convention use branch name in releases and tags
2023-03-17Nix flake (#40)Niklas Korz
* Nix flake * Nix: only add Accelerate framework on macOS * Nix: development shel, direnv and compatibility * Nix: use python packages supplied by withPackages * Nix: remove channel compatibility * Nix: fix ARM neon dotproduct on macOS --------- Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-03-17Implement non-greedy tokenizer that tries to maximize token lengths (#242)thement
* Implement non-greedy tokenizer that tries to maximize token lengths * Insert single space in front of the prompt - this is to match original llama tokenizer behavior --------- Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
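One way to prefer longer tokens without committing greedily from the left is a dynamic program over string positions. The sketch below maximizes the sum of squared token lengths; that scoring rule, the function name `tokenize_max_len`, and the string-set vocabulary are assumptions for illustration and may differ from what the commit actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical sketch: split `text` into vocabulary tokens so that the sum of
// squared token lengths is maximal. Returns an empty vector if no split exists.
std::vector<std::string> tokenize_max_len(const std::string & text,
                                          const std::unordered_set<std::string> & vocab,
                                          std::size_t max_token_len) {
    assert(max_token_len > 0);
    const std::size_t n = text.size();

    std::vector<long>        score(n + 1, -1);   // best score ending at each position
    std::vector<std::size_t> prev_len(n + 1, 0); // length of the last token on that path
    score[0] = 0;

    // Forward pass: extend every reachable position by every matching token.
    for (std::size_t i = 0; i < n; ++i) {
        if (score[i] < 0) continue;
        const std::size_t limit = std::min(max_token_len, n - i);
        for (std::size_t len = 1; len <= limit; ++len) {
            if (vocab.count(text.substr(i, len)) == 0) continue;
            const long candidate = score[i] + static_cast<long>(len * len);
            if (candidate > score[i + len]) {
                score[i + len]    = candidate;
                prev_len[i + len] = len;
            }
        }
    }

    // Backward pass: walk the recorded token lengths from the end.
    std::vector<std::string> tokens;
    if (score[n] < 0) return tokens;
    for (std::size_t pos = n; pos > 0; pos -= prev_len[pos]) {
        tokens.push_back(text.substr(pos - prev_len[pos], prev_len[pos]));
    }
    std::reverse(tokens.begin(), tokens.end());
    return tokens;
}
```

With the vocabulary {"a", "ab", "b", "c", "d", "bcd"} and the input "abcd", a left-to-right greedy matcher would pick "ab", "c", "d", while this sketch returns "a", "bcd".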
2023-03-17Default to 4 threads (#243)Georgi Gerganov
2023-03-17Update Contributing sectionGeorgi Gerganov
2023-03-17Don't tell users to use a bad number of threads (#243)Stephan Walter
The readme tells people to use the command line option "-t 8", causing 8 threads to be started. On systems with fewer than 8 cores, this causes a significant slowdown. Remove the option from the example command lines and use /proc/cpuinfo on Linux to determine a sensible default.
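A rough sketch of picking a default from `/proc/cpuinfo`, as the message describes. The function name `default_thread_count`, the fallback to `std::thread::hardware_concurrency()`, and the final fallback of 4 are assumptions for illustration, not the code from this commit.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <thread>

// Hypothetical helper: counts "processor" entries in /proc/cpuinfo (Linux).
// Falls back to the standard library hint, then to 4, when that is unavailable.
int default_thread_count() {
    int n = 0;

    std::ifstream cpuinfo("/proc/cpuinfo");
    std::string line;
    while (std::getline(cpuinfo, line)) {
        // rfind(..., 0) == 0 is a "starts with" test.
        if (line.rfind("processor", 0) == 0) {
            ++n;
        }
    }

    if (n == 0) {
        n = static_cast<int>(std::thread::hardware_concurrency());
    }
    if (n == 0) {
        n = 4; // assumed last-resort default
    }
    assert(n >= 1);
    return n;
}
```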
2023-03-17add ptread link to fix cmake build under linux (#114)mmyjona
* add ptread link to fix cmake build under linux * add cmake to linux and macos platform * separate make and cmake workflow --------- Co-authored-by: Sebastián A <sebastian.aedo29@gmail.com>
2023-03-17🚀 Dockerize llamacpp (#132)Bernat Vadell
* feat: dockerize llamacpp * feat: split build & runtime stages * split dockerfile into main & tools * add quantize into tool docker image * Update .devops/tools.sh Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * add docker action pipeline * change CI to publish at github docker registry * fix name runs-on macOS-latest is macos-latest (lowercase) * include docker versioned images * fix github action docker * fix docker.yml * feat: include all-in-one command tool & update readme.md --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-17Q4_1 quantization (#193)Matvey Soloviev
* Add AVX2 version of ggml_vec_dot_q4_1 * Small optimisations to q4_1 dot product (@Const-me) * Rearrange Q4_1 quantization to work for multipart models. (Fix #152) * Fix ggml_vec_mad_q4_1 too * Fix non-vectorised q4_1 vec mul
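To make the format concrete, here is a scalar sketch of a 4-bit quantization scheme with a per-block minimum and step size. The block size of 32, the struct layout, the nibble order, and all names are assumptions for illustration; the SIMD kernels named in the commit are not reproduced.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr int QK = 32; // assumed block size

// Assumed layout: step size d, minimum m, and 32 four-bit values packed two per byte.
struct BlockQ4_1 {
    float d;
    float m;
    std::array<std::uint8_t, QK / 2> qs;
};

// Quantize one block: x is approximated by q * d + m with q in [0, 15].
BlockQ4_1 quantize_q4_1(const std::array<float, QK> & x) {
    const auto mm = std::minmax_element(x.begin(), x.end());

    BlockQ4_1 b{};
    b.m = *mm.first;
    b.d = (*mm.second - *mm.first) / 15.0f;
    assert(std::isfinite(b.d));
    const float inv_d = b.d != 0.0f ? 1.0f / b.d : 0.0f;

    for (int i = 0; i < QK; i += 2) {
        const long q0 = std::min(15L, std::max(0L, std::lround((x[i]     - b.m) * inv_d)));
        const long q1 = std::min(15L, std::max(0L, std::lround((x[i + 1] - b.m) * inv_d)));
        // Low nibble holds the even element, high nibble the odd one (assumed order).
        b.qs[i / 2] = static_cast<std::uint8_t>(q0 | (q1 << 4));
    }
    return b;
}

// Reverse the mapping for one block.
std::array<float, QK> dequantize_q4_1(const BlockQ4_1 & b) {
    std::array<float, QK> y{};
    for (int i = 0; i < QK / 2; ++i) {
        y[2 * i]     = (b.qs[i] & 0x0F) * b.d + b.m;
        y[2 * i + 1] = (b.qs[i] >> 4)   * b.d + b.m;
    }
    return y;
}
```

Storing the minimum alongside the step size lets a block represent values that are not centered on zero, at the cost of one extra float per block.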
2023-03-16Update README.mdGeorgi Gerganov
2023-03-16Expand "Contributing" sectionGeorgi Gerganov
2023-03-16Update hot topics - RMSnormGeorgi Gerganov
2023-03-15Fix RMS norm in GGML (#191)Nebula
2023-03-16Add RMS norm and use it (#187)hoangmit
* add ggml_rms_norm * update op num
2023-03-15fixed typo (#178)moritzbrantner
2023-03-15add SIGINT support for _WIN32 environments (#120)Rickey Bowers Jr
* add SIGINT support for _WIN32 environments * perhaps more consistent
2023-03-15added ctx_size parameter (#148)Justin Suess
* added ctx_size parameter * added it in more places * Apply suggestions from code review --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-15fixed color reset on exit (#149)Justin Suess
* fixed color reset on exit * added sigint handler for ansi_color_reset * Update main.cpp --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-15Fix potential licensing issue (#126)Musab Gultekin
* Update README.md * Update README.md remove facebook
2023-03-15Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)Ronsor
There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.
2023-03-15inline -> static inline for "bytesFromNibbles" (#161)hoangmit
Without "static" prefix, it fails to compile in clang
2023-03-14Don't use vdotq_s32 if it's not available (#139)Ronsor
* Don't use vdotq_s32 if it's not available `dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available. Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined. * Update ggml.c --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
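The guard described here can be sketched as a compile-time switch on `__ARM_FEATURE_DOTPROD`. The function `dot_i8_16` is hypothetical, and the fallback below is a plain scalar loop rather than the NEON code the commit reintroduces.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__ARM_NEON) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#endif

// Hypothetical helper: dot product of two 16-element int8 vectors.
std::int32_t dot_i8_16(const std::int8_t * a, const std::int8_t * b) {
    assert(a != nullptr && b != nullptr);
#if defined(__ARM_NEON) && defined(__ARM_FEATURE_DOTPROD)
    // dotprod path: four partial sums, each covering four byte pairs.
    const int32x4_t acc = vdotq_s32(vdupq_n_s32(0), vld1q_s8(a), vld1q_s8(b));
    return vgetq_lane_s32(acc, 0) + vgetq_lane_s32(acc, 1) +
           vgetq_lane_s32(acc, 2) + vgetq_lane_s32(acc, 3);
#else
    // Portable fallback for CPUs without the dotprod extension.
    std::int32_t sum = 0;
    for (int i = 0; i < 16; ++i) {
        sum += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
    }
    return sum;
#endif
}
```

Because the check happens at compile time, a binary built with the extension enabled will still fail on a CPU that lacks it; the guard only protects builds that target such CPUs.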
2023-03-14Add section to README on how to run the project on Android (#130)Radoslav Gerganov
2023-03-14Add Misc section + update hot topics + minor fixesGeorgi Gerganov
2023-03-13Add windows to the CI (#98)Sebastián A
2023-03-13CMake build in Release by default (#75)Georgi Gerganov
2023-03-13Update contribution section, hot topics, limitations, etc.Georgi Gerganov
2023-03-13Print system informationGeorgi Gerganov
2023-03-13Initial support for CMake (#75)Sebastián A
2023-03-13Add NetBSD support. (#90)Thomas Klausner
2023-03-13Use fprintf for diagnostic output (#48)Pavol Rusnak
Keep printf only for printing model output. One can now use ./main ... 2>/dev/null to suppress any diagnostic output.
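The split between the two streams can be sketched as follows. The helper names `print_diagnostic` and `print_model_output` are hypothetical and exist only to show which stream each kind of message goes to.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: diagnostics go to stderr so they can be redirected separately.
int print_diagnostic(const char * msg) {
    assert(msg != nullptr);
    return std::fprintf(stderr, "%s\n", msg);
}

// Hypothetical helper: generated text goes to stdout.
int print_model_output(const char * text) {
    assert(text != nullptr);
    return std::printf("%s", text);
}
```

With this separation, redirecting stderr leaves only the generated text on the terminal or in a pipe.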
2023-03-13Use vdotq_s32 to improve performance (#67)Georgi Gerganov
* 10% performance boost on ARM * Back to original change
2023-03-13Reduce model loading time (#43)uint256_t
* Use buffering * Use vector * Minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-13Fix UTF-8 handling (including colors) (#79)Val Kharitonov
2023-03-13Add quantize script for batch quantization (#92)Pavol Rusnak
* Add quantize script for batch quantization * Indentation * README for new quantize.sh * Fix script name * Fix file list on Mac OS --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-13Add initial contribution guidelinesGeorgi Gerganov
2023-03-13Gate signal support on being on a unixoid system. (#74)Matvey Soloviev
2023-03-13Fix token count accountingMatvey Soloviev
2023-03-13Revert "10% performance boost on ARM"Georgi Gerganov
This reverts commit 113a9e83ebc0f788f861394437087bf3ca0e019b. There are some reports of illegal instruction errors. Moved this to the vdotq_s32 branch until resolved.
2023-03-13Check for vdotq_s32 availabilityGeorgi Gerganov
2023-03-13Ammend to previous commit - forgot to update non-QRDMX branchGeorgi Gerganov
2023-03-1310% performance boost on ARMGeorgi Gerganov