Age | Commit message | Author
2023-03-19Add --ignore-eos parameter (#181)slaren
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-19interactive mode: print '\n' in sigint_handler, this flush stdout thus ensure color reset. (#283)Qingyou Meng
2023-03-19Command line switch to use F16 for memory_k and memory_v (refactor of #154) (#294)Erik Scholz
* Use F16 for memory_k and memory_v * add command line switch to use f16 instead of f32 for memory k+v --------- Co-authored-by: Ty Everett <ty@tyweb.us>
2023-03-19Update hot topics to mention Alpaca supportGeorgi Gerganov
2023-03-19Fix off-by-one bug (#115)Georgi Gerganov
2023-03-19Fix python stuff (#109)Georgi Gerganov
2023-03-19Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109)qunash
* Refactor get_n_parts function to simplify code and improve readability * Use f-strings instead of concatenation * Refactoring: more concise and readable * modularize --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-19Drop trailing new line from file prompts (#80)Georgi Gerganov
2023-03-19Add instruction for using Alpaca (#240)Georgi Gerganov
2023-03-19Add "--instruct" argument for usage with Alpaca (#240)Georgi Gerganov
Also start adding prompts in "./prompts"
2023-03-19Change RMSNorm eps to 1e-6 (#173)Georgi Gerganov
I think this is what is used in the Python code
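For reference, a minimal sketch of RMS normalization with the epsilon from this commit. The function name `rms_norm` and the use of `std::vector<float>` are illustrative assumptions, not the ggml implementation, and the learned per-channel weight that a model would apply afterwards is left out.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper (not the ggml API): y_i = x_i / sqrt(mean(x^2) + eps).
// eps defaults to 1e-6, the value named in the commit title above.
std::vector<float> rms_norm(const std::vector<float>& x, float eps = 1e-6f) {
    if (x.empty()) {
        return {};
    }
    // Accumulate in double to limit rounding error on long rows.
    double sum_sq = 0.0;
    for (float v : x) {
        sum_sq += static_cast<double>(v) * v;
    }
    const float mean_sq = static_cast<float>(sum_sq / x.size());
    const float scale = 1.0f / std::sqrt(mean_sq + eps);

    std::vector<float> y;
    y.reserve(x.size());
    for (float v : x) {
        y.push_back(v * scale);
    }
    return y;
}
```

The epsilon only matters when the mean square is close to zero, where it keeps the division well defined.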
2023-03-18Warn user if a context size greater than 2048 tokens is specified (#274)Ronsor
LLaMA doesn't support more than 2048 token context sizes, and going above that produces terrible results.
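A check of this kind could look roughly like the sketch below. The function name, the constant name, and the warning text are assumptions made for illustration; only the 2048-token limit comes from the commit message.

```cpp
#include <cstdio>

// Limit stated in the commit message above.
constexpr int kMaxSupportedContext = 2048;

// Hypothetical helper: warns on stderr and returns true when the requested
// context size exceeds the limit. It only warns; it does not clamp the value.
bool warn_if_context_too_large(int n_ctx) {
    if (n_ctx <= kMaxSupportedContext) {
        return false;
    }
    std::fprintf(stderr,
                 "warning: context size %d exceeds %d tokens; expect poor results\n",
                 n_ctx, kMaxSupportedContext);
    return true;
}
```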
2023-03-18Fix typo in readmePavol Rusnak
2023-03-18Add note about Python 3.11 to readmePavol Rusnak
2023-03-18Add memory/disk requirements to readmePavol Rusnak
2023-03-18Remove unused code since n_vocab is model.hparams.n_vocab (#262)Alex Nguyen
2023-03-18fixed warning with std::ignore about unused function result (#151)Justin Suess
fixed warning with std::ignore about unused function result
2023-03-18Fix n^2 loop in tokenization (#254)Gary Linscott
This causes long prompts to parse very slowly.
2023-03-18CI Improvements (#230)anzz1
* CI Improvements Manual build feature, autoreleases for Windows * better CI naming convention use branch name in releases and tags
2023-03-17Nix flake (#40)Niklas Korz
* Nix flake * Nix: only add Accelerate framework on macOS * Nix: development shell, direnv and compatibility * Nix: use python packages supplied by withPackages * Nix: remove channel compatibility * Nix: fix ARM neon dotproduct on macOS --------- Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-03-17Implement non-greedy tokenizer that tries to maximize token lengths (#242)thement
* Implement non-greedy tokenizer that tries to maximize token lengths * Insert single space in front of the prompt - this is to match original llama tokenizer behavior --------- Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
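One way to read "tries to maximize token lengths" is as a dynamic program over prefix positions that rewards longer tokens. The sketch below assumes a score equal to the sum of squared token lengths and a plain string set as the vocabulary; both are illustrative assumptions, and the function name is hypothetical rather than taken from the commit.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical sketch: split `text` into vocabulary entries so that the sum
// of squared token lengths is maximal. Returns an empty vector when the text
// cannot be covered by the vocabulary at all.
std::vector<std::string> tokenize_max_len(const std::string& text,
                                          const std::unordered_set<std::string>& vocab,
                                          std::size_t max_token_len) {
    const std::size_t n = text.size();
    std::vector<long> score(n + 1, -1);     // best score for text[0..i), -1 = unreachable
    std::vector<std::size_t> prev(n + 1, 0); // start of the last token ending at i
    score[0] = 0;

    for (std::size_t i = 0; i < n; ++i) {
        if (score[i] < 0) {
            continue; // no tokenization reaches position i
        }
        const std::size_t limit = std::min(max_token_len, n - i);
        for (std::size_t len = 1; len <= limit; ++len) {
            if (vocab.count(text.substr(i, len)) == 0) {
                continue;
            }
            const long candidate = score[i] + static_cast<long>(len * len);
            if (candidate > score[i + len]) {
                score[i + len] = candidate;
                prev[i + len] = i;
            }
        }
    }

    std::vector<std::string> tokens;
    if (score[n] < 0) {
        return tokens;
    }
    // Walk the back-pointers from the end, then restore left-to-right order.
    for (std::size_t end = n; end > 0; end = prev[end]) {
        tokens.push_back(text.substr(prev[end], end - prev[end]));
    }
    std::reverse(tokens.begin(), tokens.end());
    return tokens;
}
```

Following the second bullet of the commit, a caller would prepend a single space to the prompt before tokenizing it.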
2023-03-17Default to 4 threads (#243)Georgi Gerganov
2023-03-17Update Contributing sectionGeorgi Gerganov
2023-03-17Don't tell users to use a bad number of threads (#243)Stephan Walter
The readme tells people to use the command line option "-t 8", causing 8 threads to be started. On systems with fewer than 8 cores, this causes a significant slowdown. Remove the option from the example command lines and use /proc/cpuinfo on Linux to determine a sensible default.
2023-03-17add pthread link to fix cmake build under linux (#114)mmyjona
* add pthread link to fix cmake build under linux * add cmake to linux and macos platform * separate make and cmake workflow --------- Co-authored-by: Sebastián A <sebastian.aedo29@gmail.com>
2023-03-17🚀 Dockerize llamacpp (#132)Bernat Vadell
* feat: dockerize llamacpp * feat: split build & runtime stages * split dockerfile into main & tools * add quantize into tool docker image * Update .devops/tools.sh Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * add docker action pipeline * change CI to publish at github docker registry * fix name runs-on macOS-latest is macos-latest (lowercase) * include docker versioned images * fix github action docker * fix docker.yml * feat: include all-in-one command tool & update readme.md --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-17Q4_1 quantization (#193)Matvey Soloviev
* Add AVX2 version of ggml_vec_dot_q4_1 * Small optimisations to q4_1 dot product (@Const-me) * Rearrange Q4_1 quantization to work for multipart models. (Fix #152) * Fix ggml_vec_mad_q4_1 too * Fix non-vectorised q4_1 vec mul
2023-03-16Update README.mdGeorgi Gerganov
2023-03-16Expand "Contributing" sectionGeorgi Gerganov
2023-03-16Update hot topics - RMSnormGeorgi Gerganov
2023-03-15Fix RMS norm in GGML (#191)Nebula
2023-03-16Add RMS norm and use it (#187)hoangmit
* add ggml_rms_norm * update op num
2023-03-15fixed typo (#178)moritzbrantner
2023-03-15add SIGINT support for _WIN32 environments (#120)Rickey Bowers Jr
* add SIGINT support for _WIN32 environments * perhaps more consistent
2023-03-15added ctx_size parameter (#148)Justin Suess
* added ctx_size parameter * added it in more places * Apply suggestions from code review --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-15fixed color reset on exit (#149)Justin Suess
* fixed color reset on exit * added sigint handler for ansi_color_reset * Update main.cpp --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-15Fix potential licensing issue (#126)Musab Gultekin
* Update README.md * Update README.md remove facebook
2023-03-15Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)Ronsor
There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.
2023-03-15inline -> static inline for "bytesFromNibbles" (#161)hoangmit
Without "static" prefix, it fails to compile in clang
2023-03-14Don't use vdotq_s32 if it's not available (#139)Ronsor
* Don't use vdotq_s32 if it's not available `dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available. Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined. * Update ggml.c --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
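The feature check described here can be sketched as a compile-time guard around the intrinsic. The function name `dot_i8x16` is hypothetical, and the fallback below is a plain scalar loop chosen for portability, not the NEON code that the commit reintroduces.

```cpp
#include <cstdint>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper: dot product of two 16-element int8 vectors.
// Uses vdotq_s32 only when the compiler reports the dotprod extension;
// otherwise falls back to a scalar loop (an assumption for this sketch).
int32_t dot_i8x16(const int8_t* a, const int8_t* b) {
#if defined(__aarch64__) && defined(__ARM_NEON) && defined(__ARM_FEATURE_DOTPROD)
    const int32x4_t acc = vdotq_s32(vdupq_n_s32(0), vld1q_s8(a), vld1q_s8(b));
    return vaddvq_s32(acc); // horizontal sum of the four partial sums
#else
    int32_t sum = 0;
    for (int i = 0; i < 16; ++i) {
        sum += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
    }
    return sum;
#endif
}
```

Because the choice is made by the preprocessor, the same source builds on CPUs without `dotprod`, such as the Raspberry Pi 4 mentioned in the commit.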
2023-03-14Add section to README on how to run the project on Android (#130)Radoslav Gerganov
2023-03-14Add Misc section + update hot topics + minor fixesGeorgi Gerganov
2023-03-13Add windows to the CI (#98)Sebastián A
2023-03-13CMake build in Release by default (#75)Georgi Gerganov
2023-03-13Update contribution section, hot topics, limitations, etc.Georgi Gerganov
2023-03-13Print system informationGeorgi Gerganov
2023-03-13Initial support for CMake (#75)Sebastián A
2023-03-13Add NetBSD support. (#90)Thomas Klausner
2023-03-13Use fprintf for diagnostic output (#48)Pavol Rusnak
Keep printf only for printing model output. One can now use ./main ... 2>/dev/null to suppress any diagnostic output.
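The split between the two streams might look like the following sketch. Both helper names are hypothetical; the only point taken from the commit is that diagnostics go to stderr while model output stays on stdout, so redirecting stderr hides the former without touching the latter.

```cpp
#include <cstdio>

// Hypothetical helper: diagnostics go to stderr, one message per line.
// Returns the number of characters written, as reported by fprintf.
int print_diagnostic(const char* msg) {
    return std::fprintf(stderr, "%s\n", msg);
}

// Hypothetical helper: model output goes to stdout and is flushed at once
// so generated text appears without waiting for the buffer to fill.
int print_model_output(const char* text) {
    const int written = std::fprintf(stdout, "%s", text);
    std::fflush(stdout);
    return written;
}
```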
2023-03-13Use vdotq_s32 to improve performance (#67)Georgi Gerganov
* 10% performance boost on ARM * Back to original change