llama.cpp.git - llama.cpp

Age	Commit message	Author
2023-03-19	Add --ignore-eos parameter (#181)	slaren
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-19	interactive mode: print '\n' in sigint_handler; this flushes stdout and thus ensures color reset. (#283)	Qingyou Meng

2023-03-19	Command line switch to use F16 for memory_k and memory_v (refactor of #154) (#294)	Erik Scholz
	* Use F16 for memory_k and memory_v
	* add command line switch to use f16 instead of f32 for memory k+v
	---------
	Co-authored-by: Ty Everett <ty@tyweb.us>
2023-03-19	Update hot topics to mention Alpaca support	Georgi Gerganov

2023-03-19	Fix off-by-one bug (#115)	Georgi Gerganov

2023-03-19	Fix python stuff (#109)	Georgi Gerganov

2023-03-19	Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109)	qunash
	* Refactor get_n_parts function to simplify code and improve readability
	* Use f-strings instead of concatenation
	* Refactoring: more concise and readable
	* modularize
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-19	Drop trailing new line from file prompts (#80)	Georgi Gerganov

2023-03-19	Add instruction for using Alpaca (#240)	Georgi Gerganov

2023-03-19	Add "--instruct" argument for usage with Alpaca (#240)	Georgi Gerganov
	Also start adding prompts in "./prompts"
2023-03-19	Change RMSNorm eps to 1e-6 (#173)	Georgi Gerganov
	I think this is what is used in the Python code
2023-03-18	Warn user if a context size greater than 2048 tokens is specified (#274)	Ronsor
	LLaMA doesn't support more than 2048 token context sizes, and going above that produces terrible results.
2023-03-18	Fix typo in readme	Pavol Rusnak

2023-03-18	Add note about Python 3.11 to readme	Pavol Rusnak

2023-03-18	Add memory/disk requirements to readme	Pavol Rusnak

2023-03-18	Remove unused code since n_vocab is model.hparams.n_vocab (#262)	Alex Nguyen

2023-03-18	fixed warning with std::ignore about unused function result (#151)	Justin Suess
	fixed warning with std::ignore about unused function result
2023-03-18	Fix n^2 loop in tokenization (#254)	Gary Linscott
	This causes long prompts to parse very slowly.
2023-03-18	CI Improvements (#230)	anzz1
	* CI Improvements
	  Manual build feature, autoreleases for Windows
	* better CI naming convention
	  use branch name in releases and tags
2023-03-17	Nix flake (#40)	Niklas Korz
	* Nix flake
	* Nix: only add Accelerate framework on macOS
	* Nix: development shell, direnv and compatibility
	* Nix: use python packages supplied by withPackages
	* Nix: remove channel compatibility
	* Nix: fix ARM neon dotproduct on macOS
	---------
	Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-03-17	Implement non-greedy tokenizer that tries to maximize token lengths (#242)	thement
	* Implement non-greedy tokenizer that tries to maximize token lengths
	* Insert single space in front of the prompt - this is to match original llama tokenizer behavior
	---------
	Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
2023-03-17	Default to 4 threads (#243)	Georgi Gerganov

2023-03-17	Update Contributing section	Georgi Gerganov

2023-03-17	Don't tell users to use a bad number of threads (#243)	Stephan Walter
	The readme tells people to use the command line option "-t 8", causing 8 threads to be started. On systems with fewer than 8 cores, this causes a significant slowdown. Remove the option from the example command lines and use /proc/cpuinfo on Linux to determine a sensible default.
2023-03-17	add pthread link to fix cmake build under linux (#114)	mmyjona
	* add pthread link to fix cmake build under linux
	* add cmake to linux and macos platform
	* separate make and cmake workflow
	---------
	Co-authored-by: Sebastián A <sebastian.aedo29@gmail.com>
2023-03-17	🚀 Dockerize llamacpp (#132)	Bernat Vadell
	* feat: dockerize llamacpp
	* feat: split build & runtime stages
	* split dockerfile into main & tools
	* add quantize into tool docker image
	* Update .devops/tools.sh
	  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
	* add docker action pipeline
	* change CI to publish at github docker registry
	* fix name runs-on macOS-latest is macos-latest (lowercase)
	* include docker versioned images
	* fix github action docker
	* fix docker.yml
	* feat: include all-in-one command tool & update readme.md
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-17	Q4_1 quantization (#193)	Matvey Soloviev
	* Add AVX2 version of ggml_vec_dot_q4_1
	* Small optimisations to q4_1 dot product (@Const-me)
	* Rearrange Q4_1 quantization to work for multipart models. (Fix #152)
	* Fix ggml_vec_mad_q4_1 too
	* Fix non-vectorised q4_1 vec mul
2023-03-16	Update README.md	Georgi Gerganov

2023-03-16	Expand "Contributing" section	Georgi Gerganov

2023-03-16	Update hot topics - RMSnorm	Georgi Gerganov

2023-03-15	Fix RMS norm in GGML (#191)	Nebula

2023-03-16	Add RMS norm and use it (#187)	hoangmit
	* add ggml_rms_norm
	* update op num
2023-03-15	fixed typo (#178)	moritzbrantner

2023-03-15	add SIGINT support for _WIN32 environments (#120)	Rickey Bowers Jr
	* add SIGINT support for _WIN32 environments
	* perhaps more consistent
2023-03-15	added ctx_size parameter (#148)	Justin Suess
	* added ctx_size parameter
	* added it in more places
	* Apply suggestions from code review
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-15	fixed color reset on exit (#149)	Justin Suess
	* fixed color reset on exit
	* added sigint handler for ansi_color_reset
	* Update main.cpp
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-15	Fix potential licensing issue (#126)	Musab Gultekin
	* Update README.md
	* Update README.md
	  remove facebook
2023-03-15	Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)	Ronsor
	There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.
2023-03-15	inline -> static inline for "bytesFromNibbles" (#161)	hoangmit
	Without "static" prefix, it fails to compile in clang
2023-03-14	Don't use vdotq_s32 if it's not available (#139)	Ronsor
	* Don't use vdotq_s32 if it's not available
	  `dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available. Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined.
	* Update ggml.c
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-14	Add section to README on how to run the project on Android (#130)	Radoslav Gerganov

2023-03-14	Add Misc section + update hot topics + minor fixes	Georgi Gerganov

2023-03-13	Add windows to the CI (#98)	Sebastián A

2023-03-13	CMake build in Release by default (#75)	Georgi Gerganov

2023-03-13	Update contribution section, hot topics, limitations, etc.	Georgi Gerganov

2023-03-13	Print system information	Georgi Gerganov

2023-03-13	Initial support for CMake (#75)	Sebastián A

2023-03-13	Add NetBSD support. (#90)	Thomas Klausner

2023-03-13	Use fprintf for diagnostic output (#48)	Pavol Rusnak
	keep printf only for printing model output
	one can now use ./main ... 2>/dev/null to suppress any diagnostic output
2023-03-13	Use vdotq_s32 to improve performance (#67)	Georgi Gerganov
	* 10% performance boost on ARM
	* Back to original change