llama.cpp.git - llama.cpp

Age	Commit message	Author
2023-03-21	Check for reverse prompt by characters instead of tokens (#292) (#330)	tjohnman
	* Check for reverse prompt by characters instead of tokens (#292)
	* Update main.cpp Wording.
	* Cleanup.
	* Remove unnecessary use of std::stringstream.
	---------
	Co-authored-by: Johnman <tjohnman@github>
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
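The idea named in this commit, matching a reverse prompt against the generated characters rather than against token IDs, can be sketched as below. This is a minimal illustration, not the code from main.cpp: the function names `ends_with` and `ends_with_any_reverse_prompt` are hypothetical, and the sketch assumes the generated text is accumulated in a single std::string.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from main.cpp): true if `text` ends with `suffix`.
static bool ends_with(const std::string & text, const std::string & suffix) {
    return text.size() >= suffix.size() &&
           text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical check: true if the generated text ends with any of the
// reverse prompts. Comparing characters avoids depending on how the
// tokenizer happened to split the reverse prompt into tokens.
static bool ends_with_any_reverse_prompt(const std::string & generated,
                                         const std::vector<std::string> & reverse_prompts) {
    for (const std::string & rp : reverse_prompts) {
        if (!rp.empty() && ends_with(generated, rp)) {
            return true;
        }
    }
    return false;
}
```

A caller would append each newly decoded token's text to `generated` and hand control back to the user once the check returns true.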
2023-03-21	Fix convert script, warnings alpaca instructions, default params	Georgi Gerganov

2023-03-21	cmdline option for custom amount of model parts (--n_parts N) (#348)	anzz1
	* cmdline option for custom amount of model parts (--n_parts N)
	* Update main.cpp
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-21	Add tokenizer test + revert to C++11 (#355)	Georgi Gerganov
	* Add test-tokenizer-0 to do a few tokenizations - feel free to expand
	* Added option to convert-pth-to-ggml.py script to dump just the vocabulary
	* Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
	* Added utility to load vocabulary file from previous point (temporary implementation)
	* Avoid using std::string_view and drop back to C++11 (hope I didn't break something)
	* Rename gpt_vocab -> llama_vocab
	* All CMake binaries go into ./bin/ now
2023-03-20	move file magic/version to header, print expected version (#319)	Mack Straight

2023-03-20	sentencepiece bpe compatible tokenizer (#252)	Mack Straight
	* potential out of bounds read
	* fix quantize
	* style
	* Update convert-pth-to-ggml.py
	* mild cleanup
	* don't need the space-prefixing here rn since main.cpp already does it
	* new file magic + version header field
	* readme notice
	* missing newlines
	Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
2023-03-19	bugfix: default should not be interactive (#304)	cocktailpeanut

2023-03-19	fix coloring of last `n_batch` of prompt, and refactor line input (#221)	Rickey Bowers Jr
	* fix coloring of last `n_batch` of prompt, and refactor line input
	* forgot the newline that needs to be sent to the model
	* (per #283) try to force flush of color reset in SIGINT handler
2023-03-19	Support for multiple reverse prompts. (#299)	tjohnman
	Co-authored-by: Johnman <>
	Co-authored-by: Johnman <tjohnman@github>
2023-03-19	Make prompt randomization optional. (#300)	tjohnman
	Co-authored-by: Johnman <>
2023-03-19	Respect the maximum number of tokens in interactive. (#298)	tjohnman
	Co-authored-by: Johnman <johnman@github>
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-19	Add --ignore-eos parameter (#181)	slaren
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-19	interactive mode: print '\n' in sigint_handler, this flush stdout thus ensure color reset. (#283)	Qingyou Meng

2023-03-19	Command line switch to use F16 for memory_k and memory_v (refactor of #154) (#294)	Erik Scholz
	* Use F16 for memory_k and memory_v
	* add command line switch to use f16 instead of f32 for memory k+v
	---------
	Co-authored-by: Ty Everett <ty@tyweb.us>
2023-03-19	Fix off-by-one bug (#115)	Georgi Gerganov

2023-03-19	Drop trailing new line from file prompts (#80)	Georgi Gerganov

2023-03-19	Add "--instruct" argument for usage with Alpaca (#240)	Georgi Gerganov
	Also start adding prompts in "./prompts"
2023-03-18	Warn user if a context size greater than 2048 tokens is specified (#274)	Ronsor
	LLaMA doesn't support more than 2048 token context sizes, and going above that produces terrible results.
2023-03-18	Remove unused code since n_vocab is model.hparams.n_vocab (#262)	Alex Nguyen

2023-03-18	fixed warning with std::ignore about unused function result (#151)	Justin Suess
	fixed warning with std::ignore about unused function result
2023-03-17	Implement non-greedy tokenizer that tries to maximize token lengths (#242)	thement
	* Implement non-greedy tokenizer that tries to maximize token lengths
	* Insert single space in front of the prompt - this is to match original llama tokenizer behavior
	---------
	Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
2023-03-16	Add RMS norm and use it (#187)	hoangmit
	* add ggml_rms_norm
	* update op num
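For readers unfamiliar with the operation this commit introduces, RMS normalization divides each element of a vector by the root mean square of the whole vector. The sketch below shows the general technique only; it is not the ggml_rms_norm implementation, the function name `rms_norm` is hypothetical, and the `eps` default is an assumed value chosen for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone RMS norm: y[i] = x[i] / sqrt(mean(x^2) + eps).
// `eps` guards against division by zero; its default here is an assumption.
static std::vector<float> rms_norm(const std::vector<float> & x, float eps = 1e-6f) {
    std::vector<float> y(x.size());
    if (x.empty()) {
        return y;
    }
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum_sq += (double) x[i] * x[i];
    }
    const float mean_sq = (float) (sum_sq / x.size());
    const float scale   = 1.0f / std::sqrt(mean_sq + eps);
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = x[i] * scale;
    }
    return y;
}
```

Unlike layer normalization, this variant does not subtract the mean before scaling.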
2023-03-15	add SIGINT support for _WIN32 environments (#120)	Rickey Bowers Jr
	* add SIGINT support for _WIN32 environments
	* perhaps more consistent
2023-03-15	added ctx_size parameter (#148)	Justin Suess
	* added ctx_size parameter
	* added it in more places
	* Apply suggestions from code review
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-15	fixed color reset on exit (#149)	Justin Suess
	* fixed color reset on exit
	* added sigint handler for ansi_color_reset
	* Update main.cpp
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-13	Print system information	Georgi Gerganov

2023-03-13	Use fprintf for diagnostic output (#48)	Pavol Rusnak
	keep printf only for printing model output
	one can now use ./main ... 2>/dev/null to suppress any diagnostic output
2023-03-13	Reduce model loading time (#43)	uint256_t
	* Use buffering
	* Use vector
	* Minor
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-13	Fix UTF-8 handling (including colors) (#79)	Val Kharitonov

2023-03-13	Gate signal support on being on a unixoid system. (#74)	Matvey Soloviev

2023-03-13	Fix token count accounting	Matvey Soloviev

2023-03-13	Fix color getting reset before prompt output done (#65)	Matvey Soloviev
	(cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)
2023-03-12	Add interactive mode (#61)	Matvey Soloviev
	* Initial work on interactive mode.
	* Improve interactive mode. Make rev. prompt optional.
	* Update README to explain interactive mode.
	* Fix OS X build
2023-03-12	Add back top_k (#56)	beiller
	* Add back top_k
	* Update utils.cpp
	* Update utils.h
	---------
	Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com>
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-12	Windows fixes (#31)	Sebastián A
	* Apply fixes suggested to build on windows
	  Issue: https://github.com/ggerganov/llama.cpp/issues/22
	* Remove unsupported VLAs
	* MSVC: Remove features that are only available on MSVC C++20.
	* Fix zero initialization of the other fields.
	* Change the use of vector for stack allocations.
2023-03-12	Add repetition penalty (#20)	beiller
	* Adding repeat penalization
	* Update utils.h
	* Update utils.cpp
	* Numeric fix
	  Should probably still scale by temp even if penalized
	* Update comments, more proper application
	  I see that numbers can go negative so a fix from a referenced commit
	* Minor formatting
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
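The two notes in this commit body (still scale by temperature when a token is penalized, and handle negative values) suggest logic along the following lines. This is a hedged sketch rather than the code from utils.cpp: the function name `penalize_and_scale` and its parameters are hypothetical, and the sketch assumes logits are indexed by token ID.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch: apply temperature scaling to every logit, and
// additionally penalize logits of tokens that appear in `last_tokens`.
// Dividing a negative logit by the penalty would make it MORE likely,
// so negative values are multiplied by the penalty instead.
static std::vector<float> penalize_and_scale(const std::vector<float> & logits,
                                             const std::vector<int> & last_tokens,
                                             float repeat_penalty,
                                             float temp) {
    const float scale = 1.0f / temp;
    std::vector<float> out(logits.size());
    for (std::size_t i = 0; i < logits.size(); ++i) {
        const bool seen = std::find(last_tokens.begin(), last_tokens.end(), (int) i)
                          != last_tokens.end();
        float v = logits[i] * scale; // temperature applies whether or not penalized
        if (seen) {
            v = (v < 0.0f) ? v * repeat_penalty : v / repeat_penalty;
        }
        out[i] = v;
    }
    return out;
}
```

With a penalty greater than 1, both branches push a recently seen token's logit downward, which is the intended effect.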
2023-03-11	Bump memory buffer	Georgi Gerganov

2023-03-11	Support all LLaMA models + change Q4_0 quantization storage	Georgi Gerganov

2023-03-10	Fix a bug in the rope calculation	Georgi Gerganov

2023-03-10	Final touches	Georgi Gerganov

2023-03-10	Initial release	Georgi Gerganov