llama.cpp.git - llama.cpp

Age	Commit message	Author
2023-05-19	main : make reverse prompt option act as a stop token in non-interactive mode (#1032)	Jason McCartney
	* Make reverse prompt option act as a stop token in non-interactive scenarios
	* Making requested review changes
	* Update gpt_params_parse and fix a merge error
	* Revert "Update gpt_params_parse and fix a merge error"
	  This reverts commit 2bb2ff1748513591ad45b175a75ed1d8089d84c8.
	* Update gpt_params_parse and fix a merge error take 2
2023-05-18	Fixes #1511 lambda issue for w64devkit (mingw) (#1513)	DannyDaemonic
	* Fix for w64devkit and mingw
2023-05-16	define default model path once, sync path with readme (#1366)	András Salamon

2023-05-12	llama : fix --mtest option (close #1414)	Georgi Gerganov

2023-05-10	main : add option to save full output to session (#1338)	Evan Jones
	* main : add option to save full output to session
	* split behavior into --session and --prompt-cache
	* restore original implementation with new names
	* PR comments
	* move the check for incompatible parameters to gpt_params_parse
	* Fix whitespace
	Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>
2023-05-08	Interface improvements and `--multiline-input` (previously `--author-mode`) (#1040)	DannyDaemonic
	* Interface improvements
	* Multiline input
	* Track character width
	* Works with all characters and control codes + Windows console fixes
2023-05-08	llama : require first token to be BOS (#1303)	Georgi Gerganov
	* llama : require first token to be BOS
	* scripts : add ppl-run-all.sh
	* perplexity : add BOS for each chunk
	* readme : update perplexity values after BOS fix
	* perplexity : add clarifying comments
2023-05-06	Remove default arguments from sampling functions (#1343)	Jed Fox

2023-05-04	main : add --in-suffix option (#1318)	44670
	* adding --in-suffix option
	* print input suffix before generation
2023-05-04	Only escape prompts when used with `-e` (#1311)	DannyDaemonic

2023-05-04	Update main's README.md with new features (#1296)	DannyDaemonic

2023-05-04	fix #1224 reverse prompt and multi line (#1297)	Tomas
	* fix reverse prompt and multi line
	* Code Formatting
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-05-02	Handle signals properly on Windows (#1123)	DannyDaemonic

2023-05-02	examples : add llama_init_from_gpt_params() common function (#1290)	Ron Evans
	Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02	examples : improve vertical alignment of a few variables (#1286)	Ron Evans
	Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02	llama : allow 0 as a seed number. (#1275)	Robert Brisita

2023-05-02	main : switch input_noecho to input_echo to remove negation (#979)	Ron Evans
	Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-01	Add git-based build information for better issue tracking (#1232)	DannyDaemonic
	* Add git-based build information for better issue tracking
	* macOS fix
	* "build (hash)" and "CMAKE_SOURCE_DIR" changes
	* Redo "CMAKE_CURRENT_SOURCE_DIR" and clearer build messages
	* Fix conditional dependency on missing target
	* Broke out build-info.cmake, added find_package fallback, added build info to all examples, and added dependencies to Makefile
	* 4 space indenting for cmake, attempt to clean up my mess in Makefile
	* Short hash, less fancy Makefile, and don't modify build-info.h if it wouldn't change it
2023-05-01	llama : fix session load / save (#1263)	Georgi Gerganov

2023-04-29	common : change default parameters to pre-#1126 (#1223)	Georgi Gerganov

2023-04-29	llama : new sampling algorithms (#1126)	Ivan Stepanov
	* Sample interface, new samplers.
	  New samplers:
	  - locally typical sampling
	  - tail free sampling
	  - frequency and presence penalty
	  - mirostat
	  Ignore EOS fix: -inf should be used.
	* mirostat
	* Added --logit-bias and --no-penalize-nl, removed std::span
	* Use C++11, clarify llama API documentation, rename Mirostat parameters to --mirostat_lr and --mirostat_ent, add temperature sampling for Mirostat, simplify Mirostat sampling API parameters (removed N and k)
	* Save and load example adjust
	* Tests
	* Windows build fix
	* Windows test fix
2023-04-28	llama : add session file format and saved sessions in main (#1169)	Evan Jones

2023-04-24	examples/main README improvements and some light refactoring (#1131)	mgroeber9110

2023-04-23	Fix LoRA acronym (#1145)	slaren

2023-04-23	Added README.md for main with examples and explanations (#1139)	DannyDaemonic

2023-04-22	Fix CI: ARM NEON, quantization unit tests, editorconfig (#1122)	Stephan Walter

2023-04-22	llama : print timings on ctrl+c exit (#1021)	wbpxre150
	* print timings on ctrl+c exit
	* remove redundant free memory call.
	* add global pointer to ctx.
2023-04-21	main : evaluate tokens in batches after swapping context (#1014)	Alex Klinkhamer
	* examples : evaluate tokens in batches after swapping context
	* Update examples/main/main.cpp
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-04-17	Add LoRA support (#820)	slaren

2023-04-16	examples: add missing <ctime> include for time() (#1011)	Pavol Rusnak

2023-04-14	Revert "main : alternative instruct mode (Vicuna support, etc.) (#863)" (#982)	Pavol Rusnak
	This reverts commit f4d277ae17247ee51129ef1a9ff74d377cc90b1b.
2023-04-14	main : alternative instruct mode (Vicuna support, etc.) (#863)	Tomáš Pazdiora
	* Add support for configs, add configurable prefixes / suffixes, deprecate instruct mode, add stop prompt
	* Add multiline mode, update text input.
	* bugfix
	* update implementation
	* typos
	* Change --multiline implementation to be toggled by EOF.
	* bugfix
	* default multiline mode
	* add more configs
	* update formating
	* update formatting
	* apply suggestions
2023-04-11	Fix whitespace, add .editorconfig, add GitHub workflow (#883)	Pavol Rusnak

2023-04-11	Windows fixes (#890)	comex
	Mostly for msys2 and mingw64 builds, which are different from each other and different from standard Visual Studio builds. Isn't Windows fun?
	- Define _GNU_SOURCE in more files (it's already used in ggml.c for Linux's sake).
	- Don't use PrefetchVirtualMemory if not building for Windows 8 or later (mingw64 doesn't by default). But warn the user about this situation since it's probably not intended.
	- Check for NOMINMAX already being defined, which it is on mingw64.
	- Actually use the `increment` variable (bug in my `pizza` PR).
	- Suppress unused variable warnings in the fake pthread_create and pthread_join implementations for Windows.
	- (not Windows-related) Remove mention of `asprintf` from comment; `asprintf` is no longer used.
	Fixes #871.
2023-04-10	Rewrite loading code to try to satisfy everyone:	comex
	- Support all three formats (ggml, ggmf, ggjt). (However, I didn't include the hack needed to support GPT4All files without conversion. Those can still be used after converting them with convert.py from my other PR.)
	- Support both mmap and read (mmap is used by default, but can be disabled with `--no-mmap`, and is automatically disabled for pre-ggjt files or on platforms where mmap is not supported).
	- Support multi-file models like before, but automatically determine the number of parts rather than requiring `--n_parts`.
	- Improve validation and error checking.
	- Stop using the per-file type field (f16) entirely in favor of just relying on the per-tensor type/size fields. This has no immediate benefit, but makes it easier to experiment with different formats, and should make it easier to support the new GPTQ-for-LLaMa models in the future (I have some work in progress on that front).
	- Support VirtualLock on Windows (using the same `--mlock` option as on Unix).
	- Indicate loading progress when using mmap + mlock. (Which led me to the interesting observation that on my Linux machine, with a warm file cache, mlock actually takes some time, whereas mmap without mlock starts almost instantly...)
	- To help implement this, move mlock support from ggml to the loading code.
	- madvise/PrefetchVirtualMemory support (based on #740)
	- Switch from ifstream to the `fopen` family of functions to avoid unnecessary copying and, when mmap is enabled, allow reusing the same file descriptor for both metadata reads and mmap (whereas the existing implementation opens the file a second time to mmap).
	- Quantization now produces a single-file output even with multi-file inputs (not really a feature as much as 'it was easier this way').
	Implementation notes: I tried to factor the code into more discrete pieces than before.
	Regarding code style: I tried to follow the code style, but I'm naughty and used a few advanced C++ features repeatedly:
	- Destructors to make it easier to ensure everything gets cleaned up.
	- Exceptions. I don't even usually use exceptions when writing C++, and I can remove them if desired... but here they make the loading code much more succinct while still properly handling a variety of errors, ranging from API calls failing to integer overflow and allocation failure. The exceptions are converted to error codes at the API boundary.
	Co-authored-by: Pavol Rusnak <pavol@rusnak.io> (for the bit I copied from #740)
2023-04-08	fix for windows utf-8 input (#840)	Tomáš Pazdiora
	Use UTF-16 as input on Windows, since UTF-8 does not work and reads multibyte characters as zeros
2023-04-06	Do not crash when it has nothing to say. (#796)	Sergey Alirzaev
	Otherwise observing this in the interactive mode:
	/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/stl_vector.h:1230: reference std::vector<int>::back() [_Tp = int, _Alloc = std::allocator<int>]: Assertion '!this->empty()' failed.
2023-04-03	Windows: reactivate sigint handler after each Ctrl-C (#736)	mgroeber9110

2023-03-28	llama : fix linkage with mingw (#551)	anzz1
	* Revert 7e53955 (#542)
	  Still needs to be fixed properly
	* Fix linking on mingw32
2023-03-28	all : be more strict about converting float to double (#458)	Stephan Walter
	* Be more strict about converting float to double
	* Test equivalence of round, SILU implementations
	  Test module is commented out in CMakeLists.txt because the tests may take a long time, depending on how much the compiler optimizes.
	* Fix softmax in perplexity.cpp
	* all : prefer float over double where appropriate
	* perplexity : add <cmath>
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28	main.cpp fixes, refactoring (#571)	anzz1
	- main: entering empty line passes back control without new input in interactive/instruct modes
	- instruct mode: keep prompt fix
	- instruct mode: duplicate instruct prompt fix
	- refactor: move common console code from main->common
2023-03-27	Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542)	Marco Matthies

2023-03-26	[main] fix infinite generation (-n == -1) (#523)	anzz1

2023-03-26	Exit from interactive mode if input stream is bad (#491)	Harald Fernengel
	Allow exiting the interactive prompt also with CTRL-D on Unix and CTRL-Z on Windows.
2023-03-25	(Windows) Set console to UTF-8 on init (#420)	anzz1
	Sets console codepage to 65001 (CP_UTF8) on start for both input and output, should fix problems with UTF-8 characters.
2023-03-25	Fix colors enabling on WIN32	Georgi Gerganov

2023-03-25	If n_predict == -1, generate forever	Georgi Gerganov

2023-03-25	Infinite generation via context swapping (#71)	Georgi Gerganov

2023-03-25	Overhaul the examples structure	Georgi Gerganov
	- main -> examples
	- utils -> examples (renamed to "common")
	- quantize -> examples
	- separate tools for "perplexity" and "embedding"
	Hope I didn't break something !