Age | Commit message | Author

* Check for reverse prompt by characters instead of tokens (#292)
* Update main.cpp
Wording.
* Cleanup.
* Remove unnecessary use of std::stringstream.
---------
Co-authored-by: Johnman <tjohnman@github>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
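Checking for the reverse prompt by characters rather than tokens can be sketched as a plain string-suffix comparison (the helper name is hypothetical; main.cpp keeps the generated text and antiprompts in its own state). The point of matching characters is that the model may emit the reverse prompt with a different tokenization than the one produced by tokenizing the antiprompt string up front.

```cpp
#include <string>

// Sketch: compare the tail of the generated text against the reverse
// prompt as characters, so it matches even when the model emits it
// split across different tokens.
bool ends_with_reverse_prompt(const std::string &output, const std::string &antiprompt) {
    if (antiprompt.empty() || output.size() < antiprompt.size()) return false;
    return output.compare(output.size() - antiprompt.size(),
                          antiprompt.size(), antiprompt) == 0;
}
```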
* cmdline option for a custom number of model parts (--n_parts N)
* Update main.cpp
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Add test-tokenizer-0 to do a few tokenizations - feel free to expand
* Added option to convert-pth-to-ggml.py script to dump just the vocabulary
* Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
* Added utility to load vocabulary file from previous point (temporary implementation)
* Avoid using std::string_view and drop back to C++11 (hope I didn't break something)
* Rename gpt_vocab -> llama_vocab
* All CMake binaries go into ./bin/ now
* potential out of bounds read
* fix quantize
* style
* Update convert-pth-to-ggml.py
* mild cleanup
* don't need the space-prefixing here rn since main.cpp already does it
* new file magic + version header field
* readme notice
* missing newlines
Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
* fix coloring of last `n_batch` of prompt, and refactor line input
* forgot the newline that needs to be sent to the model
* (per #283) try to force flush of color reset in SIGINT handler
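The "force flush" in the SIGINT handler can be sketched as below (helper name hypothetical; escape code is the standard ANSI SGR reset). The flush matters because a process killed inside a signal handler may never drain its buffered stdout, leaving the terminal colored.

```cpp
#include <cstdio>

// Sketch: emit the ANSI color-reset sequence and flush immediately,
// so the terminal is restored even if the process exits right after.
void reset_console_color(FILE *stream) {
    std::fputs("\033[0m", stream);
    std::fflush(stream);
}
```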
Co-authored-by: Johnman <>
Co-authored-by: Johnman <tjohnman@github>
Co-authored-by: Johnman <>
Co-authored-by: Johnman <johnman@github>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
ensure color reset. (#283)
* Use F16 for memory_k and memory_v (#294)
* add command line switch to use f16 instead of f32 for memory k+v
---------
Co-authored-by: Ty Everett <ty@tyweb.us>
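The saving from F16 k/v memory follows directly from the cache layout: the cache stores one key vector and one value vector per layer per context position, so halving the bytes per element halves the whole cache. A sketch (the size formula and example dimensions are illustrative, not read from a real model file):

```cpp
#include <cstddef>

// Sketch: total bytes for the k/v cache, assuming n_layer * n_ctx * n_embd
// elements for keys and the same for values.
size_t kv_cache_bytes(size_t n_layer, size_t n_ctx, size_t n_embd, size_t elem_size) {
    return 2 * n_layer * n_ctx * n_embd * elem_size; // keys + values
}
```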
Also start adding prompts in "./prompts"
LLaMA doesn't support context sizes larger than 2048 tokens, and going above that produces terrible results.
fixed warning with std::ignore about unused function result
* Implement non-greedy tokenizer that tries to maximize token lengths
* Insert single space in front of the prompt
- this is to match original llama tokenizer behavior
---------
Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
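One simple way to "maximize token lengths" is longest-prefix matching against the vocabulary. This toy sketch uses a made-up string vocabulary to illustrate the idea only; the real llama_vocab comes from the model file and the actual tokenizer also has to handle token scores and byte fallback.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy sketch: at each position, take the longest vocabulary entry that
// matches, falling back to a single character when nothing matches.
std::vector<std::string> tokenize_longest(const std::string &text,
                                          const std::set<std::string> &vocab) {
    std::vector<std::string> out;
    size_t i = 0;
    while (i < text.size()) {
        size_t best = 1; // single-character fallback
        for (size_t len = text.size() - i; len >= 1; --len) {
            if (vocab.count(text.substr(i, len))) { best = len; break; }
        }
        out.push_back(text.substr(i, best));
        i += best;
    }
    return out;
}
```

A greedy shortest-match tokenizer would split "hello" into "he" + "ll" + "o" with this vocabulary; preferring the longest match yields the single token "hello".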
* add ggml_rms_norm
* update op num
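RMS normalization, which ggml_rms_norm implements as a tensor op, scales a vector by its inverse root-mean-square without subtracting the mean, unlike classic LayerNorm. A reference sketch (the exact epsilon used by ggml is an assumption here):

```cpp
#include <cmath>
#include <vector>

// Sketch: y_i = x_i / sqrt(mean(x^2) + eps). No mean subtraction,
// no learned scale/bias -- those are applied by separate ops.
std::vector<float> rms_norm(const std::vector<float> &x, float eps = 1e-6f) {
    double sum_sq = 0.0;
    for (float v : x) sum_sq += (double)v * v;
    const float scale = 1.0f / std::sqrt((float)(sum_sq / x.size()) + eps);
    std::vector<float> y(x.size());
    for (size_t i = 0; i < x.size(); ++i) y[i] = x[i] * scale;
    return y;
}
```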
* add SIGINT support for _WIN32 environments
* perhaps more consistent
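Windows console programs receive Ctrl+C through SetConsoleCtrlHandler rather than POSIX signal(), so a portable hook needs a preprocessor branch. A sketch under that assumption (helper names hypothetical):

```cpp
#include <csignal>
#if defined (_WIN32)
#include <windows.h>
#endif

// Flag set from the handler; sig_atomic_t writes are async-signal-safe.
static volatile std::sig_atomic_t g_interrupted = 0;

#if defined (_WIN32)
static BOOL WINAPI console_ctrl_handler(DWORD type) {
    if (type == CTRL_C_EVENT) { g_interrupted = 1; return TRUE; }
    return FALSE;
}
#endif

// Install a Ctrl+C hook; returns false if registration failed.
bool install_sigint_handler() {
#if defined (_WIN32)
    return SetConsoleCtrlHandler(console_ctrl_handler, TRUE) != 0;
#else
    return std::signal(SIGINT, [](int) { g_interrupted = 1; }) != SIG_ERR;
#endif
}
```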
* added ctx_size parameter
* added it in more places
* Apply suggestions from code review
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* fixed color reset on exit
* added sigint handler for ansi_color_reset
* Update main.cpp
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
keep printf only for printing model output
one can now use ./main ... 2>/dev/null to suppress any diagnostic output
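The stream split can be sketched as below (helper name hypothetical): model text goes to one stream (stdout in main.cpp) and diagnostics to another (stderr), which is what makes `2>/dev/null` silence only the diagnostics.

```cpp
#include <cstdio>

// Sketch: keep generation output and diagnostics on separate streams
// so the shell can redirect them independently.
void emit_output(FILE *model_stream, FILE *log_stream,
                 const char *model_text, const char *diagnostic) {
    std::fputs(model_text, model_stream); // the generation the user asked for
    std::fputs(diagnostic, log_stream);   // progress/debug info
    std::fflush(model_stream);
}
```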
* Use buffering
* Use vector
* Minor
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
(cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)
* Initial work on interactive mode.
* Improve interactive mode. Make rev. prompt optional.
* Update README to explain interactive mode.
* Fix OS X build
* Add back top_k
* Update utils.cpp
* Update utils.h
---------
Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Apply fixes suggested to build on windows
Issue: https://github.com/ggerganov/llama.cpp/issues/22
* Remove unsupported VLAs
* MSVC: Remove features that are only available on MSVC C++20.
* Fix zero initialization of the other fields.
* Change the use of vector for stack allocations.
* Adding repeat penalization
* Update utils.h
* Update utils.cpp
* Numeric fix
Should probably still scale by temp even if penalized
* Update comments, more proper application
Numbers can go negative, so apply the fix from a referenced commit
* Minor formatting
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>