Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
* Refactor get_n_parts function to simplify code and improve readability
* Use f-strings instead of concatenation
* Refactoring: more concise and readable
* modularize
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
|
|
|
|
Also start adding prompts in "./prompts"
|
|
I think this is what is used in the Python code
|
|
LLaMA doesn't support more than 2048 token context sizes, and going above that produces terrible results.
|
|
|
|
|
|
|
|
|
|
fixed warning with std::ignore about unused function result
|
|
This causes long prompts to parse very slowly.
|
|
* CI Improvements
Manual build feature, autoreleases for Windows
* better CI naming convention
use branch name in releases and tags
|
|
* Nix flake
* Nix: only add Accelerate framework on macOS
* Nix: development shel, direnv and compatibility
* Nix: use python packages supplied by withPackages
* Nix: remove channel compatibility
* Nix: fix ARM neon dotproduct on macOS
---------
Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
|
|
* Implement non-greedy tokenizer that tries to maximize token lengths
* Insert single space in front of the prompt
- this is to match original llama tokenizer behavior
---------
Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
|
|
|
|
|
|
The readme tells people to use the command line option "-t 8", causing 8
threads to be started. On systems with fewer than 8 cores, this causes a
significant slowdown. Remove the option from the example command lines
and use /proc/cpuinfo on Linux to determine a sensible default.
|
|
* add ptread link to fix cmake build under linux
* add cmake to linux and macos platform
* separate make and cmake workflow
---------
Co-authored-by: SebastiƔn A <sebastian.aedo29@gmail.com>
|
|
* feat: dockerize llamacpp
* feat: split build & runtime stages
* split dockerfile into main & tools
* add quantize into tool docker image
* Update .devops/tools.sh
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* add docker action pipeline
* change CI to publish at github docker registry
* fix name runs-on macOS-latest is macos-latest (lowercase)
* include docker versioned images
* fix github action docker
* fix docker.yml
* feat: include all-in-one command tool & update readme.md
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* Add AVX2 version of ggml_vec_dot_q4_1
* Small optimisations to q4_1 dot product (@Const-me)
* Rearrange Q4_1 quantization to work for multipart models. (Fix #152)
* Fix ggml_vec_mad_q4_1 too
* Fix non-vectorised q4_1 vec mul
|
|
|
|
|
|
|
|
|
|
* add ggml_rms_norm
* update op num
|
|
|
|
* add SIGINT support for _WIN32 environments
* perhaps more consistent
|
|
* added ctx_size parameter
* added it in more places
* Apply suggestions from code review
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* fixed color reset on exit
* added sigint handler for ansi_color_reset
* Update main.cpp
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* Update README.md
* Update README.md
remove facebook
|
|
convert-pth-to-ggml.py (#142)
There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.
|
|
Without "static" prefix, it fails to compile in clang
|
|
* Don't use vdotq_s32 if it's not available
`dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available.
Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined.
* Update ggml.c
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
keep printf only for printing model output
one can now use ./main ... 2>dev/null to suppress any diagnostic output
|
|
* 10% performance boost on ARM
* Back to original change
|
|
* Use buffering
* Use vector
* Minor
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
|
|
* Add quantize script for batch quantization
* Indentation
* README for new quantize.sh
* Fix script name
* Fix file list on Mac OS
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|