Age | Commit message | Author | |
---|---|---|---|
2023-03-13 | Print system information | Georgi Gerganov | |
2023-03-13 | Initial support for CMake (#75) | Sebastián A | |
2023-03-13 | Add NetBSD support. (#90) | Thomas Klausner | |
2023-03-13 | Use fprintf for diagnostic output (#48) | Pavol Rusnak | |
Keep printf only for printing model output. One can now use ./main ... 2>/dev/null to suppress any diagnostic output. | |||
2023-03-13 | Use vdotq_s32 to improve performance (#67) | Georgi Gerganov | |
* 10% performance boost on ARM * Back to original change | |||
2023-03-13 | Reduce model loading time (#43) | uint256_t | |
* Use buffering * Use vector * Minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> | |||
2023-03-13 | Fix UTF-8 handling (including colors) (#79) | Val Kharitonov | |
2023-03-13 | Add quantize script for batch quantization (#92) | Pavol Rusnak | |
* Add quantize script for batch quantization * Indentation * README for new quantize.sh * Fix script name * Fix file list on Mac OS --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> | |||
2023-03-13 | Add initial contribution guidelines | Georgi Gerganov | |
2023-03-13 | Gate signal support on being on a unixoid system. (#74) | Matvey Soloviev | |
2023-03-13 | Fix token count accounting | Matvey Soloviev | |
2023-03-13 | Revert "10% performance boost on ARM" | Georgi Gerganov | |
This reverts commit 113a9e83ebc0f788f861394437087bf3ca0e019b. There are some reports of illegal instruction errors. Moved this change to the vdotq_s32 branch until resolved. | |||
2023-03-13 | Check for vdotq_s32 availability | Georgi Gerganov | |
2023-03-13 | Amend to previous commit - forgot to update non-QRDMX branch | Georgi Gerganov | |
2023-03-13 | 10% performance boost on ARM | Georgi Gerganov | |
2023-03-13 | Fix color getting reset before prompt output done (#65) | Matvey Soloviev | |
(cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6) | |||
2023-03-12 | Update README.md | Georgi Gerganov | |
2023-03-12 | Add interactive mode (#61) | Matvey Soloviev | |
* Initial work on interactive mode. * Improve interactive mode. Make rev. prompt optional. * Update README to explain interactive mode. * Fix OS X build | |||
2023-03-12 | Fix typo in README (#45) | Marc Köhlbrugge | |
2023-03-12 | Allow using prompt files (#59) | Ben Garney | |
2023-03-12 | Add back top_k (#56) | beiller | |
* Add back top_k * Update utils.cpp * Update utils.h --------- Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> | |||
2023-03-12 | Windows fixes (#31) | Sebastián A | |
* Apply fixes suggested to build on Windows. Issue: https://github.com/ggerganov/llama.cpp/issues/22 * Remove unsupported VLAs * MSVC: Remove features that are only available on MSVC C++20. * Fix zero initialization of the other fields. * Change the use of vector for stack allocations. | |||
2023-03-12 | Update README.md | Georgi Gerganov | |
2023-03-12 | Add CI (#60) | Georgi Gerganov | |
2023-03-12 | Revert "weights_only" arg - this is causing more trouble than help | Georgi Gerganov | |
2023-03-12 | python/pytorch compat notes (#44) | Oleksandr Nikitin | |
2023-03-12 | Add repetition penalty (#20) | beiller | |
* Adding repeat penalization * Update utils.h * Update utils.cpp * Numeric fix: should probably still scale by temp even if penalized * Update comments, more proper application: numbers can go negative, so apply a fix from a referenced commit * Minor formatting --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> | |||
2023-03-12 | Clarify meaning of hacking | Georgi Gerganov | |
2023-03-12 | README: add "Supported platforms" + update hot topics | Georgi Gerganov | |
2023-03-12 | use weights_only in conversion script (#32) | deepdiffuser | |
This prevents malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries. | |||
2023-03-12 | Add LICENSE (#21) | Pavol Rusnak | |
2023-03-12 | Update README.md | Georgi Gerganov | |
2023-03-11 | Fix a typo in model name (#16) | Juraj Bednar | |
2023-03-11 | Update README.md | Georgi Gerganov | |
2023-03-11 | Add AVX2 support for x86 architectures thanks to @Const-me ! | Georgi Gerganov | |
2023-03-11 | Fix un-initialized FP16 tables on x86 (#15, #2) | Georgi Gerganov | |
2023-03-11 | Bump memory buffer | Georgi Gerganov | |
2023-03-11 | Update README.md | Georgi Gerganov | |
2023-03-11 | .gitignore models/ | Georgi Gerganov | |
2023-03-11 | Update Makefile var + add comment | Georgi Gerganov | |
2023-03-11 | Update README.md | Georgi Gerganov | |
2023-03-11 | Update README.md | Georgi Gerganov | |
2023-03-11 | Support all LLaMA models + change Q4_0 quantization storage | Georgi Gerganov | |
2023-03-11 | Include Python dependencies in README (#6) | Simon Willison | |
2023-03-11 | Update README.md | Georgi Gerganov | |
2023-03-11 | Update README.md | Georgi Gerganov | |
2023-03-11 | Update README.md | Georgi Gerganov | |
2023-03-11 | Add missing headers for memcpy and assert (#3) | Jean-Michaël Celerier | |
2023-03-11 | Update README.md | Georgi Gerganov | |
2023-03-11 | Update README.md | Georgi Gerganov | |