Age | Commit message | Author
2023-03-15 | inline -> static inline for "bytesFromNibbles" (#161) | hoangmit
    Without "static" prefix, it fails to compile in clang
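For context, a minimal C sketch of why the `static` prefix matters. Under C99 rules, a plain `inline` definition does not provide an external definition, so a call the compiler decides not to inline can end up as an unresolved symbol; `static inline` gives the function internal linkage and avoids that. The helper below, `unpack_nibbles`, is a hypothetical scalar stand-in and not the actual `bytesFromNibbles` from the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar helper (assumption: not the real bytesFromNibbles).
 * `static inline` gives it internal linkage, so the compiler can always
 * fall back to a local out-of-line copy if it chooses not to inline. */
static inline void unpack_nibbles(const uint8_t *packed, size_t n, uint8_t *out) {
    for (size_t i = 0; i < n; ++i) {
        out[2 * i]     = packed[i] & 0x0F; /* low nibble  */
        out[2 * i + 1] = packed[i] >> 4;   /* high nibble */
    }
}
```

Each input byte yields two output bytes, so `out` must have room for `2 * n` elements.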
2023-03-14 | Don't use vdotq_s32 if it's not available (#139) | Ronsor
    * Don't use vdotq_s32 if it's not available
      `dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available. Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined.
    * Update ggml.c
    ---------
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
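The availability check described in this commit can be sketched as a compile-time guard on `__ARM_FEATURE_DOTPROD`. This is only an illustration: `dot_i8` is a hypothetical function, and a plain scalar loop stands in for the fallback path. The actual fallback code in ggml.c is not reproduced here.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical int8 dot product (assumption: not a function from ggml.c). */
static int32_t dot_i8(const int8_t *a, const int8_t *b, int n) {
    int32_t sum = 0;
    int i = 0;
#if defined(__ARM_NEON) && defined(__ARM_FEATURE_DOTPROD)
    /* Fast path: only compiled when the dotprod extension is available. */
    int32x4_t acc = vdupq_n_s32(0);
    for (; i + 16 <= n; i += 16) {
        acc = vdotq_s32(acc, vld1q_s8(a + i), vld1q_s8(b + i));
    }
    sum = vgetq_lane_s32(acc, 0) + vgetq_lane_s32(acc, 1)
        + vgetq_lane_s32(acc, 2) + vgetq_lane_s32(acc, 3);
#endif
    /* Scalar path: handles everything without dotprod, and the tail otherwise. */
    for (; i < n; ++i) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}
```

Because the guard is evaluated at compile time, a binary built without dotprod enabled never contains the instruction, which is what avoids illegal-instruction faults on CPUs such as the one in the Raspberry Pi 4.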
2023-03-14 | Add section to README on how to run the project on Android (#130) | Radoslav Gerganov
2023-03-14 | Add Misc section + update hot topics + minor fixes | Georgi Gerganov
2023-03-13 | Add windows to the CI (#98) | Sebastián A
2023-03-13 | CMake build in Release by default (#75) | Georgi Gerganov
2023-03-13 | Update contribution section, hot topics, limitations, etc. | Georgi Gerganov
2023-03-13 | Print system information | Georgi Gerganov
2023-03-13 | Initial support for CMake (#75) | Sebastián A
2023-03-13 | Add NetBSD support. (#90) | Thomas Klausner
2023-03-13 | Use fprintf for diagnostic output (#48) | Pavol Rusnak
    keep printf only for printing model output
    one can now use ./main ... 2>/dev/null to suppress any diagnostic output
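The split between model output and diagnostics can be illustrated with a small C sketch. Both helper names, `print_token` and `log_diag`, are hypothetical and chosen for illustration; they are not functions from the repository.

```c
#include <stdio.h>

/* Hypothetical helper: model output goes to stdout so it can be piped or captured. */
static int print_token(const char *token) {
    return printf("%s", token);
}

/* Hypothetical helper: diagnostics go to stderr so they can be silenced separately. */
static int log_diag(const char *msg) {
    return fprintf(stderr, "%s\n", msg);
}
```

With this separation, redirecting the second stream (`2>/dev/null`) drops everything written by `log_diag` while leaving the output of `print_token` untouched.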
2023-03-13 | Use vdotq_s32 to improve performance (#67) | Georgi Gerganov
    * 10% performance boost on ARM
    * Back to original change
2023-03-13 | Reduce model loading time (#43) | uint256_t
    * Use buffering
    * Use vector
    * Minor
    ---------
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-13 | Fix UTF-8 handling (including colors) (#79) | Val Kharitonov
2023-03-13 | Add quantize script for batch quantization (#92) | Pavol Rusnak
    * Add quantize script for batch quantization
    * Indentation
    * README for new quantize.sh
    * Fix script name
    * Fix file list on Mac OS
    ---------
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-13 | Add initial contribution guidelines | Georgi Gerganov
2023-03-13 | Gate signal support on being on a unixoid system. (#74) | Matvey Soloviev
2023-03-13 | Fix token count accounting | Matvey Soloviev
2023-03-13 | Revert "10% performance boost on ARM" | Georgi Gerganov
    This reverts commit 113a9e83ebc0f788f861394437087bf3ca0e019b.
    There are some reports of illegal instructions. Moved this stuff to the vdotq_s32 branch until resolved.
2023-03-13 | Check for vdotq_s32 availability | Georgi Gerganov
2023-03-13 | Ammend to previous commit - forgot to update non-QRDMX branch | Georgi Gerganov
2023-03-13 | 10% performance boost on ARM | Georgi Gerganov
2023-03-13 | Fix color getting reset before prompt output done (#65) | Matvey Soloviev
    (cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)
2023-03-12 | Update README.md | Georgi Gerganov
2023-03-12 | Add interactive mode (#61) | Matvey Soloviev
    * Initial work on interactive mode.
    * Improve interactive mode. Make rev. prompt optional.
    * Update README to explain interactive mode.
    * Fix OS X build
2023-03-12 | Fix typo in README (#45) | Marc Köhlbrugge
2023-03-12 | Allow using prompt files (#59) | Ben Garney
2023-03-12 | Add back top_k (#56) | beiller
    * Add back top_k
    * Update utils.cpp
    * Update utils.h
    ---------
    Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com>
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-12 | Windows fixes (#31) | Sebastián A
    * Apply fixes suggested to build on windows
      Issue: https://github.com/ggerganov/llama.cpp/issues/22
    * Remove unsupported VLAs
    * MSVC: Remove features that are only available on MSVC C++20.
    * Fix zero initialization of the other fields.
    * Change the use of vector for stack allocations.
2023-03-12 | Update README.md | Georgi Gerganov
2023-03-12 | Add CI (#60) | Georgi Gerganov
2023-03-12 | Revert "weights_only" arg - this causing more trouble than help | Georgi Gerganov
2023-03-12 | python/pytorch compat notes (#44) | Oleksandr Nikitin
2023-03-12 | Add repetition penalty (#20) | beiller
    * Adding repeat penalization
    * Update utils.h
    * Update utils.cpp
    * Numeric fix
      Should probably still scale by temp even if penalized
    * Update comments, more proper application
      I see that numbers can go negative so a fix from a referenced commit
    * Minor formatting
    ---------
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
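The two notes in this commit (still scale by temperature when penalizing, and handle negative values) can be illustrated with a minimal C sketch. It shows one common formulation of a repetition penalty under stated assumptions; the function name `apply_repeat_penalty` and its parameters are hypothetical, and the code in utils.cpp may differ in detail.

```c
#include <stddef.h>

/* Hypothetical sketch: scale every logit by 1/temp, and additionally
 * penalize tokens that appear in the recent-token window. */
static void apply_repeat_penalty(float *logits, size_t n_vocab,
                                 const int *recent, size_t n_recent,
                                 float penalty, float temp) {
    const float scale = 1.0f / temp;
    for (size_t i = 0; i < n_vocab; ++i) {
        int seen = 0;
        for (size_t j = 0; j < n_recent; ++j) {
            if (recent[j] == (int)i) { seen = 1; break; }
        }
        float v = logits[i] * scale; /* temperature applies even when penalized */
        if (seen) {
            /* Dividing a negative logit would raise it, so multiply instead. */
            v = (v < 0.0f) ? v * penalty : v / penalty;
        }
        logits[i] = v;
    }
}
```

With `penalty` greater than 1, a recently seen token always ends up with a lower logit than it would otherwise have, whatever the sign of the original value.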
2023-03-12 | Clarify meaning of hacking | Georgi Gerganov
2023-03-12 | README: add "Supported platforms" + update hot topics | Georgi Gerganov
2023-03-12 | use weights_only in conversion script (#32) | deepdiffuser
    this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries
2023-03-12 | Add LICENSE (#21) | Pavol Rusnak
2023-03-12 | Update README.md | Georgi Gerganov
2023-03-11 | Fix a typo in model name (#16) | Juraj Bednar
2023-03-11 | Update README.md | Georgi Gerganov
2023-03-11 | Add AVX2 support for x86 architectures thanks to @Const-me ! | Georgi Gerganov
2023-03-11 | Fix un-initialized FP16 tables on x86 (#15, #2) | Georgi Gerganov
2023-03-11 | Bump memory buffer | Georgi Gerganov
2023-03-11 | Update README.md | Georgi Gerganov
2023-03-11 | .gitignore models/ | Georgi Gerganov
2023-03-11 | Update Makefile var + add comment | Georgi Gerganov
2023-03-11 | Update README.md | Georgi Gerganov
2023-03-11 | Update README.md | Georgi Gerganov
2023-03-11 | Support all LLaMA models + change Q4_0 quantization storage | Georgi Gerganov