Age  Commit message  Author
2023-03-29rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600)Pavol Rusnak
to match filenames of other converters
2023-03-29Create chat-13B.bat (#592)Thérence
* Create chat-13B.bat
  Same script as chat-13B.sh, but for Windows users. Tested and working on Windows 10/11 v 22H2
* Apply suggestions from code review
---------
Co-authored-by: anzz1 <anzz1@live.com>
2023-03-29readme : fix typosGeorgi Gerganov
2023-03-29readme : add GPT4All instructions (close #588)Georgi Gerganov
2023-03-29py : add GPT4All conversion scriptGeorgi Gerganov
For now: copy-paste. Too much time for me to deduplicate the Python code.
2023-03-29llama : use the same threshold for OpenBLAS and ggml thread limiting (#577)Maël Kerbiriou
2023-03-29add example of re-act pattern (#583)Tobias Lütke
* add example of re-act pattern
* spelling...
* fixed whitespace in reverse prompt issue
2023-03-29Fix GCC warning about binary literal (#595)anzz1
0b10101010 -> 0xAA /* 0b10101010 */
2023-03-29Fix typo in llama.h (#593)anzz1
2023-03-28Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375)anzz1
* Enable Fused-Multiply-Add (FMA) instructions on MSVC
  __FMA__ macro does not exist in MSVC
* Enable F16C/CVT16 vector extensions on MSVC
  __F16C__ macro does not exist in MSVC, but is implied with AVX2/AVX512
* MSVC cvt intrinsics
* Add __SSE3__ macro for MSVC too because why not
  even though it's not currently used for anything when AVX is defined
2023-03-28CI: fix subdirectory path globbing (#546)anzz1
- Changes in subdirectories will now be detected properly
- (Windows-MSVC) AVX512 tests temporarily disabled
2023-03-28llama : fix linkage with mingw (#551)anzz1
* Revert 7e53955 (#542)
  Still needs to be fixed properly
* Fix linking on mingw32
2023-03-28ggml : add AVX2 implementation of quantize_row_q4_1 (#515)slaren
* Add AVX2 implementation of quantize_row_q4_1
* Actually use AVX2
* Make quantize_row_q4_1 static
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28py : add temporary script to convert old ggml files to newer version (#539)thement
Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
2023-03-28py : add capability to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403)Tai Duc Nguyen
2023-03-28ggml : refactor quantized processing functions (#509)Stephan Walter
* Refactor quantized processing functions
* ggml : minor
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547)DooWoong Lee (David)
2023-03-28ci : make ctest verbose, hopefully we see what is wrong with the sanitizerGeorgi Gerganov
2023-03-28tests : free llama context at the end of the testGeorgi Gerganov
2023-03-28all : be more strict about converting float to double (#458)Stephan Walter
* Be more strict about converting float to double
* Test equivalence of round, SILU implementations
  Test module is commented out in CMakeLists.txt because the tests may take a long time, depending on how much the compiler optimizes.
* Fix softmax in perplexity.cpp
* all : prefer float over double where appropriate
* perplexity : add <cmath>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28deploy : add a Package.swift for SwiftPM support (#393)Jed Fox
* Add a Package.swift for SwiftPM support
* Swap from exclusions to allowlist
2023-03-28ggml : introduce structs for the q4 data blocks (#356)Stephan Walter
* Introduce structs for the q4 data blocks
* ggml : rename quant struct variables + fix ARM_NEON
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28gitignore : add "embedding"Georgi Gerganov
2023-03-28Check the existence of f16_model_path_base in quantize.py (#574)dotpy314
Co-authored-by: Jincheng Miao <jincheng.miao@gmail.com>
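A minimal sketch of the kind of existence check this commit describes, assuming the script receives a path in a variable named f16_model_path_base. The helper name require_f16_model and the error message are hypothetical, not taken from quantize.py:

```python
import os


def require_f16_model(f16_model_path_base):
    """Return the path unchanged if it exists, otherwise fail early.

    Hypothetical helper: the real quantize.py may report the problem differently.
    """
    if not os.path.exists(f16_model_path_base):
        raise FileNotFoundError(
            f"f16 model not found: {f16_model_path_base}"
        )
    return f16_model_path_base
```

Failing before any quantization work starts keeps the error message close to its cause.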
2023-03-28Fix usage of F16C intrinsics in AVX code (#563)slaren
* Fix usage of F16C intrinsics in AVX code when F16C is not defined
2023-03-28main.cpp fixes, refactoring (#571)anzz1
- main: entering empty line passes back control without new input in interactive/instruct modes
- instruct mode: keep prompt fix
- instruct mode: duplicate instruct prompt fix
- refactor: move common console code from main->common
2023-03-28Add embedding example to Makefile (#540)RJ Adriaansen
2023-03-27Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542)Marco Matthies
2023-03-26ci: add debug build to sanitizer build matrix (#527)Erik Scholz
2023-03-26Fix undefined variables in debug build, remove unused variables (#531)Stephan Walter
2023-03-26Add support for linux/arm64 platform during Docker Builds (#514)Juan Calderon-Perez
* Add support for linux/arm64 platform
* Add platform to versioned builds
2023-03-26Update README and comments for standalone perplexity tool (#525)Stephan Walter
2023-03-26[main] fix infinite generation (-n == -1) (#523)anzz1
2023-03-26Add logo to README.mdGeorgi Gerganov
2023-03-26Exit from interactive mode if input stream is bad (#491)Harald Fernengel
Allow exiting the interactive prompt also with CTRL-D on Unix and CTRL-Z on Windows.
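The behaviour can be illustrated with a short sketch, written in Python rather than the project's C++ and not taken from main.cpp. It assumes a text stream whose readline() returns an empty string at end of input (which is what CTRL-D on Unix or CTRL-Z on Windows produces); the function name read_interactive_lines is hypothetical:

```python
import io


def read_interactive_lines(stream):
    """Yield user lines until the input stream reaches end of file.

    An empty line from the user arrives as "\n" and is yielded as "";
    only a true end of input (readline() returning "") stops the loop.
    """
    while True:
        line = stream.readline()
        if line == "":  # EOF: leave interactive mode instead of spinning
            break
        yield line.rstrip("\n")


# Usage with an in-memory stream standing in for the console.
lines = list(read_interactive_lines(io.StringIO("hello\n\nworld\n")))
```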
2023-03-26CI: Run other sanitizer builds even if one fails (#511)anzz1
Applies only to sanitizer builds, so they won't be cancelled
2023-03-25Clarify console output in convert-pth-to-ggml.py (#512)jp-x-g
"Processing part 1 of 3" instead of "Processing part 0"
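A sketch of the message change, assuming the script loops over a zero-based part index; the function name progress_message is hypothetical and not taken from convert-pth-to-ggml.py:

```python
def progress_message(part_index, n_parts):
    """Format a 1-based progress line from a 0-based part index."""
    return f"Processing part {part_index + 1} of {n_parts}"
```

Showing the total alongside a 1-based counter lets the user tell how far along the conversion is.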
2023-03-25CMake / CI additions (#497)anzz1
* CMake: Add AVX512 option
* CI: Add AVX/AVX512 builds (Windows)
  (AVX512 tests can only be run when the worker happens to support it, building works anyway)
* CMake: Fix sanitizer linkage ( merged #468 )
* CI: Add sanitizer builds (Ubuntu)
* CI: Fix release tagging
  (change @zendesk/action-create-release to @anzz1/action-create-release until upstream PR Added commitish as input zendesk/action-create-release#32 is merged)
2023-03-25(Windows) Set console to UTF-8 on init (#420)anzz1
Sets console codepage to 65001 (CP_UTF8) on start for both input and output; this should fix problems with UTF-8 characters.
2023-03-25Fix colors enabling on WIN32Georgi Gerganov
2023-03-25If n_predict == -1, generate foreverGeorgi Gerganov
2023-03-25Infinite generation via context swapping (#71)Georgi Gerganov
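The commit gives no details, so the following is only a simplified illustration of one way context swapping can work, written in Python and not taken from the project's C++ code. It assumes that once the window is full, the first n_keep tokens are retained and the most recent half of the remaining tokens is re-evaluated; the function name swap_context and this exact policy are assumptions:

```python
def swap_context(tokens, n_ctx, n_keep):
    """Return the tokens to keep once the context window is full.

    Assumed policy (hypothetical): keep the first n_keep tokens, drop the
    oldest half of the rest, and carry the most recent half forward.
    """
    if len(tokens) < n_ctx:
        return list(tokens)  # window not full yet, nothing to swap

    n_left = len(tokens) - n_keep   # tokens eligible for eviction
    n_recent = n_left // 2          # how many recent tokens survive
    recent = tokens[len(tokens) - n_recent:]
    return list(tokens[:n_keep]) + list(recent)


# Usage: a full window of 8 tokens, keeping the first 2.
kept = swap_context(list(range(8)), n_ctx=8, n_keep=2)
```

Because each swap frees room in the window, generation can continue indefinitely at the cost of forgetting the oldest non-kept tokens.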
2023-03-25Cleanup STL headers + fix embedding examples + minor stuffGeorgi Gerganov
2023-03-25Move chat scripts into "./examples"Georgi Gerganov
2023-03-25Add AVX2 implementation of dequantize_row_q4_1 (#505)slaren
2023-03-25Overhaul the examples structureGeorgi Gerganov
- main -> examples
- utils -> examples (renamed to "common")
- quantize -> examples
- separate tools for "perplexity" and "embedding"
Hope I didn't break anything!
2023-03-25Retire the ggml_mul_mat() branch for transposed src0 (#500)Georgi Gerganov
* Retire the ggml_mul_mat() for transposed src0
  - It can always be made contiguous with ggml_cpy()
  - The code is now simplified
  - The results are deterministic with respect to the number of threads
* SIMD-ify dequantize_row_q4_0() for ARM_NEON (#502)
  * Attempt to SIMD-ify dequantize_row_q4_0() for ARM_NEON
  * Fix dequantization - forgot to interleave the quants
2023-03-25Disable prompt verbosity by default and add option to enable (#480)Georgi Gerganov
2023-03-25Add AVX2 implementation of dequantize_row_q4_0 (#467)slaren
2023-03-25Don't interfere with BLAS for large prompts by running only 1 threadGeorgi Gerganov