Age | Commit message (Collapse) | Author |
|
to match filenames of other converters
|
|
* Create chat-13B.bat
Same script as chat-13B.sh, but for Windows users.
Tested and working on Windows 10/11 v 22H2
* Apply suggestions from code review
---------
Co-authored-by: anzz1 <anzz1@live.com>
|
|
For now: copy-paste
Deduplicating the Python code would take me too much time
|
|
* add example of re-act pattern
* spelling...
* fixed whitespace in reverse prompt issue
|
|
0b10101010 -> 0xAA /* 0b10101010 */
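Binary literals such as `0b10101010` are only standard from C23 onward (earlier compilers accept them as an extension), which is presumably why the hex spelling is used. A minimal sketch showing that `0xAA` carries the same alternating bit pattern; the names `MASK_ALT` and `bit_is_set` are hypothetical and not taken from the repository:

```c
#include <stdint.h>

/* Portable spelling of the alternating bit pattern 10101010.
 * 0b... literals are only standard from C23 onward. */
static const uint8_t MASK_ALT = 0xAA; /* 0b10101010 */

/* Hypothetical helper: returns 1 if bit `i` (0 = least significant) is set. */
static int bit_is_set(uint8_t value, int i) {
    return (value >> i) & 1;
}
```

Checking `bit_is_set(MASK_ALT, i)` for `i` from 7 down to 0 reproduces the digits 1, 0, 1, 0, 1, 0, 1, 0 of the original binary literal.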
|
|
* Enable Fused-Multiply-Add (FMA) instructions on MSVC
__FMA__ macro does not exist in MSVC
* Enable F16C/CVT16 vector extensions on MSVC
__F16C__ macro does not exist in MSVC, but is implied with AVX2/AVX512
* MSVC cvt intrinsics
* Add __SSE3__ macro for MSVC too because why not
even though it's not currently used for anything when AVX is defined
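A rough sketch of the idea described above: when MSVC targets AVX2 or AVX512 it defines `__AVX2__` / `__AVX512F__` but not `__FMA__`, `__F16C__` or `__SSE3__`, so those can be derived by hand. The exact conditions used in the actual change are an assumption here, and `have_fma` is a hypothetical helper added only so the result can be inspected:

```c
/* Sketch: derive feature macros that MSVC does not define on its own.
 * Assumption: AVX2/AVX512 targets imply FMA, F16C and SSE3 support. */
#if defined(_MSC_VER) && (defined(__AVX2__) || defined(__AVX512F__))
#  ifndef __FMA__
#    define __FMA__
#  endif
#  ifndef __F16C__
#    define __F16C__
#  endif
#  ifndef __SSE3__
#    define __SSE3__
#  endif
#endif

/* Hypothetical helper: reports at run time whether __FMA__ ended up defined. */
static int have_fma(void) {
#ifdef __FMA__
    return 1;
#else
    return 0;
#endif
}
```

On GCC and Clang the guard is skipped entirely, because those compilers define the macros themselves when the matching `-m` flags are passed.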
|
|
- Changes in subdirectories will now be detected properly
- (Windows-MSVC) AVX512 tests temporarily disabled
|
|
* Revert 7e53955 (#542)
Still needs to be fixed properly
* Fix linking on mingw32
|
|
* Add AVX2 implementation of quantize_row_q4_1
* Actually use AVX2
* Make quantize_row_q4_1 static
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
|
|
further consumption/training/finetuning (#403)
|
|
* Refactor quantized processing functions
* ggml : minor
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547)
|
|
* Be more strict about converting float to double
* Test equivalence of round, SILU implementations
Test module is commented out in CMakeLists.txt because the tests may
take a long time, depending on how much the compiler optimizes.
* Fix softmax in perplexity.cpp
* all : prefer float over double where appropriate
* perplexity : add <cmath>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
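To illustrate why float-to-double conversions are easy to introduce by accident: an unsuffixed literal such as `0.5` has type `double`, so a `float` operand is silently promoted. The sketch below is not code from the repository and all function names are hypothetical. GCC and Clang can flag this pattern with `-Wdouble-promotion`.

```c
#include <stddef.h>

/* `0.5` is a double literal: x is promoted, the product is computed
 * in double and then converted back to float on return. */
static float scale_promoted(float x) {
    return x * 0.5;
}

/* `0.5f` is a float literal: the whole expression stays in float. */
static float scale_float(float x) {
    return x * 0.5f;
}

/* sizeof reveals the type of each expression without evaluating it. */
static size_t promoted_size(void) {
    float x = 1.0f;
    return sizeof(x * 0.5);
}

static size_t float_size(void) {
    float x = 1.0f;
    return sizeof(x * 0.5f);
}
```

Both scaling functions return the same value for exactly representable inputs; the difference lies in the intermediate type, which matters for SIMD-friendly code.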
|
|
* Add a Package.swift for SwiftPM support
* Swap from exclusions to allowlist
|
|
* Introduce structs for the q4 data blocks
* ggml : rename quant struct variables + fix ARM_NEON
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
Co-authored-by: Jincheng Miao <jincheng.miao@gmail.com>
|
|
* Fix usage of F16C intrinsics in AVX code when F16C is not defined
|
|
- main: entering empty line passes back control without new input in interactive/instruct modes
- instruct mode: keep prompt fix
- instruct mode: duplicate instruct prompt fix
- refactor: move common console code from main->common
|
|
|
* Add support for linux/arm64 platform
* Add platform to versioned builds
|
|
Allow exiting the interactive prompt also with CTRL-D on Unix and CTRL-Z
on Windows.
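Both key combinations signal end-of-file on standard input, so one way to support them is to treat a failed line read as a request to exit. A minimal sketch under that assumption; `read_prompt_line` is a hypothetical name, not the function used in the repository:

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into buf. Returns 1 on success, 0 on end-of-file
 * (CTRL-D on Unix, CTRL-Z followed by Enter on Windows) or read error. */
static int read_prompt_line(FILE *in, char *buf, size_t cap) {
    if (fgets(buf, (int)cap, in) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0'; /* drop the trailing newline, if any */
    return 1;
}
```

An interactive loop would call `read_prompt_line(stdin, buf, sizeof buf)` and leave the loop when it returns 0.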
|
|
applies only to sanitizer builds so they won't be cancelled
|
|
"Processing part 1 of 3" instead of "Processing part 0"
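The change amounts to printing a one-based index together with the total. A small sketch of that formatting; `format_part_message` is a hypothetical helper, and the loop index is assumed to be zero-based internally:

```c
#include <stdio.h>

/* Formats a progress line for a zero-based part index.
 * Returns the number of characters written (as snprintf does). */
static int format_part_message(char *out, size_t cap, int index, int total) {
    return snprintf(out, cap, "Processing part %d of %d", index + 1, total);
}
```

With `index = 0` and `total = 3` the buffer holds `Processing part 1 of 3`.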
|
|
* CMake: Add AVX512 option
* CI: Add AVX/AVX512 builds (Windows)
(AVX512 tests can only be run when the worker happens to support it, building works anyway)
* CMake: Fix sanitizer linkage (merged #468)
* CI: Add sanitizer builds (Ubuntu)
* CI: Fix release tagging
(change @zendesk/action-create-release to @anzz1/action-create-release until upstream PR "Added commitish as input" (zendesk/action-create-release#32) is merged)
|
|
Sets console codepage to 65001 (CP_UTF8) on start for both input and output, should fix problems with UTF-8 characters.
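A minimal sketch of what such a startup step might look like, using the Win32 calls `SetConsoleCP` and `SetConsoleOutputCP` with `CP_UTF8` (65001). The wrapper name `console_use_utf8` is hypothetical, and the non-Windows branch assumes the terminal is already UTF-8:

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Switches console input and output to UTF-8 on Windows.
 * Returns 1 on success (or when nothing needs doing), 0 on failure. */
static int console_use_utf8(void) {
#ifdef _WIN32
    /* CP_UTF8 is code page 65001. */
    return SetConsoleCP(CP_UTF8) && SetConsoleOutputCP(CP_UTF8);
#else
    return 1; /* assumption: non-Windows terminals are already UTF-8 */
#endif
}
```

Calling this once at program start, before any console input or output, keeps the change isolated from the rest of the code.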
|
|
- main -> examples
- utils -> examples (renamed to "common")
- quantize -> examples
- separate tools for "perplexity" and "embedding"
Hope I didn't break anything!
|
|
* Retire the ggml_mul_mat() for transposed src0
- It can always be made contiguous with ggml_cpy()
- The code is now simplified
- The results are deterministic with respect to the number of threads
* SIMD-ify dequantize_row_q4_0() for ARM_NEON (#502)
* Attempt to SIMD-ify dequantize_row_q4_0() for ARM_NEON
* Fix dequantization - forgot to interleave the quants
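For context on the interleaving fix, here is a scalar sketch of 4-bit block dequantization. It assumes a layout of one `float` scale followed by 16 bytes, where each byte packs two consecutive elements (low nibble first) and values are stored with an offset of 8. The struct and function names are hypothetical and the layout is an assumption, not confirmed by the commit text:

```c
#include <stdint.h>

#define QK 32 /* assumed number of elements per block */

/* Assumed layout: one float scale followed by QK/2 bytes of packed nibbles. */
typedef struct {
    float   d;
    uint8_t qs[QK / 2];
} block_q4_0_sketch;

/* Expands one block into QK floats. */
static void dequantize_block_sketch(const block_q4_0_sketch *b, float *y) {
    for (int l = 0; l < QK; l += 2) {
        const uint8_t packed = b->qs[l / 2];
        const int lo = packed & 0x0F; /* element l     */
        const int hi = packed >> 4;   /* element l + 1 */
        y[l + 0] = (float)(lo - 8) * b->d;
        y[l + 1] = (float)(hi - 8) * b->d;
    }
}
```

A SIMD version that extracts all low nibbles into one vector and all high nibbles into another has to interleave the two vectors again to restore this element order, which is the kind of step the fix refers to.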
|
|
|