Age | Commit message (Collapse) | Author |
|
* python script to verify the checksum of the llama models
Added Python script for verifying SHA256 checksums of files in a directory, which can run on multiple platforms. Improved the formatting of the output results for better readability.
* Update README.md
update to the readme for improved readability and to explain the usage of the python checksum verification script
* update the verification script
I've extended the script based on suggestions by @prusnak
The script now checks the available RAM, is there is enough to check the file at once it will do so. If not the file is read in chunks.
* minor improvment
small change so that the available ram is checked and not the total ram
* remove the part of the code that reads the file at once if enough ram is available
based on suggestions from @prusnak i removed the part of the code that checks whether the user had enough ram to read the entire model at once. the file is now always read in chunks.
* Update verify-checksum-models.py
quick fix to pass the git check
|
|
* fix dan.txt
* miku prompt improvements
* use common characters
|
|
* llama : only copy used KV cache in get / set state
* switch to ggml for copying k, v
* avoid designated initializers
|
|
|
|
|
|
|
|
* make git build info work with submodules
---------
Co-authored-by: Green Sky <green@g-s.xyz>
|
|
|
|
Signed-off-by: deadprogram <ron@hybridgroup.com>
|
|
|
|
|
|
Signed-off-by: deadprogram <ron@hybridgroup.com>
|
|
* Fix ppc64le build issue
* Added support to detect ppc64* processors
|
|
|
|
Signed-off-by: deadprogram <ron@hybridgroup.com>
|
|
* ggml: add names to tensors
* minor improvements to dot file formatting
|
|
* Add git-based build information for better issue tracking
* macOS fix
* "build (hash)" and "CMAKE_SOURCE_DIR" changes
* Redo "CMAKE_CURRENT_SOURCE_DIR" and clearer build messages
* Fix conditional dependency on missing target
* Broke out build-info.cmake, added find_package fallback, and added build into to all examples, added dependencies to Makefile
* 4 space indenting for cmake, attempt to clean up my mess in Makefile
* Short hash, less fancy Makefile, and don't modify build-info.h if it wouldn't change it
|
|
* cuBLAS: refactor, convert fp16 to fp32 on device
* cuBLAS: use multiple streams, choose smartly between mul_mat_q and mul_mat_f16
* fix build
* cuBLAS: update block_q5_1
|
|
Co-authored-by: John Doe <john.doe@example.com>
|
|
|
|
|
|
* cuBLAS: fall back to pageable memory if pinned alloc fails
* cuBLAS: do not use pinned memory if env variable GGML_CUDA_NO_PINNED is set
|
|
|
|
|
|
- flags copied from Makefile
- updated comments in both CMakeLists.txt and Makefile to match reality
|
|
* commit
* fix
* try-catch
* apply code review
* improve
* improve
* add macos headers
* done
* remove color
* fix windows
* minor
* fix
* Apply suggestions from code review
Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>
* remove
* minor
* minor
---------
Co-authored-by: jon-chuang <jon-chuang@users.noreply.github.com>
Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com>
|
|
* Implement q5_0, q5_1 and q8_0
* Work around q5_0 OpenCL issue
* Fix q8_0 dequant kernel
* Move cl kernels into ggml-opencl.c
* Use two memcpy calls for q5_0 buffer transfer
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
* llama : minor - remove explicity int64_t cast
* ggml : reduce memory buffer for F16 mul_mat when not using cuBLAS
* ggml : add asserts to guard for incorrect wsize
|
|
|
|
|
|
|
|
* Sample interface, new samplers.
New samplers:
- locally typical sampling
- tail free sampling
- frequency and presence penalty
- mirostat
Ignore EOS fix: -inf should be used.
* mirostat
* Added --logit-bias and --no-penalize-nl, removed std::span
* Use C++11, clarify llama API documentation, rename Mirostat parameters to --mirostat_lr and --mirostat_ent, add temperature sampling for Mirostat, simplify Mirostat sampling API parameters (removed N and *k)
Use C++11, clarify llama API documentation, rename Mirostat parameters to --mirostat_lr and --mirostat_ent, add temperature sampling for Mirostat, simplify Mirostat sampling API parameters (removed N and *k)
* Save and load example adjust
* Tests
* Windows build fix
* Windows test fix
|
|
* cuBLAS: dequantize simultaneously while copying memory
* cuBLAS: use host pinned memory
* cuBLAS: improve ggml_compute_forward_mul_mat_f16_f32 with pinned memory
* cuBLAS: also pin kv cache
* fix rebase
|
|
* Cuda: non-contiguous tensor support
* remove extra stuff
* rename
* fix error
* more fixes, now OpenBLAS and CLBlast build too
* now then?
|
|
|
|
|
|
|
|
* Basic Setup
* Prevent Results.txt from coming up
* Prefixes, Line separators, etc
* editorcheck
* introduction to give more consistent results
* Basic graph thing
* Grading, ready for testing!
* Y'all ready to get funky?
* fix column removal stuff
* missed a few
|
|
|
|
|
|
* Allow use of OpenCL GPU-based BLAS using ClBlast instead of OpenBLAS for context processing
* Improve ClBlast implementation, avoid recreating buffers, remove redundant transfers
* Finish merge of ClBlast support
* Move CLBlast implementation to separate file
Add buffer reuse code (adapted from slaren's cuda implementation)
* Add q4_2 and q4_3 CLBlast support, improve code
* Double CLBlast speed by disabling OpenBLAS thread workaround
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
* Fix device selection env variable names
* Fix cast in opencl kernels
* Add CLBlast to CMakeLists.txt
* Replace buffer pool with static buffers a, b, qb, c
Fix compile warnings
* Fix typos, use GGML_TYPE defines, improve code
* Improve btype dequant kernel selection code, add error if type is unsupported
* Improve code quality
* Move internal stuff out of header
* Use internal enums instead of CLBlast enums
* Remove leftover C++ includes and defines
* Make event use easier to read
Co-authored-by: Henri Vasserman <henv@hot.ee>
* Use c compiler for opencl files
* Simplify code, fix include
* First check error, then release event
* Make globals static, fix indentation
* Rename dequant kernels file to conform with other file names
* Fix import cl file name
---------
Co-authored-by: Concedo <39025047+LostRuins@users.noreply.github.com>
Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
Co-authored-by: Henri Vasserman <henv@hot.ee>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
Correcting link to w64devkit (change seeto to skeeto).
|
|
|