Age | Commit message | Author | |
---|---|---|---|
2023-03-22 | Deduplicate q4 quantization functions (#383) | Stephan Walter | |
* Deduplicate q4 quantization functions * Use const; add basic test * Re-enable quantization test * Disable AVX2 flags in CI --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> | |||
2023-03-22 | Introduce C-style API (#370) | Georgi Gerganov | |
* Major refactoring - introduce C-style API * Clean up * Add `<cassert>` * Add `<iterator>` * Add `<algorithm>` .... * Fix timing reporting and accumulation * Measure eval time only for single-token calls * Change llama_tokenize return meaning | |||
2023-03-16 | Add RMS norm and use it (#187) | hoangmit | |
* add ggml_rms_norm * update op num | |||
2023-03-10 | Initial release | Georgi Gerganov | |