llama.cpp.git - llama.cpp

Age	Commit message	Author
2023-03-22	Introduce C-style API (#370)	Georgi Gerganov
	* Major refactoring - introduce C-style API
	* Clean up
	* Add <cassert>
	* Add <iterator>
	* Add <algorithm>
	....
	* Fix timing reporting and accumulation
	* Measure eval time only for single-token calls
	* Change llama_tokenize return meaning
2023-03-21	Fix convert script, warnings alpaca instructions, default params	Georgi Gerganov

2023-03-21	fix typo in comment (#318)	Mack Straight

2023-03-21	Add tokenizer test + revert to C++11 (#355)	Georgi Gerganov
	* Add test-tokenizer-0 to do a few tokenizations - feel free to expand
	* Added option to convert-pth-to-ggml.py script to dump just the vocabulary
	* Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
	* Added utility to load vocabulary file from previous point (temporary implementation)
	* Avoid using std::string_view and drop back to C++11 (hope I didn't break something)
	* Rename gpt_vocab -> llama_vocab
	* All CMake binaries go into ./bin/ now
2023-03-20	Fixed tokenizer.model not found error when model dir is symlink (#325)	Qingyou Meng

2023-03-20	sentencepiece bpe compatible tokenizer (#252)	Mack Straight
	* potential out of bounds read
	* fix quantize
	* style
	* Update convert-pth-to-ggml.py
	* mild cleanup
	* don't need the space-prefixing here rn since main.cpp already does it
	* new file magic + version header field
	* readme notice
	* missing newlines
	Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
2023-03-19	Fix python stuff (#109)	Georgi Gerganov

2023-03-19	Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109)	qunash
	* Refactor get_n_parts function to simplify code and improve readability
	* Use f-strings instead of concatenation
	* Refactoring: more concise and readable
	* modularize
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-17	🚀 Dockerize llamacpp (#132)	Bernat Vadell
	* feat: dockerize llamacpp
	* feat: split build & runtime stages
	* split dockerfile into main & tools
	* add quantize into tool docker image
	* Update .devops/tools.sh
	  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
	* add docker action pipeline
	* change CI to publish at github docker registry
	* fix name runs-on macOS-latest is macos-latest (lowercase)
	* include docker versioned images
	* fix github action docker
	* fix docker.yml
	* feat: include all-in-one command tool & update readme.md
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-15	Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)	Ronsor
	There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.
2023-03-13	Fix UTF-8 handling (including colors) (#79)	Val Kharitonov

2023-03-12	Revert "weights_only" arg - this causing more trouble than help	Georgi Gerganov

2023-03-12	python/pytorch compat notes (#44)	Oleksandr Nikitin

2023-03-12	use weights_only in conversion script (#32)	deepdiffuser
	this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries
2023-03-11	Support all LLaMA models + change Q4_0 quantization storage	Georgi Gerganov

2023-03-10	Fix a bug in the rope calculation	Georgi Gerganov

2023-03-10	Initial release	Georgi Gerganov