Age | Commit message | Author |
2023-06-17 | Only one CUDA stream per device for async compute (#1898) | Johannes Gäßler |
2023-06-16 | build : fix and ignore MSVC warnings (#1889) | Borislav Stanimirov |
2023-06-15 | Better error when using both LoRA + GPU layers (#1861) | Johannes Gäßler |
2023-06-14 | CUDA full GPU acceleration, KV cache in VRAM (#1827) | Johannes Gäßler |
2023-06-11 | Fix issue where interactive mode crashes when input exceeds ctx size (#1789) | Kerfuffle |
2023-06-06 | main: add the possibility to open the prompt cache read-only (#1640) | Willy Tarreau |
2023-06-06 | Multi GPU support, CUDA refactor, CUDA scratch buffer (#1703) | Johannes Gäßler |
2023-06-04 | llama : Metal inference (#1642) | Georgi Gerganov |
2023-05-28 | Only show -ngl option when relevant + other doc/arg handling updates (#1625) | Kerfuffle |
2023-05-28 | examples : add --alias option to gpt_params to set user-friendly model name (#... | Vladimir Zorin |
2023-05-20 | Fix for mingw (#1462) | DannyDaemonic |
2023-05-19 | main : make reverse prompt option act as a stop token in non-interactive mode... | Jason McCartney |
2023-05-19 | minor : fix compile warnings | Georgi Gerganov |
2023-05-17 | Remove unused n_parts parameter (#1509) | Stephan Walter |
2023-05-15 | fix get_num_physical_cores() (#1436) | zrm |
2023-05-13 | ggml : GPU-accelerated token generation (#1412) | Johannes Gäßler |
2023-05-12 | CLI args use - instead of _, backwards compatible (#1416) | Johannes Gäßler |
2023-05-10 | main : add option to save full output to session (#1338) | Evan Jones |
2023-05-09 | Locale fix for Windows (#1379) | DannyDaemonic |
2023-05-08 | Interface improvements and `--multiline-input` (previously `--author-mode`) (... | DannyDaemonic |
2023-05-08 | llama : require first token to be BOS (#1303) | Georgi Gerganov |
2023-05-08 | Documented CUDA reproducibility, added warning (#1346) | Johannes Gäßler |
2023-05-04 | main : add --in-suffix option (#1318) | 44670 |
2023-05-04 | Only escape prompts when used with `-e` (#1311) | DannyDaemonic |
2023-05-03 | fix missing parameters in `llama_init_from_gpt_params` (#1293) | slaren |
2023-05-02 | Process escape sequences given in prompts (#1173) | DannyDaemonic |
2023-05-02 | examples : add llama_init_from_gpt_params() common function (#1290) | Ron Evans |
2023-05-02 | llama : allow 0 as a seed number. (#1275) | Robert Brisita |
2023-04-30 | common : better default number of threads (#934) | jon-chuang |
2023-04-29 | llama : new sampling algorithms (#1126) | Ivan Stepanov |
2023-04-28 | llama : add session file format and saved sessions in main (#1169) | Evan Jones |
2023-04-24 | examples/main README improvements and some light refactoring (#1131) | mgroeber9110 |
2023-04-17 | Add LoRA support (#820) | slaren |
2023-04-14 | Revert "main : alternative instruct mode (Vicuna support, etc.) (#863)" (#982) | Pavol Rusnak |
2023-04-14 | main : alternative instruct mode (Vicuna support, etc.) (#863) | Tomáš Pazdiora |
2023-04-13 | common : remove unnecessary includes (#947) | CRD716 |
2023-04-11 | Fix whitespace, add .editorconfig, add GitHub workflow (#883) | Pavol Rusnak |
2023-04-10 | Rewrite loading code to try to satisfy everyone: | comex |
2023-04-08 | fix for windows utf-8 input (#840) | Tomáš Pazdiora |
2023-04-02 | fix default params for examples/main (#697) | Murilo Santana |
2023-04-01 | Show error message when -f fails | Slaren |
2023-03-28 | all : be more strict about converting float to double (#458) | Stephan Walter |
2023-03-28 | main.cpp fixes, refactoring (#571) | anzz1 |
2023-03-25 | If n_predict == -1, generate forever | Georgi Gerganov |
2023-03-25 | Infinite generation via context swapping (#71) | Georgi Gerganov |
2023-03-25 | Overhaul the examples structure | Georgi Gerganov |