path: root/examples/main
Age | Commit message | Author
2023-08-07 | Add --rope-scale parameter (#2544) | klosax
2023-08-04 | Add --simple-io option for subprocesses and break out console.h and cpp (#1558) | DannyDaemonic
2023-07-28 | readme : fix the description of the Tail free sampling (TFS) method (#2431) | Weird Constructor
2023-07-25 | main : add `--in-prefix-bos` to prefix BOS to user inputs; keep EOS (#2304) | Xiao-Yong Jin
2023-07-23 | llama : add grammar-based sampling (#1773) | Evan Jones
2023-07-23 | llama : grouped-query attention + LLaMAv2 70B support (#2276) | Georgi Gerganov
2023-07-22 | llama : optimize memory buffers (#2325) | Georgi Gerganov
2023-07-21 | llama : remove cfg smooth factor as it is only a reparameterization of the gu... | Guillaume "Vermeille" Sanchez
2023-07-19 | cmake : install targets (#2256) | wzy
2023-07-15 | llama : add custom RoPE (#2054) | Xiao-Yong Jin
2023-07-13 | Revert "Support using mmap when applying LoRA (#2095)" (#2206) | Howard Su
2023-07-11 | llama : add classifier-free guidance (#2135) | Bach Le
2023-07-11 | Support using mmap when applying LoRA (#2095) | Howard Su
2023-07-10 | mpi : add support for distributed inference via MPI (#2099) | Evan Miller
2023-07-06 | convert : update for baichuan (#2081) | Judd
2023-06-29 | Use unsigned for random seed (#2006) | Howard Su
2023-06-26 | ggml : add NUMA support (#1556) | zrm
2023-06-24 | llama : make model stateless and context stateful (llama_state) (#1797) | Didzis Gosko
2023-06-17 | minor : warning fixes | Georgi Gerganov
2023-06-16 | Fixed possible macro redefinition (#1892) | FrankHB
2023-06-16 | build : fix and ignore MSVC warnings (#1889) | Borislav Stanimirov
2023-06-14 | CUDA full GPU acceleration, KV cache in VRAM (#1827) | Johannes Gäßler
2023-06-13 | llama : do a warm-up eval at start for better timings (#1824) | Georgi Gerganov
2023-06-11 | Fix issue where interactive mode crashes when input exceeds ctx size (#1789) | Kerfuffle
2023-06-06 | main: add the possibility to open the prompt cache read-only (#1640) | Willy Tarreau
2023-06-06 | Multi GPU support, CUDA refactor, CUDA scratch buffer (#1703) | Johannes Gäßler
2023-06-04 | llama : Metal inference (#1642) | Georgi Gerganov
2023-06-03 | Fix prompt cache saving and chat-persistent rollover (#1678) | Evan Jones
2023-05-29 | Work around for recalculating logits in cached prompts (Fixes #1585) (#1609) | DannyDaemonic
2023-05-28 | Only show -ngl option when relevant + other doc/arg handling updates (#1625) | Kerfuffle
2023-05-25 | Some improvements to loading the session with --prompt-cache (#1550) | Kerfuffle
2023-05-20 | llama : add llama_init_backend() API (close #1527) | Georgi Gerganov
2023-05-19 | main : make reverse prompt option act as a stop token in non-interactive mode... | Jason McCartney
2023-05-18 | Fixes #1511 lambda issue for w64devkit (mingw) (#1513) | DannyDaemonic
2023-05-16 | define default model path once, sync path with readme (#1366) | András Salamon
2023-05-12 | llama : fix --mtest option (close #1414) | Georgi Gerganov
2023-05-10 | main : add option to save full output to session (#1338) | Evan Jones
2023-05-08 | Interface improvements and `--multiline-input` (previously `--author-mode`) (... | DannyDaemonic
2023-05-08 | llama : require first token to be BOS (#1303) | Georgi Gerganov
2023-05-06 | Remove default arguments from sampling functions (#1343) | Jed Fox
2023-05-04 | main : add --in-suffix option (#1318) | 44670
2023-05-04 | Only escape prompts when used with `-e` (#1311) | DannyDaemonic
2023-05-04 | Update main's README.md with new features (#1296) | DannyDaemonic
2023-05-04 | fix #1224 reverse prompt and multi line (#1297) | Tomas
2023-05-02 | Handle signals properly on Windows (#1123) | DannyDaemonic
2023-05-02 | examples : add llama_init_from_gpt_params() common function (#1290) | Ron Evans
2023-05-02 | examples : improve vertical alignment of a few variables (#1286) | Ron Evans
2023-05-02 | llama : allow 0 as a seed number. (#1275) | Robert Brisita
2023-05-02 | main : switch input_noecho to input_echo to remove negation (#979) | Ron Evans
2023-05-01 | Add git-based build information for better issue tracking (#1232) | DannyDaemonic