Age | Commit message (Collapse) | Author |
|
Co-authored-by: Johnman <>
Co-authored-by: Johnman <tjohnman@github>
|
|
Co-authored-by: Johnman <>
|
|
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
(#294)
* Use F16 for memory_k and memory_v
* add command line switch to use f16 instead of f32 for memory k+v
---------
Co-authored-by: Ty Everett <ty@tyweb.us>
|
|
Also start adding prompts in "./prompts"
|
|
|
|
The readme tells people to use the command line option "-t 8", causing 8
threads to be started. On systems with fewer than 8 cores, this causes a
significant slowdown. Remove the option from the example command lines
and use /proc/cpuinfo on Linux to determine a sensible default.
|
|
* added ctx_size parameter
* added it in more places
* Apply suggestions from code review
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* Initial work on interactive mode.
* Improve interactive mode. Make rev. prompt optional.
* Update README to explain interactive mode.
* Fix OS X build
|
|
* Add back top_k
* Update utils.cpp
* Update utils.h
---------
Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
* Adding repeat penalization
* Update utils.h
* Update utils.cpp
* Numeric fix
Should probably still scale by temp even if penalized
* Update comments, more proper application
I see that numbers can go negative so a fix from a referenced commit
* Minor formatting
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
|
|
|
|
|
|
|