author: Willy Tarreau <w@1wt.eu> 2023-06-07 04:10:17 +0200
committer: GitHub <noreply@github.com> 2023-06-06 22:10:17 -0400
commit: 35a84916fb029905c44746127026079268216e7a (patch)
tree: e7baab7b8c74528460690694902eb7d79bae8c24 /examples/common.h
parent: 2d7bf110edd8c49209401a16132052cba706ffd0 (diff)
main: add the possibility to open the prompt cache read-only (#1640)

The prompt cache constitutes a nice speed up when using the same prompt prefix across multiple evaluations, but when using it, it will also be updated, which is not always desirable. One use case is to have a large prompt containing some context and usage rules, and a second part containing variable data of the problem being studied. In this case it's desirable to be able to save the first part once, and to always reuse it as-is without updating it with the second part.

The new argument --prompt-cache-ro enables this read-only mode on the prompt cache. The prompt's contents that match the cache are loaded from the cache but the rest is not modified. This allowed to reduce a total analysis time from 112s to 49.7s here, without having to backup and restore a copy of the prompt, which takes significant time at 500 MB.

Signed-off-by: Willy Tarreau <w@1wt.eu>
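
The read-only behaviour described in the commit message can be illustrated with a minimal, self-contained sketch. It is not the code from this commit: `PromptCache`, `matching_prefix`, and `evaluate_prompt` are hypothetical names, and the cache is modelled as a plain vector of token ids rather than the real session file. The sketch only shows the idea that the matching prefix is reused while the write-back step is skipped when the cache is read-only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the prompt cache: only the cached token ids.
struct PromptCache {
    std::vector<int> tokens;
};

// Count how many leading tokens of `prompt` are already present in `cache`.
std::size_t matching_prefix(const PromptCache & cache, const std::vector<int> & prompt) {
    std::size_t n = 0;
    while (n < cache.tokens.size() && n < prompt.size() && cache.tokens[n] == prompt[n]) {
        ++n;
    }
    return n;
}

// Reuse the matching prefix, "evaluate" the remaining tokens, and write the
// result back to the cache only when it is not opened read-only.
// Returns the number of tokens that still had to be evaluated.
std::size_t evaluate_prompt(PromptCache & cache, const std::vector<int> & prompt, bool read_only) {
    const std::size_t reused    = matching_prefix(cache, prompt);
    const std::size_t evaluated = prompt.size() - reused;
    if (!read_only) {
        cache.tokens = prompt; // default mode: the cache follows the latest prompt
    }
    return evaluated;
}
```

With `read_only` set, a cache holding the fixed first part of the prompt stays untouched no matter which variable second part follows it, so no backup and restore of the cache file is needed between runs.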
Diffstat (limited to 'examples/common.h')
-rw-r--r-- examples/common.h | 1
1 files changed, 1 insertions, 0 deletions
diff --git a/examples/common.h b/examples/common.h
index 12b4973..826e2ae 100644
--- a/examples/common.h
+++ b/examples/common.h
@@ -62,6 +62,7 @@ struct gpt_params {
     bool use_color         = false; // use color to distinguish generations and inputs
     bool interactive       = false; // interactive mode
     bool prompt_cache_all  = false; // save user input and generations to prompt cache
+    bool prompt_cache_ro   = false; // open the prompt cache read-only and do not update it
 
     bool embedding         = false; // get only sentence embedding
     bool interactive_first = false; // wait for user input immediately
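
As a rough sketch of how the new field might be set from the `--prompt-cache-ro` argument named in the commit message: the struct below is a reduced, illustrative copy of two fields from the hunk, and `parse_args` is a hypothetical helper, not the parser touched by this commit (which lives outside this file).

```cpp
#include <string>
#include <vector>

// Reduced, illustrative copy of two fields from gpt_params; not the real struct.
struct gpt_params_sketch {
    bool prompt_cache_all = false; // save user input and generations to prompt cache
    bool prompt_cache_ro  = false; // open the prompt cache read-only and do not update it
};

// Hypothetical argument parser: only recognises the flag named in the commit message.
gpt_params_sketch parse_args(const std::vector<std::string> & args) {
    gpt_params_sketch params;
    for (const std::string & arg : args) {
        if (arg == "--prompt-cache-ro") {
            params.prompt_cache_ro = true;
        }
    }
    return params;
}
```

Because the field defaults to `false`, existing invocations keep the previous behaviour and the cache is only protected when the flag is passed explicitly.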