author | Shouzheng Liu <lshzh.hi@gmail.com> | 2023-07-25 08:00:19 -0400 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-07-25 15:00:19 +0300 |
commit | 1aa18ef994a6a2b531434eb13251ef48e56d345b (patch) | |
tree | 7ce76e5926ae0a6a48db56590f69873aca8dd917 /llama.cpp | |
parent | 9a08eaf3c4010962d0126e9e5bfbe9af64b2ac90 (diff) |
metal : concurrently dispatch commands (#2358)
* metal: concurrently dispatch commands
The function `ggml_metal_graph_find_concurrency` identifies the commands
that can be issued concurrently and writes them to the `concur_list` array
of the Metal context. It runs the first time `ggml_metal_graph_compute` is called.
* metal: don't call find_concurrency automatically.
* metal : code style changes
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
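The idea of collecting concurrently issuable commands into a flat list can be illustrated with a minimal sketch. This is not the code from this commit: the `Node` struct, the `kBarrier` marker, and the greedy in-order batching are all assumptions made for illustration, and the real `ggml_metal_graph_find_concurrency` may order and check nodes differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a compute-graph node: it lists the indices of
// the earlier nodes whose outputs it reads. Not the ggml data structure.
struct Node {
    std::vector<int> srcs;
};

// Marker separating two batches in the flat output list (assumption of this sketch).
constexpr int kBarrier = -1;

// Walk the nodes in topological order and emit a flat list of node indices.
// A barrier is inserted whenever the next node reads from a node that is
// still in the current batch, so nodes between barriers are independent.
std::vector<int> find_concurrency(const std::vector<Node> & nodes) {
    std::vector<int>  list;
    std::vector<int>  batch;                        // nodes since the last barrier
    std::vector<bool> in_batch(nodes.size(), false);

    for (int i = 0; i < (int) nodes.size(); ++i) {
        bool depends = false;
        for (int s : nodes[i].srcs) {
            assert(s >= 0 && s < i);                // input must be topologically sorted
            if (in_batch[s]) {
                depends = true;
            }
        }
        if (depends) {
            // Close the current batch before scheduling node i.
            list.push_back(kBarrier);
            for (int b : batch) {
                in_batch[b] = false;
            }
            batch.clear();
        }
        list.push_back(i);
        batch.push_back(i);
        in_batch[i] = true;
    }
    return list;
}
```

For a five-node graph where node 2 reads nodes 0 and 1, node 3 reads node 0, and node 4 reads nodes 2 and 3, this sketch yields three batches: {0, 1}, {2, 3}, and {4}.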
Diffstat (limited to 'llama.cpp')
-rw-r--r-- | llama.cpp | 3 |
1 files changed, 3 insertions, 0 deletions
@@ -1720,6 +1720,9 @@ static bool llama_eval_internal(
 #ifdef GGML_USE_METAL
     if (lctx.ctx_metal && N == 1) {
+        if (!ggml_metal_if_optimized(lctx.ctx_metal)) {
+            ggml_metal_graph_find_concurrency(lctx.ctx_metal,&gf);
+        }
         ggml_metal_set_n_cb     (lctx.ctx_metal, n_threads);
         ggml_metal_graph_compute(lctx.ctx_metal, &gf);
         ggml_metal_get_tensor   (lctx.ctx_metal, cur);
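The guard in this hunk follows an analyse-once, reuse-afterwards pattern. A rough sketch of that pattern is shown here with hypothetical names (`ContextSketch`, `is_optimized`, `analyze_graph`, `eval_step`); it assumes the cached order stays valid because the graph shape does not change between evaluations, and it does not reproduce how `ggml_metal_if_optimized` is actually implemented.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the Metal context; only the caching behaviour is modelled.
struct ContextSketch {
    std::vector<int> concur_list;   // cached dispatch order, empty until analysed
    int analysis_runs = 0;          // counts how often the analysis executed
};

// Plays the role of the guard in the hunk: true once a cached order exists.
bool is_optimized(const ContextSketch & ctx) {
    return !ctx.concur_list.empty();
}

// Placeholder analysis: records the nodes in their original order.
void analyze_graph(ContextSketch & ctx, int n_nodes) {
    assert(n_nodes > 0);
    ctx.concur_list.clear();
    for (int i = 0; i < n_nodes; ++i) {
        ctx.concur_list.push_back(i);
    }
    ctx.analysis_runs++;
}

// One evaluation step: analyse only if no cached order exists, then compute.
void eval_step(ContextSketch & ctx, int n_nodes) {
    if (!is_optimized(ctx)) {
        analyze_graph(ctx, n_nodes);
    }
    // ... dispatch using ctx.concur_list would go here ...
}
```

Calling `eval_step` repeatedly on the same context runs the analysis only on the first call; later calls reuse the cached list, so the analysis cost is paid once rather than per token.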