ggml : introduce structs for the q4 data blocks (#356)

* Introduce structs for the q4 data blocks

* ggml : rename quant struct variables + fix ARM_NEON

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
author: Stephan Walter <stephan@walter.name> 2023-03-28 15:56:03 +0000
committer: GitHub <noreply@github.com> 2023-03-28 18:56:03 +0300
commit: c1f885067c61191a07a1aedf684168dda62f3f71 (patch)
tree: 2bcb3f068942e2f16a92d70fec4bd623ac17ce28 /llama.h
parent: e0670260fb50a882b37074112b1881fb0820cf77 (diff)
1 files changed, 1 insertions, 2 deletions
diff --git a/llama.h b/llama.h
index ebf55f4..d3f4cae 100644
--- a/llama.h
+++ b/llama.h
@@ -81,8 +81,7 @@ extern "C" {
     LLAMA_API int llama_model_quantize(
             const char * fname_inp,
             const char * fname_out,
-                   int   itype,
-                   int   qk);
+                   int   itype);
 
     // Run the llama inference to obtain the logits and probabilities for the next token.
     // tokens + n_tokens is the provided batch of new tokens to process
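
The hunk above drops the `qk` argument from `llama_model_quantize`, so callers now pass only `fname_inp`, `fname_out` and `itype`. This diff does not show why. One plausible reading of the commit title is that the block size became a compile-time constant baked into the new block structs. The sketch below illustrates that idea only: the names `QK`, `block_q4_0`, the fields `d` and `qs`, the value 32, and both helper functions are assumptions for illustration and do not appear in this diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed block size; with this as a constant, no runtime qk is needed. */
#define QK 32

/* Hypothetical 4-bit block: one scale plus QK values packed two per byte. */
typedef struct {
    float   d;           /* scale (delta) shared by the whole block */
    uint8_t qs[QK / 2];  /* packed nibbles, low nibble = even index */
} block_q4_0;

/* Map one scaled value to an unsigned nibble in [1, 15], rounding half away from zero. */
static uint8_t quantize_value(float x, float inv_d) {
    const float v = x * inv_d;
    const int   r = (int)(v + (v >= 0.0f ? 0.5f : -0.5f));
    return (uint8_t)(r + 8);
}

/* Quantize exactly QK floats into one block. */
static void quantize_block_q4_0(const float *x, block_q4_0 *out) {
    assert(x != NULL && out != NULL);

    /* Find the largest magnitude so the block maps onto [-7, 7]. */
    float amax = 0.0f;
    for (int i = 0; i < QK; i++) {
        const float v = x[i] < 0.0f ? -x[i] : x[i];
        if (v > amax) amax = v;
    }

    const float d     = amax / 7.0f;
    const float inv_d = d != 0.0f ? 1.0f / d : 0.0f;
    out->d = d;

    for (int i = 0; i < QK; i += 2) {
        const uint8_t q0 = quantize_value(x[i],     inv_d);
        const uint8_t q1 = quantize_value(x[i + 1], inv_d);
        out->qs[i / 2] = (uint8_t)(q0 | (q1 << 4));
    }
}

/* Expand one block back into QK floats. */
static void dequantize_block_q4_0(const block_q4_0 *in, float *y) {
    assert(in != NULL && y != NULL);
    for (int i = 0; i < QK; i += 2) {
        const uint8_t b = in->qs[i / 2];
        y[i]     = (float)((int)(b & 0x0F) - 8) * in->d;
        y[i + 1] = (float)((int)(b >> 4)   - 8) * in->d;
    }
}
```

With a layout like this the element count is part of the type (`sizeof(qs)` is `QK / 2`), which would make a separate `qk` parameter redundant at the API boundary.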