aboutsummaryrefslogtreecommitdiff
path: root/convert-lora-to-ggml.py
diff options
context:
space:
mode:
authorningshanwutuobang <ningshanwutuobang@gmail.com>2023-06-28 23:53:37 +0800
committerGitHub <noreply@github.com>2023-06-28 18:53:37 +0300
commitcfa0750bc9dbc2d957a91b8ed09ab0035d8f3d4e (patch)
treec8d6d6e6548d4f03899704f64bce6939e471e4e6 /convert-lora-to-ggml.py
parent9d23589d638dc74577d5ff880e6d4248b795f12e (diff)
llama : support input embeddings directly (#1910)
* add interface for float input * fixed inpL shape and type * add examples of input floats * add test example for embd input * fixed sampling * add free for context * fixed add end condition for generating * add examples for llava.py * add README for llava.py * add README for llava.py * add example of PandaGPT * refactor the interface and fixed the styles * add cmake build for embd-input * add cmake build for embd-input * Add MiniGPT-4 example * change the order of the args of llama_eval_internal * fix ci error
Diffstat (limited to 'convert-lora-to-ggml.py')
-rw-r--r--convert-lora-to-ggml.py6
1 file changed, 5 insertions, 1 deletion
diff --git a/convert-lora-to-ggml.py b/convert-lora-to-ggml.py
index 9090e8d..f43c836 100644
--- a/convert-lora-to-ggml.py
+++ b/convert-lora-to-ggml.py
@@ -113,6 +113,10 @@ with open(output_path, "wb") as fout:
write_file_header(fout, params)
for k, v in model.items():
+ if k.endswith(".default.weight"):
+ k = k.replace(".default.weight", ".weight")
+ if k in ["llama_proj.weight", "llama_proj.bias"]:
+ continue
if k.endswith("lora_A.weight"):
if v.dtype != torch.float16 and v.dtype != torch.float32:
v = v.float()
@@ -120,7 +124,7 @@ with open(output_path, "wb") as fout:
else:
v = v.float()
- t = v.numpy()
+ t = v.detach().numpy()
tname = translate_tensor_name(k)
print(f"{k} => {tname} {t.shape} {t.dtype} {t.nbytes/1024/1024:.2f}MB")
write_tensor_header(fout, tname, t.shape, t.dtype)