From d01bccde9f759b24449fdaa16306b406a50eb367 Mon Sep 17 00:00:00 2001
From: Georgi Gerganov <ggerganov@gmail.com>
Date: Tue, 18 Jul 2023 14:24:43 +0300
Subject: ci : integrate with ggml-org/ci (#2250)

* ci : run ctest

ggml-ci

* ci : add open llama 3B-v2 tests

ggml-ci

* ci : disable wget progress output

ggml-ci

* ci : add open llama 3B-v2 tg tests for q4 and q5 quantizations

ggml-ci

* tests : try to fix tail free sampling test

ggml-ci

* ci : add K-quants

ggml-ci

* ci : add short perplexity tests

ggml-ci

* ci : add README.md

* ppl : add --chunks argument to limit max number of chunks

ggml-ci

* ci : update README
---
 ci/README.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
 create mode 100644 ci/README.md

diff --git a/ci/README.md b/ci/README.md
new file mode 100644
index 0000000..6c74c81
--- /dev/null
+++ b/ci/README.md
@@ -0,0 +1,20 @@
+# CI
+
+In addition to [GitHub Actions](https://github.com/ggerganov/llama.cpp/actions), `llama.cpp` uses a custom CI framework:
+
+https://github.com/ggml-org/ci
+
+It monitors the `master` branch for new commits and runs the
+[ci/run.sh](https://github.com/ggerganov/llama.cpp/blob/master/ci/run.sh) script on dedicated cloud instances. This allows us
+to execute heavier workloads than GitHub Actions alone allows. Over time, the cloud instances will be scaled
+to cover various hardware architectures, including GPU and Apple Silicon instances.
+
+Collaborators can optionally trigger a CI run by adding the `ggml-ci` keyword to their commit message.
+Only the branches of this repo are monitored for this keyword.
+
+Before publishing changes, it is good practice to run the full CI locally on your machine:
+
+```bash
+mkdir tmp
+bash ./ci/run.sh ./tmp/results ./tmp/mnt
+```
-- 
cgit v1.2.3


From 5d500e8ccf5eee3de3ae66685cc3be75e43e08b9 Mon Sep 17 00:00:00 2001
From: Georgi Gerganov <ggerganov@gmail.com>
Date: Sat, 22 Jul 2023 11:48:22 +0300
Subject: ci : add 7B CUDA tests (#2319)

* ci : add 7B CUDA tests

ggml-ci

* ci : add Q2_K to the tests

* ci : bump CUDA ppl chunks

ggml-ci

* ci : increase CUDA TG len + add --ignore-eos

* ci : reduce CUDA ppl chunks down to 4 to save time
---
 ci/README.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/ci/README.md b/ci/README.md
index 6c74c81..65cfe63 100644
--- a/ci/README.md
+++ b/ci/README.md
@@ -16,5 +16,10 @@ It is a good practice, before publishing changes to execute the full CI locally
 
 ```bash
 mkdir tmp
+
+# CPU-only build
 bash ./ci/run.sh ./tmp/results ./tmp/mnt
+
+# with CUDA support
+GG_BUILD_CUDA=1 bash ./ci/run.sh ./tmp/results ./tmp/mnt
 ```
-- 
cgit v1.2.3
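
The `ggml-ci` keyword described in the README can be placed on its own line in the commit message, as the commits in this series do. The following is a minimal sketch in a throwaway repository; the repository path, author identity, and subject line are placeholders, and the README does not state where in the message the keyword has to appear.

```bash
# Hypothetical demo: create a scratch repository so nothing real is touched.
repo="$(mktemp -d)"
git -C "$repo" init -q

# Each -m adds a separate paragraph, so "ggml-ci" ends up on its own line
# below the subject. The identity and subject here are placeholders.
git -C "$repo" -c user.name="Example" -c user.email="example@example.com" \
    commit -q --allow-empty -m "ci : example change" -m "ggml-ci"

# Print the full message of the last commit to confirm the keyword is present.
git -C "$repo" log -1 --format=%B
```

In a real checkout, only the `git commit ... -m "ggml-ci"` part is needed. Per the README, the keyword is only picked up on branches of this repo.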