llama.cpp.git - llama.cpp

Age	Commit message	Author
2023-07-25	ci : add non-AVX scalar build/test (#2356)	Eve
	* noavx build and test
	* we don't need to remove f16c in windows
2023-07-10	mpi : add support for distributed inference via MPI (#2099)	Evan Miller
	* MPI support, first cut
	* fix warnings, update README
	* fixes
	* wrap includes
	* PR comments
	* Update CMakeLists.txt
	* Add GH workflow, fix test
	* Add info to README
	* mpi : trying to move more MPI stuff into ggml-mpi (WIP) (#2099)
	* mpi : add names for layer inputs + prep ggml_mpi_graph_compute()
	* mpi : move all MPI logic into ggml-mpi
	  Not tested yet
	* mpi : various fixes - communication now works but results are wrong
	* mpi : fix output tensor after MPI compute (still not working)
	* mpi : fix inference
	* mpi : minor
	* Add OpenMPI to GH action
	* [mpi] continue-on-error: true
	* mpi : fix after master merge
	* [mpi] Link MPI C++ libraries to fix OpenMPI
	* tests : fix new llama_backend API
	* [mpi] use MPI_INT32_T
	* mpi : factor out recv / send in functions and reuse
	* mpi : extend API to allow usage with outer backends (e.g. Metal)
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-07	ci : switch threads to 1 (#2138)	Georgi Gerganov

2023-07-07	ggml : change ggml_graph_compute() API to not require context (#1999)	Qingyou Meng
	* ggml_graph_compute: deprecate using ggml_context, try resolve issue #287
	* rewrite: no longer consider backward compatibility; plan and make_plan
	* minor: rename ctx as plan; const
	* remove ggml_graph_compute from tests/test-grad0.c, but current change breaks backward
	* add static ggml_graph_compute_sugar()
	* minor: update comments
	* reusable buffers
	* ggml : more consistent naming + metal fixes
	* ggml : fix docs
	* tests : disable grad / opt + minor naming changes
	* ggml : add ggml_graph_compute_with_ctx()
	  - backwards compatible API
	  - deduplicates a lot of copy-paste
	* ci : enable test-grad0
	* examples : factor out plan allocation into a helper function
	* llama : factor out plan stuff into a helper function
	* ci : fix env
	* llama : fix duplicate symbols + refactor example benchmark
	* ggml : remove obsolete assert + refactor n_tasks section
	* ggml : fix indentation in switch
	* llama : avoid unnecessary bool
	* ggml : remove comments from source file and match order in header
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-05	ggml : generalize `quantize_fns` for simpler FP16 handling (#1237)	Stephan Walter
	* Generalize quantize_fns for simpler FP16 handling
	* Remove call to ggml_cuda_mul_mat_get_wsize
	* ci : disable FMA for mac os actions
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-07-04	CI: make the brew update temporarily optional. (#2092)	Erik Scholz
	until they decide to fix the brew installation in the macos runners. see the open issues. eg https://github.com/actions/runner-images/pull/7710
2023-06-12	ci : run when changing only the CUDA sources (#1800)	slaren

2023-06-05	ci : disable auto tidy (#1705)	Georgi Gerganov

2023-05-27	Include server in releases + other build system cleanups (#1610)	Kerfuffle
	Set `LLAMA_BUILD_SERVER` in workflow so the `server` example gets built. This currently only applies to Windows builds because it seems like only Windows binary artifacts are included in releases.
	Add `server` example target to `Makefile` (still uses `LLAMA_BUILD_SERVER` define and does not build by default).
	Fix issue where `vdot` binary wasn't removed when running `make clean`.
	Fix compile warnings in `server` example.
	Add `.hpp` files to trigger workflow (the server example has one).
2023-05-27	[CI] Fix openblas (#1613)	Henri Vasserman
	* Fix OpenBLAS build
	* Fix `LLAMA_BLAS_VENDOR` CMake variable that should be a string and not a boolean.
2023-05-27	[CI] CLBlast: Fix directory name (#1606)	Henri Vasserman

2023-05-24	Update CLBlast to 1.6.0 (#1580)	Henri Vasserman
	* Update CLBlast to 1.6.0
2023-05-20	feature : support blis and other blas implementation (#1536)	Zenix
	* feature: add blis support
	* feature: allow all BLA_VENDOR to be assigned in cmake arguments. align with whisper.cpp pr 927
	* fix: version detection for BLA_SIZEOF_INTEGER, recover min version of cmake
	* Fix typo in INTEGER
	  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
	* Fix: blas changes on ci
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-05-12	Add clang-tidy reviews to CI (#1407)	slaren

2023-05-07	CI: add Windows CLBlast and OpenBLAS builds (#1277)	Henri Vasserman
	* Add OpenCL and CLBlast support
	* Add OpenBLAS support
	* Remove testing from matrix
	* change build name to 'clblast'
2023-05-05	ci : add cublas to windows release (#1271)	Erik Scholz

2023-04-24	Fix build for gcc 8 and test in CI (#1154)	Stephan Walter

2023-04-22	ci : trigger CI for drafts, but not most PR actions (#1125)	Stephan Walter

2023-04-22	cmake : fix build under Windows when enable BUILD_SHARED_LIBS (#1100)	Howard Su
	* Fix build under Windows when enable BUILD_SHARED_LIBS
	* Make AVX512 test on Windows to build the shared libs
2023-04-20	ci : remove the LLAMA_ACCELERATE matrix dimension from Ubuntu builds in the CI (#1074)	Ivan Komarov
	[Accelerate](https://developer.apple.com/documentation/accelerate) is an Apple framework which can only be used on macOS, and the CMake build [ignores](https://github.com/ggerganov/llama.cpp/blob/master/CMakeLists.txt#L102) the `LLAMA_ACCELERATE` variable when run on non-Apple platforms. This implies setting `LLAMA_ACCELERATE` is a no-op on Ubuntu and can be removed.
	This will reduce visual noise in CI check results (in addition to reducing the number of checks we have to run for every PR). Right now every sanitized build is duplicated twice for no good reason (e.g., we have `CI / ubuntu-latest-cmake-sanitizer (ADDRESS, Debug, ON)` and `CI / ubuntu-latest-cmake-sanitizer (ADDRESS, Debug, OFF)`).
2023-04-18	ci : do not run on drafts	Georgi Gerganov

2023-04-11	Fix whitespace, add .editorconfig, add GitHub workflow (#883)	Pavol Rusnak

2023-03-29	ci : re-enable AVX512 testing (Windows-MSVC) (#584)	anzz1
	* CI: Re-enable AVX512 testing (Windows-MSVC)
	  Now with 100% less base64 encoding
	* plain __cpuid is enough here
2023-03-28	CI: fix subdirectory path globbing (#546)	anzz1
	- Changes in subdirectories will now be detected properly
	- (Windows-MSVC) AVX512 tests temporarily disabled
2023-03-28	ci : make ctest verbose, hopefully we see what is wrong with the sanitizer	Georgi Gerganov

2023-03-26	ci: add debug build to sanitizer build matrix (#527)	Erik Scholz

2023-03-26	Add support for linux/arm64 platform during Docker Builds (#514)	Juan Calderon-Perez
	* Add support for linux/arm64 platform
	* Add platform to versioned builds
2023-03-26	CI: Run other sanitizer builds even if one fails (#511)	anzz1
	applies only to sanitizer builds so they won't be cancelled
2023-03-25	CMake / CI additions (#497)	anzz1
	* CMake: Add AVX512 option
	* CI: Add AVX/AVX512 builds (Windows) (AVX512 tests can only be run when the worker happens to support it, building works anyway)
	* CMake: Fix sanitizer linkage (merged #468)
	* CI: Add sanitizer builds (Ubuntu)
	* CI: Fix release tagging (change @zendesk/action-create-release to @anzz1/action-create-release until upstream PR Added commitish as input zendesk/action-create-release#32 is merged)
2023-03-23	CI: CMake: Separate build and test steps (#376)	anzz1
	* CI: Separate Build and Test steps (CMake)
	* CI: Make sure build passes before running tests (CMake)
	* CI: Standardise step id names
2023-03-22	Deduplicate q4 quantization functions (#383)	Stephan Walter
	* Deduplicate q4 quantization functions
	* Use const; add basic test
	* Re-enable quantization test
	* Disable AVX2 flags in CI
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-22	Fix bin dir for win ci	anzz1

2023-03-21	specify build type for ctest on windows (#371)	Erik Scholz

2023-03-21	Add tokenizer test + revert to C++11 (#355)	Georgi Gerganov
	* Add test-tokenizer-0 to do a few tokenizations - feel free to expand
	* Added option to convert-pth-to-ggml.py script to dump just the vocabulary
	* Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
	* Added utility to load vocabulary file from previous point (temporary implementation)
	* Avoid using std::string_view and drop back to C++11 (hope I didn't break something)
	* Rename gpt_vocab -> llama_vocab
	* All CMake binaries go into ./bin/ now
2023-03-20	Docker - Fix publish docker image in GitHub Registry (#235)	Bernat Vadell
	* fix publish permission
	* try to fix docker pipeline using as password github_token & username repository_owner
2023-03-18	CI Improvements (#230)	anzz1
	* CI Improvements
	  Manual build feature, autoreleases for Windows
	* better CI naming convention
	  use branch name in releases and tags
2023-03-17	add ptread link to fix cmake build under linux (#114)	mmyjona
	* add ptread link to fix cmake build under linux
	* add cmake to linux and macos platform
	* separate make and cmake workflow
	---------
	Co-authored-by: Sebastián A <sebastian.aedo29@gmail.com>
2023-03-17	🚀 Dockerize llamacpp (#132)	Bernat Vadell
	* feat: dockerize llamacpp
	* feat: split build & runtime stages
	* split dockerfile into main & tools
	* add quantize into tool docker image
	* Update .devops/tools.sh
	  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
	* add docker action pipeline
	* change CI to publish at github docker registry
	* fix name runs-on macOS-latest is macos-latest (lowercase)
	* include docker versioned images
	* fix github action docker
	* fix docker.yml
	* feat: include all-in-one command tool & update readme.md
	---------
	Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-13	Add windows to the CI (#98)	Sebastián A

2023-03-12	Add CI (#60)	Georgi Gerganov