aboutsummaryrefslogtreecommitdiff
path: root/ggml-opencl.cpp
AgeCommit message (Collapse)Author
2023-06-04OpenCL: Fix duplication of layers in VRAM and RAM, add GPU mul kernel (#1653)0cc4m
* Use events instead of clFinish, where possible * OpenCL: Don't load gpu layers into RAM, add mul_f32 kernel * Reduce queueing overhead for contiguous tensors by using single mul kernel call * Adapt to #1612 cl_mem malloc changes * Reduce code duplication between cuda and opencl branches * Improve implementation
2023-05-28opencl : no need to allocate cl_mem on heap (#1612)Howard Su
2023-05-28opencl : use strstr to check if fp16 supported (#1611)Howard Su
* Use strstr to check if fp16 supported * Ensure ext_buffer is null terminated
2023-05-23Fix handling of "invalid property" when creating OpenCL command queue (#1565)Maarten ter Huurne
The `clCreateCommandQueue()` function will return the code `CL_INVALID_QUEUE_PROPERTIES` when passed unsupported properties, not `CL_INVALID_PROPERTY` as the original code was checking for.
2023-05-23OpenCL Token Generation Acceleration (#1459)0cc4m
* Move back to C++ for OpenCL * Refactor OpenCL code to work more like the CUDA code, add missing functions * Deduplicate dequant kernels * Add OpenCL compile options * Use compile args for preprocessing constants * Restore default platform + device selection by id behavior --------- Co-authored-by: Johannes Gäßler <johannesg@5d6.de> Co-authored-by: Henri Vasserman <henv@hot.ee>