ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors (llama/15379) a575f57 compilade committed on Aug 18, 2025
vulkan: disable spirv-opt for bfloat16 shaders (llama/15352) cf24af7 jeffbolznv committed on Aug 18, 2025
vulkan: Use larger workgroups for mul_mat_vec when M is small (llama/15355) 054584a jeffbolznv OccamRazor committed on Aug 17, 2025
vulkan: Support mul_mat_id with f32 accumulators (llama/15337) 41a76e6 jeffbolznv committed on Aug 16, 2025
vulkan: Add missing bounds checking to scalar/coopmat1 mul_mat_id (llama/15334) a6fa78e jeffbolznv committed on Aug 16, 2025
opencl: add initial mxfp4 support via mv (llama/15270) 1a0281c lhez shawngu-quic committed on Aug 15, 2025
vulkan : fix out-of-bounds access in argmax kernel (llama/15342) 78a1865 ggerganov committed on Aug 15, 2025
CUDA: fix negative KV_max values in FA (llama/15321) 6e3a7b6 JohannesGaessler committed on Aug 14, 2025
HIP: Cleanup hipification header (llama/15285) 7cdf9cd uvos JohannesGaessler committed on Aug 14, 2025
finetune: SGD optimizer, more CLI args (llama/13873) f585fe7 Jonathan Graehl OccamRazor JohannesGaessler committed on Aug 14, 2025
CUDA: Optimize `reduce_rows_f32` kernel, leading up to 25x perf improvement on kernel-level and 10% perf increase for Gemma3n (llama/15132) c768824 ORippler committed on Aug 13, 2025
ggml-rpc: chunk send()/recv() to avoid EINVAL for very large tensors over RPC (macOS & others) (llama/15188) c8284f2 aixsatoshi Shinnosuke Takagi committed on Aug 13, 2025
HIP: disable sync warp shuffle operators from clr amd_warp_sync_functions.h (llama/15273) 8fca6dd uvos committed on Aug 12, 2025
sycl: Fix and disable more configurations of mul_mat (llama/15151) 7b868ed Romain Biessy committed on Aug 12, 2025
musa: fix failures in test-backend-ops for mul_mat_id op (llama/15236) 4168dda yeahdongcn committed on Aug 12, 2025
gguf-py : add Numpy MXFP4 de/quantization support (llama/15111) 324f3bd compilade committed on Aug 8, 2025
CUDA: attention sinks for mma FlashAttention (llama/15157) 0ab9aba JohannesGaessler committed on Aug 8, 2025
vulkan: Add env var to disable host visible vidmem (llama/15109) 5ec4382 jeffbolznv committed on Aug 7, 2025
HIP: add cmake option to enable compiler output of kernel resource usage metrics (llama/15103) 577f7e4 uvos committed on Aug 7, 2025
CUDA: GEMM for FP32/FP16/BF16 and ne11 <= 16 (llama/15131) 1d24833 JohannesGaessler committed on Aug 7, 2025
ggml : fix fallback to CPU for unsupported ops (llama/15118) 2b7ae5e Diego Devesa committed on Aug 6, 2025