whisper.cpp / ggml / src / ggml-metal

Commit History

llama : add gpt-oss (llama/15091)
bf225d6

ggerganov, ngxson (HF Staff), slaren committed

metal : fix fusion across different encoders (llama/14849)
17d67da

ggerganov committed

metal : fuse add, mul + add tests (llama/14596)
66ae493

ggerganov committed

metal : Add missing unary ops Metal support (llama/14660)
2ed022e

Yavor Ivanov committed

ggml : add ggml_scale_bias (llama/14417)
573d50a

ngxson (HF Staff) committed

metal : disable fast math in all quantize kernels (llama/14528)
df9d510

ggerganov committed

ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445)
f798922

Sigbjørn Skjæret committed

ggml : fix FA mask dim 2 and 3 (llama/14505)
a89dc81

ggerganov committed

llama : initial Mamba-2 support (llama/9126)
1b4087e

compilade committed

ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (llama/14435)
ebacb3e

ggerganov committed

ci : disable fast-math for Metal GHA CI (llama/14478)
ec4b1b3

ggerganov committed

metal : disable fast-math for some cpy kernels (llama/14460)
9d1185a

ggerganov committed

metal : add special-case mat-vec mul for ne00 == 4 (llama/14385)
724622d

ggerganov committed

metal : batch rows copy in a single threadgroup (llama/14384)
b4ff704

ggerganov committed

metal : fix thread-safety (llama/14300)
2bd85b6

ggerganov committed

metal : add mean kernel (llama/14267)
a726ecc

ggerganov committed

cmake : handle whitepsaces in path during metal build (llama/14126)
8076017

ggerganov, danbev committed

metal : use less stack memory in FA kernel (llama/14088)
014afb6

ggerganov committed

metal : use F32 accumulators in FA kernels (llama/13975)
b86860f

ggerganov committed

ggml : add ggml_gelu_erf() (llama/13667)
6c9cd9a

ngxson (HF Staff) committed

metal : fix typo in FA kernel comments (llama/13651)
4c32ada

ggerganov committed

metal : add FA-vec kernel for head size 64 (llama/13583)
36a3b4e

ggerganov committed

metal : use FA-vec kernel up to batch size 20 (llama/13496)
e925f17

ggerganov committed

metal : optimize multi-sequence FA vec kernel (llama/13493)
d2f915d

ggerganov committed

ggml : add mrope kernel for metal (llama/13457)
27b32e6

ngxson (HF Staff) committed

metal : optimize MoE for large batches (llama/13388)
d51c0d3

ggerganov committed

metal : fix floating-point range of attention scores in FA kernels (llama/13090)
e093044

ggerganov committed

metal: add neg operator (llama/13029)
42283e1

jmorganca committed

graph : make FA compatible with MLA + add initial Metal kernels (llama/12953)
fb0d243

ggerganov committed

metal : add FA-vec kernels for head size 96 (llama/12952)
f1f88b8

ggerganov committed

llama : fix FA when KV cache is not used (i.e. embeddings) (llama/12825)
e7cb2dc

ggerganov committed

ggml : add bilinear upscale support (ggml/1185)
4c5e449

Diego Devesa committed

metal : use F32 prec in FA kernels (llama/12688)
a49f5c2

ggerganov committed

metal : use constexpr in FA kernels + fix typedef (llama/12659)
c699617

ggerganov committed

metal : improve FA + improve MoE (llama/12612)
04a3389

ggerganov committed

metal : refactor mat-vec code (llama/12569)
71d72f9

ggerganov committed

llama: Add support for RWKV v7 architecture (llama/12412)
727de7e

mollysama committed

metal : Cache the Metal library at the device context level (llama/12265)
e3908a2

BB-fat committed

ggml : skip intermediate .air file when compiling .metallib (llama/12247)
32b6ec3

danbev committed

metal : simplify kernel arguments using a struct (ggml/3229) (llama/12194)
092277a

BB-fat, alexju committed

metal : fix default.metallib build (llama/12224)
838efb6

danbev committed

ggml : fix GGMLMetalClass ODR (llama/12200)
2094cb7

pacominev committed

cuda/cpu: Increase support for fp16 unary operations (ggml/1125)
67e8c32

cmdr2 committed

metal : copy kernels for quant to F32/F16 conversions (llama/12017)
6c8e7ec

Garf, ggerganov committed

metal : fix the crash caused by the lack of residency set support on Intel Macs. (llama/11904)
afbd891

Hale Chan committed

metal : optimize dequant q6_K kernel (llama/11892)
376cbe6

Adrian Kretz committed

repo : update links to new url (llama/11886)
9705bb5

ggerganov committed