Commit History
984d583  ruby : add .gitignore entries for ext directory (#3245)  (unverified)
7e96237  ci : update windows runner to windows-2022 (#3242)  (unverified)
f6dc2ad  ruby : add cleaning of library names in dependencies (#3241)  (unverified)
d47070d  ggml : fix weak alias win32 (#0)
4043835  android : fix builds (#0)
a890a8c  sync : ggml
c1c9908  files : remove old sources (part 2)
43cbdf7  sync : ggml
e4ae8c6  files : remove old sources
5ef1601  talk-llama : sync llama.cpp
6ac9e73  sync : ggml
014afb6  metal : use less stack memory in FA kernel (llama/14088)
8c833e9  ggml-cpu : split arch-specific implementations (llama/13892)
8f2e8d6  cuda : fix device sync on buffer clear (llama/14033)  (Diego Devesa)
f1535d7  CANN: Simplify the environment variable setting (#13104)
56f0e48  sycl: Add reorder to Q6_K mmvq implementation (llama/13885)  (Nicolò Scipione)
747ad97  cuda : fix buffer type check with integrated GPUs (llama/14069)  (Diego Devesa)
4c88a27  SYCL: Implement few same quantized type copy kernels (llama/13739)  (Akarshan Biswas)
e5107fe  vulkan: Enable VK_KHR_cooperative_matrix extension for Intel Xe2 GPUs (llama/14001)
f0a0ac8  llama : allow using mmap without PrefetchVirtualMemory, apply GGML_WIN_VER to llama.cpp sources (llama/14013)  (Diego Devesa)
00a9e2f  vulkan: automatically deduce size of push constants (llama/13936)
32985b0  ggml-vulkan: adds support for op CONV_TRANSPOSE_1D (llama/13813)
9896625  releases : use dl backend for linux release, remove arm64 linux release (llama/13996)  (Diego Devesa)
40fc316  CUDA: fix FTZ in FA for Gemma 3 (llama/13991)
11bac96  vulkan: fix warnings in perf logger querypool code (llama/13937)
a9ce9a8  opencl: add `backend_synchronize` (llama/13939)  (lhez)
5ff8785  OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat (llama/13840)  (rmatif)
b86860f  metal : use F32 accumulators in FA kernels (llama/13975)
bc1415b  cmake : Handle mixed-case 'Power' strings in POWER CPU detection (llama/13966)
c4e62cd  sycl: quantize and reorder the input to q8_1 when reorder is enabled (llama/13826)  (Atharva Dubey, Alberto Cabrera Pérez)
73547ad  gguf: fix failure on version == 0 (llama/13956)
a2e9ccb  ggml: check if non-native endian model is being loaded (llama/13943)
faef029  Add in-build ggml::ggml ALIAS library (ggml/1260)  (Kai Pastor)
63cab25  ruby : output format (#3237)  (unverified)
2c4b2dd  ci : build and publish main-intel image (#3231)  (藍+85CD, unverified)
23d5a5c  docker : add main-intel dockerfile (#3229)  (藍+85CD, unverified)
acad667  ruby : Add parallel transcription support (#3222)  (unverified)
17ba7f5  ci : add mirror for ports.ubuntu.com (ARM packages) (#3221)  (unverified)
18fb7d6  bindings.java : apply whisperParams in fullTranscribeWithTime instead of ignoring them (#3201)  (Joas Dev, unverified)
90efe84  musa: correct MUSA SDK rc4.0.1 download URL (#3217)  (R0CKSTAR, unverified)
62dd144  ci : use mirrors.kernel.org for Ubuntu packages (#3220)  (unverified)
9994342  node : add language detection support (#3190)  (unverified)
58220b6  talk-llama : sync llama.cpp
337f4d9  sync : ggml
d5d55f2  threading: support for GGML_SCHED_PRIO_LOW, update thread info on Windows to avoid throttling (llama/12995)  (Max Krasnyansky, Diego Devesa)
a75e157  CUDA: add a prop in ggml_cuda_device_infor for distinguish iGPU or dGPU in cuda (#13856) (llama/13895)
6fb9674  CUDA: fix typo in FlashAttention code (llama/13926)
1c0a5c0  sched : avoid changing cur_copy when a graph is already allocated (llama/13922)  (Diego Devesa)
6b6155b  cuda : prevent using split buffers with 3d/4d matrices (llama/13919)  (Diego Devesa)