- Kimi Linear: An Expressive, Efficient Attention Architecture • Paper 2510.26692 • Published Oct 30 • 117 upvotes
- The Smol Training Playbook 📚: The secrets to building world-class LLMs • 2.57k likes
- MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention • Paper 2506.13585 • Published Jun 16 • 272 upvotes
- Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding • Paper 2505.22618 • Published May 28 • 44 upvotes
- Inference-Time Hyper-Scaling with KV Cache Compression • Paper 2506.05345 • Published Jun 5 • 27 upvotes