- Kimi Linear: An Expressive, Efficient Attention Architecture • Paper 2510.26692 • Published Oct 30 • 117 upvotes
- The Smol Training Playbook 📚: The secrets to building world-class LLMs • 2.57k likes
- MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention • Paper 2506.13585 • Published Jun 16 • 272 upvotes
- Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding • Paper 2505.22618 • Published May 28 • 44 upvotes
- Inference-Time Hyper-Scaling with KV Cache Compression • Paper 2506.05345 • Published Jun 5 • 27 upvotes