BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation • Paper • 2603.25732 • Published 13 days ago • 11
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? • Paper • 2603.24472 • Published 14 days ago • 51
Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty • Paper • 2603.15500 • Published 23 days ago • 11
Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems • Paper • 2603.07779 • Published Mar 8 • 5
Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models • Paper • 2603.07777 • Published Mar 8 • 5
Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity • Paper • 2603.05168 • Published Mar 5 • 5
Scaling Agentic Capabilities, Not Context: Efficient Reinforcement Finetuning for Large Toolspaces • Paper • 2603.06713 • Published Mar 5 • 16
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use • Paper • 2603.03205 • Published Mar 3 • 13
Self-Hinting Language Models Enhance Reinforcement Learning • Paper • 2602.03143 • Published Feb 3 • 31
Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability • Paper • 2602.02477 • Published Feb 2 • 11
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge • Paper • 2601.08808 • Published Jan 13 • 39