Exploration and Exploitation Errors Are Measurable for Language Model Agents • Paper • 2604.13151 • Published 3 days ago • 13
UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding • Paper • 2604.14113 • Published 1 day ago • 8
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration • Paper • 2604.14116 • Published 1 day ago • 7
SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments • Paper • 2604.14144 • Published 1 day ago • 60
OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models • Paper • 2604.10866 • Published 4 days ago • 45
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents • Paper • 2604.07429 • Published 9 days ago • 105
Seedance 2.0: Advancing Video Generation for World Complexity • Paper • 2604.14148 • Published 1 day ago • 102
Accelerating Speculative Decoding with Block Diffusion Draft Trees • Paper • 2604.12989 • Published 3 days ago • 5
Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling • Paper • 2604.05072 • Published 7 days ago • 4
You Only Judge Once: Multi-response Reward Modeling in a Single Forward Pass • Paper • 2604.10966 • Published 4 days ago • 8
Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation • Paper • 2604.13010 • Published 3 days ago • 8
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning • Paper • 2604.12374 • Published 3 days ago • 26
Rethinking the Diffusion Model from a Langevin Perspective • Paper • 2604.10465 • Published 5 days ago • 12
Toward Autonomous Long-Horizon Engineering for ML Research • Paper • 2604.13018 • Published 3 days ago • 28
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe • Paper • 2604.13016 • Published 3 days ago • 71
IceCache: Memory-efficient KV-cache Management for Long-Sequence LLMs • Paper • 2604.10539 • Published 5 days ago • 1