Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies Paper • 2512.19673 • Published 3 days ago • 57
DualTHOR: A Dual-Arm Humanoid Simulation Platform for Contingency-Aware Planning Paper • 2506.16012 • Published Jun 19 • 22
Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents Paper • 2505.02156 • Published May 4 • 18