RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards Paper • 2605.10899 • Published 7 days ago • 72
MolmoAct2: Action Reasoning Models for Real-world Deployment Paper • 2605.02881 • Published 14 days ago • 328