ApsaraStackMaaS/EvoQwen2.5-VL-Retriever-7B-v1 • Visual Document Retrieval • 8B • Updated Nov 4 • 75 • 17
ApsaraStackMaaS/EvoQwen2.5-VL-Retriever-3B-v1 • Visual Document Retrieval • 4B • Updated Nov 4 • 8.52k • 12
VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models • Paper • 2509.19803 • Published Sep 24 • 120
PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning • Paper • 2508.21104 • Published Aug 28 • 35
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR • Paper • 2508.14029 • Published Aug 19 • 118