CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation Paper • 2602.24286 • Published 25 days ago • 97
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published Mar 13, 2025 • 30