WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent Paper • 2508.05748 • Published Aug 7 • 141
OneThinker: All-in-one Reasoning Model for Image and Video Paper • 2512.03043 • Published 26 days ago • 32
ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping Paper • 2510.08457 • Published Oct 9 • 12
ARES Collection 🌴ARES is an open-source framework for adaptive multimodal reasoning, using difficulty-aware training and entropy-shaped policy optimization. • 5 items • Updated Oct 12 • 2
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning Paper • 2509.03646 • Published Sep 3 • 32
Reconstruction Alignment Improves Unified Multimodal Models Paper • 2509.07295 • Published Sep 8 • 40
Interleaving Reasoning for Better Text-to-Image Generation Paper • 2509.06945 • Published Sep 8 • 14
Revisual-R1 Collection 🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal reinforcement. • 6 items • Updated Oct 9 • 3