GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning Paper • 2506.16141 • Published Jun 19, 2025 • 27