Too Good to be Bad: On the Failure of LLMs to Role-Play Villains Paper • 2511.04962 • Published Nov 7, 2025 • 53
Annotation-Efficient Universal Honesty Alignment Paper • 2510.17509 • Published Oct 20, 2025 • 21
mangopy/OpenReward-Qwen2.5-3B-instruct-half-correct-half-wrong-84-step 3B • Updated Sep 25, 2025 • 7
mangopy/OpenReward-Qwen2.5-7B-instruct-half-correct-half-wrong-84-step 8B • Updated Sep 22, 2025 • 8
RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents Paper • 2507.03112 • Published Jul 3, 2025 • 32