koutch/qwen3-thinking-4b_train_grpo_v1_train_no_think Text Generation • 4B • Updated 11 days ago • 84
koutch/qwen3-instruct-4b_train_grpo_v1_train_no_think Text Generation • 4B • Updated 12 days ago • 64