Easyr1: An efficient, scalable, multi-modality rl training framework

Yaowei Zheng, Junting Lu, Shenzhi Wang, Zhangchi Feng andDongdong Kuang, Yuwen Xiong · 2025

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

cs.AI · 2025-03-07 · conditional · novelty 7.0

RL on Qwen2-VL-2B with SAT dataset produces R1-like reasoning and 59.47% CVBench accuracy, outperforming base model by ~30% and SFT by ~2%.

Showing 1 of 1 citing paper.

R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model cs.AI · 2025-03-07 · conditional · none · ref 20
RL on Qwen2-VL-2B with SAT dataset produces R1-like reasoning and 59.47% CVBench accuracy, outperforming base model by ~30% and SFT by ~2%.