R1-V: Reinforcing super generalization ability in vision-language models with less than $3. https://github.com/Deep-Agent/R1-V, 2025

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

fields

cs.AI: 1 · cs.CV: 1

years

2025: 2

representative citing papers

GRIT: Teaching MLLMs to Think with Images

cs.CV · 2025-05-21 · unverdicted · novelty 7.0

GRIT introduces a grounded reasoning paradigm for MLLMs where reasoning chains interleave text and bounding boxes, trained via GRPO-GR reinforcement learning on as few as 20 examples without annotations.

citing papers explorer

Showing 2 of 2 citing papers.

  • GRIT: Teaching MLLMs to Think with Images cs.CV · 2025-05-21 · unverdicted · none · ref 16

    GRIT introduces a grounded reasoning paradigm for MLLMs where reasoning chains interleave text and bounding boxes, trained via GRPO-GR reinforcement learning on as few as 20 examples without annotations.

  • R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model cs.AI · 2025-03-07 · conditional · none · ref 1

    RL on Qwen2-VL-2B with the SAT dataset produces R1-like reasoning and 59.47% CVBench accuracy, outperforming the base model by ~30% and SFT by ~2%.