Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

fields: cs.CV 6, cs.AI 2

years: 2026 6, 2025 2

roles: background 3

polarities: background 3

representative citing papers

OneThinker: All-in-one Reasoning Model for Image and Video

cs.CV · 2025-12-02 · unverdicted · novelty 5.0

OneThinker unifies image and video reasoning in one model across 10 tasks via a 600k corpus, CoT-annotated SFT, and EMA-GRPO reinforcement learning, reporting strong results on 31 benchmarks plus some cross-task transfer.
