A two-stage RL framework first boosts text reasoning in 3B LMMs then adapts it to multimodal inputs, producing modest benchmark gains of 4.5-4.8%.
Magebench: Bridging large multimodal models to agents,
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
dataset 1
citation-polarity summary
years
2025 2verdicts
UNVERDICTED 2roles
dataset 1polarities
use dataset 1representative citing papers
LEO-RobotAgent is a general-purpose framework that enables LLMs to independently plan, use tools, and collaborate with humans while operating multiple robot types for unpredictable tasks.
citing papers explorer
-
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL
A two-stage RL framework first boosts text reasoning in 3B LMMs then adapts it to multimodal inputs, producing modest benchmark gains of 4.5-4.8%.
-
LEO-RobotAgent: A General-purpose Robotic Agent for Language-driven Embodied Operator
LEO-RobotAgent is a general-purpose framework that enables LLMs to independently plan, use tools, and collaborate with humans while operating multiple robot types for unpredictable tasks.