Pith Number

pith:SGUFOMNH

pith:2025:SGUFOMNHKQL4IFMWYUVCLHHQH6
not attested · not anchored · not stored · refs resolved

LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL

Gongrui Zhang, Jie Liu, Kai Yang, Miaosen Zhang, Qipeng Zhu, Xin Geng, Xingzhong Xu, Xu Yang, Yingzhe Peng, Zhiyuan You

Strengthening reasoning first on text data, then transferring it to images, improves 3B multimodal models without extra multimodal data.

arxiv:2503.07536 v2 · 2025-03-10 · cs.CL · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{SGUFOMNHKQL4IFMWYUVCLHHQH6}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Text-based reasoning enhancement enables effective multimodal generalization, offering a data-efficient paradigm that bypasses costly high-quality multimodal training data.

C2 · weakest assumption

That reasoning skills strengthened via text-only rule-based RL transfer to multimodal inputs without substantial interference from visual perception components or degradation of the foundational reasoning.

C3 · one-line summary

A two-stage RL framework first boosts text reasoning in 3B LMMs, then adapts it to multimodal inputs, producing modest benchmark gains of 4.5-4.8%.

References

117 extracted · 117 resolved · 31 Pith anchors

[1] GPT-4 Technical Report · arXiv:2303.08774
[2] Lawrence Zitnick, Devi Parikh, and Dhruv Batra 2015
[3] Flamingo: A visual language model for few-shot learning
[4] Qwen Technical Report 2023 · arXiv:2309.16609
[5] Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond 2023 · arXiv:2308.12966

Formal links

2 machine-checked theorem links

Cited by

35 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:47.471538Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

91a85731a75417c41596c52a259cf03f90f2493edb555761d42ec860aaa2051f

Aliases

arxiv: 2503.07536 · arxiv_version: 2503.07536v2 · doi: 10.48550/arxiv.2503.07536 · pith_short_12: SGUFOMNHKQL4 · pith_short_16: SGUFOMNHKQL4IFMW · pith_short_8: SGUFOMNH
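
The identifiers on this page appear to be related: the Base32 (RFC 4648) encoding of the canonical hash begins with the same 26 characters as the Pith Number, and each short alias is a prefix of that number. The Python sketch below checks this using only values copied from this record. The derivation rule is inferred from these values and is an assumption, not documented Pith behaviour; the function names are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "91a85731a75417c41596c52a259cf03f90f2493edb555761d42ec860aaa2051f"
PITH_NUMBER = "SGUFOMNHKQL4IFMWYUVCLHHQH6"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed rule: Base32-encode the digest bytes and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict[str, str]:
    """Short aliases taken as plain prefixes of the full number (8, 12, 16 characters)."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


if __name__ == "__main__":
    derived = pith_number_from_hash(CANONICAL_HASH)
    print(derived, derived == PITH_NUMBER)
    print(short_aliases(derived))
```

If the rule holds for other records as well, a mirror could recompute the number from the hash alone, but that generalisation is not confirmed by this page.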
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/SGUFOMNHKQL4IFMWYUVCLHHQH6 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 91a85731a75417c41596c52a259cf03f90f2493edb555761d42ec860aaa2051f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "153a64ed5dc6d65cf29c723c8da3f7bac130de117b7afe63774316ea84075c24",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-03-10T17:04:14Z",
    "title_canon_sha256": "43e5c65607dd7bda3cdd188b43717b8d938f768ed61b7ba0f73373e48bba2933"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2503.07536",
    "kind": "arxiv",
    "version": 2
  }
}
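
For offline checks, the canonicalisation step of the shell one-liner above can be sketched as a small Python module: serialise the record with sorted keys, compact separators and unescaped non-ASCII text, then hash the UTF-8 bytes with SHA-256. The function names are hypothetical, and the sketch assumes the record shown on this page is byte-for-byte what the API returns under `canonical_record`. It prints the comparison against the digest listed on this page rather than asserting it.

```python
import hashlib
import json

# Digest listed on this page under "Canonical hash".
EXPECTED = "91a85731a75417c41596c52a259cf03f90f2493edb555761d42ec860aaa2051f"

# Canonical record copied from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "153a64ed5dc6d65cf29c723c8da3f7bac130de117b7afe63774316ea84075c24",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2025-03-10T17:04:14Z",
        "title_canon_sha256": "43e5c65607dd7bda3cdd188b43717b8d938f768ed61b7ba0f73373e48bba2933",
    },
    "schema_version": "1.0",
    "source": {"id": "2503.07536", "kind": "arxiv", "version": 2},
}


def canonical_bytes(record: dict) -> bytes:
    """Sorted keys, no insignificant whitespace, UTF-8: same options as the one-liner."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_sha256(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialisation."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_sha256(RECORD)
    print(digest)
    print("matches page:", digest == EXPECTED)
```

Because keys are sorted recursively before hashing, the digest does not depend on the key order in which a mirror happens to store the record.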