pith:27P4QCOH
NoisyGRPO: Incentivizing Multimodal CoT Reasoning via Noise Injection and Bayesian Estimation
Noise injection into visual inputs and Bayesian advantage estimation improve generalization in multimodal chain-of-thought reasoning.
arxiv:2510.21122 v3 · 2025-10-24 · cs.CV
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{27P4QCOHDNSQZP674SDDZJAAFE}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Experiments on standard CoT quality, general capability, and hallucination benchmarks demonstrate that NoisyGRPO substantially improves generalization and robustness, especially in RL settings with small-scale MLLMs such as Qwen2.5-VL 3B.
The method assumes that the injected Gaussian noise level can be used directly as a prior in a Bayesian model whose posterior advantage estimate reliably prefers visually grounded trajectories over those that succeed only under noise. This premise is invoked in the description of Bayesian Advantage Estimation without further justification of the likelihood model or prior calibration.
NoisyGRPO is an RL framework that perturbs visual inputs with Gaussian noise for exploration and computes trajectory advantages via Bayesian posterior fusion of noise prior and reward likelihood to improve multimodal CoT generalization.
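As a rough illustration of the idea summarized above, the following Python sketch perturbs an input with Gaussian noise, treats the noise level as a Gaussian prior over trajectory quality, treats the observed reward as a Gaussian observation, and fuses the two by precision weighting before a GRPO-style group normalization. The function names, the prior mean of `1 - sigma`, and the variance values are assumptions made for illustration; this record does not specify the paper's likelihood model or prior calibration.

```python
import random
import statistics


def add_gaussian_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise to pixel values in [0, 1], then clip back to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


def posterior_mean(prior_mean, prior_var, reward, reward_var):
    """Precision-weighted fusion of a Gaussian prior with one Gaussian observation."""
    prior_prec = 1.0 / prior_var
    obs_prec = 1.0 / reward_var
    return (prior_prec * prior_mean + obs_prec * reward) / (prior_prec + obs_prec)


def bayesian_advantages(noise_levels, rewards, prior_var=0.25, reward_var=0.25):
    """Fuse a noise-derived prior with rewards, then normalize within the group."""
    # Assumption: a trajectory sampled under less noise gets a higher prior mean.
    scores = [
        posterior_mean(1.0 - sigma, prior_var, reward, reward_var)
        for sigma, reward in zip(noise_levels, rewards)
    ]
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    if std == 0.0:
        return [0.0 for _ in scores]
    # GRPO-style normalization: subtract the group mean, divide by the group std.
    return [(s - mean) / std for s in scores]


rng = random.Random(0)
noisy = add_gaussian_noise([0.0, 0.5, 1.0], sigma=0.1, rng=rng)

# Three trajectories: clean and correct, noisy and correct, very noisy and wrong.
advantages = bayesian_advantages([0.0, 0.5, 1.0], [1.0, 1.0, 0.0])
print(advantages)
```

Under these assumed settings, a correct answer produced from a clean image receives a larger advantage than an equally rewarded answer produced under heavier noise, which is the preference the claim above describes.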
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-06-08T01:03:50.303122Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
d7dfc809c71b650cbfdfe4863ca4002937cc5242ed4283c9829873f921addd41
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/27P4QCOHDNSQZP674SDDZJAAFE \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: d7dfc809c71b650cbfdfe4863ca4002937cc5242ed4283c9829873f921addd41
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "a4f0131f9e76647b87de5c63a815a71e55aef40f8f739a33f990cc727e40a340",
"cross_cats_sorted": [],
"license": "http://creativecommons.org/licenses/by/4.0/",
"primary_cat": "cs.CV",
"submitted_at": "2025-10-24T03:23:34Z",
"title_canon_sha256": "6d195fa51f33faadd7c1379b4d88688ddf3cbbf68b8dcd5aedc87f7aad6e88f8"
},
"schema_version": "1.0",
"source": {
"id": "2510.21122",
"kind": "arxiv",
"version": 3
}
}
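The same check can be run offline against the record shown above. The sketch below mirrors the serialization used in the shell one-liner (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding) and compares the SHA-256 digest with the canonical hash listed on this page. The function name `canonical_hash` is a hypothetical helper, not part of any Pith tooling, and the comparison only holds if the record is reproduced exactly as served.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "a4f0131f9e76647b87de5c63a815a71e55aef40f8f739a33f990cc727e40a340",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2025-10-24T03:23:34Z",
        "title_canon_sha256": "6d195fa51f33faadd7c1379b4d88688ddf3cbbf68b8dcd5aedc87f7aad6e88f8",
    },
    "schema_version": "1.0",
    "source": {"id": "2510.21122", "kind": "arxiv", "version": 3},
}

# Canonical hash as listed on this page.
EXPECTED = "d7dfc809c71b650cbfdfe4863ca4002937cc5242ed4283c9829873f921addd41"

digest = canonical_hash(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Sorting keys and fixing the separators makes the digest independent of key order and whitespace in the fetched JSON, so the hash depends only on the record's content.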