Visual embodied brain: Let multimodal large language models see, think, and control in spaces

7 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.CV 4, cs.RO 3

years: 2026 4, 2025 3

roles: background 1

polarities: background 1

representative citing papers

SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation

cs.CV · 2026-05-21 · unverdicted · novelty 7.0

SpaceDG introduces the first large-scale degradation-aware spatial reasoning dataset using 3D Gaussian Splatting synthesis, showing that visual degradations impair MLLM performance but finetuning on the data improves robustness and can exceed human levels under degradation.

Token Warping Helps MLLMs Look from Nearby Viewpoints

cs.CV · 2026-04-03 · unverdicted · novelty 7.0

Backward token warping in ViT-based MLLMs enables reliable reasoning from nearby viewpoints by preserving semantic coherence better than pixel-wise warping or fine-tuning baselines.

MiMo-Embodied: X-Embodied Foundation Model Technical Report

cs.RO · 2025-11-20 · unverdicted · novelty 6.0

MiMo-Embodied is a single foundation model that achieves state-of-the-art results on 17 embodied AI benchmarks and 12 autonomous driving benchmarks through multi-stage learning, curated data, and CoT/RL fine-tuning that produces positive cross-domain transfer.
