HELIOS: Hierarchical Exploration for Language-grounded Interaction in Open Scenes. ArXiv, abs/2509.22498, 2025.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.RO (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- MIRTH: Mutual-Information Reasoning with Temporal Hubs for Vision-Language-Action Agents
MIRTH introduces dual-scale temporal hubs, mutual-information-based latent reasoning tokens, and parallel decoding to vision-language-action (VLA) backbones, claiming state-of-the-art results and error recovery on LIBERO and real LeRobot setups.
- Expanding Spatial and Temporal Context for Robotic Imitation Learning With Scene Graphs
Dynamic scene graphs serve as explicit memory to improve imitation learning policies for spatial-temporal reasoning under partial observability in mobile and tabletop manipulation.