MolmoAct: Action Reasoning Models that can Reason in Space

Jason Lee, Jiafei Duan, Haoquan Fang, Yuquan Deng, Shuo Liu, Boyang Li, Bohan Fang, Jieyu Zhang, Yi Ru Wang, Sangho Lee, Winson Han, Wilbert Pumacay, Angelica Wu, Rose Hendrix, Karen Farley, Eli VanderBilt, Ali Farhadi, Dieter Fox, Ranj · 2025

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

DIAL: Decoupling Intent and Action via Latent World Modeling for End-to-End VLA

cs.RO · 2026-03-31 · unverdicted · novelty 6.0

DIAL decouples intent from action in end-to-end VLAs using a latent visual foresight bottleneck and two-stage training, reaching SOTA on RoboCasa with 10x fewer demonstrations and zero-shot real-world transfer.

Action Hallucination in Generative Vision-Language-Action Models

cs.RO · 2026-02-06 · conditional · novelty 6.0

Generative VLAs hallucinate physically invalid actions due to topological, precision, and horizon mismatches between model architectures and feasible robot behavior.

citing papers explorer


DIAL: Decoupling Intent and Action via Latent World Modeling for End-to-End VLA

cs.RO · 2026-03-31 · unverdicted · none · ref 31

Action Hallucination in Generative Vision-Language-Action Models

cs.RO · 2026-02-06 · conditional · none · ref 26

