Robots discover causal tool features through VLM suggestions and physics-based counterfactual perturbations in simulation, then transfer manipulation skills via conditioned keypoint matching.
Manipvqa: Injecting robotic affordance and physically grounded information into multi-modal large language models.CoRR, abs/2403.11289
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 2
citation-polarity summary
verdicts
UNVERDICTED 2roles
background 2polarities
background 2representative citing papers
HybridVLA unifies diffusion and autoregression in a single VLA model via collaborative training and ensemble to raise robot manipulation success rates by 14% in simulation and 19% in real-world tasks.
citing papers explorer
-
Creative Robot Tool Use by Counterfactual Reasoning
Robots discover causal tool features through VLM suggestions and physics-based counterfactual perturbations in simulation, then transfer manipulation skills via conditioned keypoint matching.
-
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
HybridVLA unifies diffusion and autoregression in a single VLA model via collaborative training and ensemble to raise robot manipulation success rates by 14% in simulation and 19% in real-world tasks.