Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
2 Pith papers cite this work.
Fields: cs.RO (2)
Citing papers
- Grasp as You Dream: Imitating Functional Grasping from Generated Human Demonstrations
  GraspDreamer synthesizes human functional grasping demonstrations with visual generative models to enable zero-shot robot grasping with improved data efficiency and generalization.
- ESPADA: Execution Speedup via Semantics Aware Demonstration Data Downsampling for Imitation Learning
  ESPADA uses semantic segmentation from VLMs and LLMs plus DTW to downsample non-critical segments in demonstrations, delivering about 2x faster robot execution in behavior cloning while maintaining task success rates.