ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.RO (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
CLAMP: Contrastive Learning for 3D Multi-View Action-Conditioned Robotic Manipulation Pretraining
CLAMP pretrains 3D multi-view encoders with contrastive learning on point clouds and actions, then initializes diffusion policies for more sample-efficient fine-tuning on robotic tasks.