MoonSeg3R is the first method for online monocular 3D instance segmentation, achieving performance competitive with RGB-D systems by using CUT3R priors for geometric consistency and temporal query memory.
Learning transferable visual models from natural language supervision
3 Pith papers cite this work.
Representative citing papers
CalibAll estimates camera extrinsics on existing datasets to convert robot actions into a unified camera-frame representation, enabling stronger cross-embodiment pretraining.