Learning transferable visual models from natural language supervision
3 Pith papers cite this work.
citing papers explorer
- Concept-wise Attention for Fine-grained Concept Bottleneck Models
CoAt-CBM improves fine-grained concept alignment in CBMs by using adaptive visual queries per concept and a contrastive loss that respects relative concept importance instead of independent BCE.
- MoonSeg3R: Monocular Online Zero-Shot Segment Anything in 3D with Reconstructive Foundation Priors
MoonSeg3R is the first method for online monocular 3D instance segmentation, achieving performance competitive with RGB-D systems by using CUT3R priors for geometric consistency and temporal query memory.
- Unify Robot Actions in Camera Frame
CalibAll estimates camera extrinsics on existing datasets to convert robot actions into a unified camera-frame representation, enabling stronger cross-embodiment pretraining.