Learning transferable visual models from natural language supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. · 2021

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MoonSeg3R: Monocular Online Zero-Shot Segment Anything in 3D with Reconstructive Foundation Priors

cs.CV · 2025-12-17 · unverdicted · novelty 7.0

MoonSeg3R is the first method for online monocular 3D instance segmentation, achieving performance competitive with RGB-D systems by using CUT3R priors for geometric consistency and temporal query memory.

Unify Robot Actions in Camera Frame

cs.RO · 2025-11-21 · conditional · novelty 6.0

CalibAll estimates camera extrinsics on existing datasets to convert robot actions into a unified camera-frame representation, enabling stronger cross-embodiment pretraining.

Concept-wise Attention for Fine-grained Concept Bottleneck Models

cs.CV · 2026-04-17

citing papers explorer

Showing 1 of 1 citing paper after filters.

Concept-wise Attention for Fine-grained Concept Bottleneck Models

cs.CV · 2026-04-17 · unreviewed · ref 24
