Learning transferable visual models from natural language supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever · 2021

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

Stitched Value Model for Diffusion Alignment

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

StitchVM stitches clean-image reward models with diffusion backbones to enable efficient value estimation for noisy latents, speeding up diffusion alignment methods like DPS by 3.2x and halving memory.

Dictionary-Aligned Concept Control for Safeguarding Multimodal LLMs

cs.LG · 2026-04-10 · unverdicted · novelty 6.0

DACO curates a 15,000-concept dictionary from 400K image-caption pairs and uses it to initialize an SAE that enables granular, concept-specific steering of MLLM activations, raising safety scores on MM-SafetyBench and JailBreakV while preserving general capabilities.

citing papers explorer

Showing 2 of 2 citing papers.

Stitched Value Model for Diffusion Alignment cs.CV · 2026-05-19 · unverdicted · none · ref 49
StitchVM stitches clean-image reward models with diffusion backbones to enable efficient value estimation for noisy latents, speeding up diffusion alignment methods like DPS by 3.2x and halving memory.
Dictionary-Aligned Concept Control for Safeguarding Multimodal LLMs cs.LG · 2026-04-10 · unverdicted · none · ref 83
DACO curates a 15,000-concept dictionary from 400K image-caption pairs and uses it to initialize an SAE that enables granular, concept-specific steering of MLLM activations, raising safety scores on MM-SafetyBench and JailBreakV while preserving general capabilities.

Learning transferable visual models from natural language supervision

fields

years

verdicts

representative citing papers

citing papers explorer