Learning transferable visual models from natural language supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever · 2021

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset 1

citation-polarity summary

background 1

representative citing papers

CodeMMR: Bridging Natural Language, Code, and Image for Unified Retrieval

cs.SE · 2026-04-17 · unverdicted · novelty 7.0

CodeMMR creates a unified embedding space for text, code, and images, outperforming baselines by 10 nDCG@10 points and boosting RAG code generation quality.

Synthetic Data Alone is Enough? Rethinking Data Scarcity in Pediatric Rare Disease Recognition

cs.CV · 2026-05-21 · unverdicted · novelty 5.0

Synthetic facial images alone can train models for pediatric rare disease recognition to performance levels comparable to real-data baselines when generated at sufficient scale.

AgroCoT: A Chain-of-Thought Benchmark for Evaluating Reasoning in Vision-Language Models for Agriculture

cs.AI · 2025-11-28 · unverdicted · novelty 5.0

AgroCoT is a new Chain-of-Thought VQA benchmark with 4759 samples to evaluate reasoning capabilities of vision-language models in agriculture.

citing papers explorer

Showing 3 of 3 citing papers.

CodeMMR: Bridging Natural Language, Code, and Image for Unified Retrieval

cs.SE · 2026-04-17 · unverdicted · none · ref 23

CodeMMR creates a unified embedding space for text, code, and images, outperforming baselines by 10 nDCG@10 points and boosting RAG code generation quality.

Synthetic Data Alone is Enough? Rethinking Data Scarcity in Pediatric Rare Disease Recognition

cs.CV · 2026-05-21 · unverdicted · none · ref 22

Synthetic facial images alone can train models for pediatric rare disease recognition to performance levels comparable to real-data baselines when generated at sufficient scale.

AgroCoT: A Chain-of-Thought Benchmark for Evaluating Reasoning in Vision-Language Models for Agriculture

cs.AI · 2025-11-28 · unverdicted · none · ref 35

AgroCoT is a new Chain-of-Thought VQA benchmark with 4759 samples to evaluate reasoning capabilities of vision-language models in agriculture.
