Learning transferable visual models from natural language supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. · 2021

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Geometry-aware 4D Video Generation for Robot Manipulation

cs.CV · 2025-07-01 · unverdicted · novelty 5.0

A geometry-aware 4D video generation model trained with cross-view pointmap alignment to produce spatio-temporally consistent future videos from novel viewpoints for robot manipulation.

SMPL-GPTexture: Dual-View 3D Human Texture Estimation using Text-to-Image Generation Models

cs.GR · 2025-04-17 · unverdicted · novelty 5.0

SMPL-GPTexture uses text-to-image generation to produce dual-view human images, aligns them to SMPL meshes via 2D-to-3D recovery, projects colors to UV space, and applies diffusion inpainting to create full high-resolution textures aligned to user prompts.

citing papers explorer

Showing 2 of 2 citing papers.

Geometry-aware 4D Video Generation for Robot Manipulation

cs.CV · 2025-07-01 · unverdicted · none · ref 44

SMPL-GPTexture: Dual-View 3D Human Texture Estimation using Text-to-Image Generation Models

cs.GR · 2025-04-17 · unverdicted · none · ref 16
