Is synthetic data from generative models ready for image recognition?

Chuhui Xue; Philip Torr; Ruifei He; Shuyang Sun; Song Bai; Wenqing Zhang; Xiaojuan Qi; Xin Yu

arxiv: 2210.07574 · v2 · pith:N5JSKTAInew · submitted 2022-10-14 · 💻 cs.CV

Ruifei He, Shuyang Sun, Xin Yu, Chuhui Xue, Wenqing Zhang, Philip Torr, Song Bai, Xiaojuan Qi

classification 💻 cs.CV

keywords synthetic · data · models · recognition · images · tasks · generated · generation

Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images. Though the results are astonishing to human eyes, how applicable these generated images are for recognition tasks remains under-explored. In this work, we extensively study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks, and focus on two perspectives: synthetic data for improving classification models in data-scarce settings (i.e., zero-shot and few-shot), and synthetic data for large-scale model pre-training for transfer learning. We showcase the strengths and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data to recognition tasks. Code: https://github.com/CVMI-Lab/SyntheticData.

Forward citations

Cited by 7 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Exploring Cross-Modal Flows for Few-Shot Learning
cs.CV 2025-10 unverdicted novelty 7.0

FMA introduces flow matching for multi-step cross-modal feature alignment in few-shot learning, using fixed coupling, noise augmentation, and early-stopping to outperform one-step PEFT methods.

An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval
cs.CV 2025-03 unverdicted novelty 7.0

Empirical study of a fully synthetic data generation pipeline for text-based person retrieval that tests its use as a replacement or augmentation for real data across scenarios.

What Makes Synthetic Data Effective in Image Segmentation
cs.CV 2026-05 unverdicted novelty 6.0

Dense scene composition and instance fidelity in synthetic diffusion images drive better segmentation performance; SENSE framework exploits this to improve models on Cityscapes, COCO, and ADE20K.

All in One: A Unified Synthetic Data Pipeline for Multimodal Video Understanding
cs.CV 2026-04 unverdicted novelty 6.0

A unified synthetic data generation pipeline produces unlimited annotated multimodal video data across multiple tasks, enabling models trained mostly on synthetic data to generalize effectively to real-world video und...

AC3S: Adaptive Conditioning for 3D-Aware Synthetic Data Generation
cs.CV 2026-06 unverdicted novelty 5.0

AC3S adds a self-supervised visual prompt modulator to ControlNet diffusion and a multi-agent VLM prompt composer to generate photorealistic images with accurate 2D/3D annotations while avoiding over-conditioning.

Personalized Generative Models for Contextual Debiasing
cs.CV 2026-05 unverdicted novelty 5.0

DecoupleGen personalizes diffusion models to create images with uncommon contexts for debiasing object recognition, yielding consistent gains on scene classification tasks.

Mutual Enhancement Between Global Tokens and Patch Tokens: From Theory to Practice
cs.CV 2026-05 unverdicted novelty 5.0

TaTok is a theoretically grounded adaptive tokenization method that uses global tokens and cumulative conditional entropy filtering to reduce redundancy while improving reconstruction quality over fixed-rate patch tok...