FMA introduces flow matching for multi-step cross-modal feature alignment in few-shot learning, using fixed coupling, noise augmentation, and early-stopping to outperform one-step PEFT methods.
Is synthetic data from generative models ready for image recognition?
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 5verdicts
UNVERDICTED 5roles
background 1polarities
background 1representative citing papers
Empirical study of a fully synthetic data generation pipeline for text-based person retrieval that tests its use as a replacement or augmentation for real data across scenarios.
Dense scene composition and instance fidelity in synthetic diffusion images drive better segmentation performance; SENSE framework exploits this to improve models on Cityscapes, COCO, and ADE20K.
A unified synthetic data generation pipeline produces unlimited annotated multimodal video data across multiple tasks, enabling models trained mostly on synthetic data to generalize effectively to real-world video understanding benchmarks.
TaTok is a theoretically grounded adaptive tokenization method that uses global tokens and cumulative conditional entropy filtering to reduce redundancy while improving reconstruction quality over fixed-rate patch tokenization.
citing papers explorer
-
Exploring Cross-Modal Flows for Few-Shot Learning
FMA introduces flow matching for multi-step cross-modal feature alignment in few-shot learning, using fixed coupling, noise augmentation, and early-stopping to outperform one-step PEFT methods.
-
An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval
Empirical study of a fully synthetic data generation pipeline for text-based person retrieval that tests its use as a replacement or augmentation for real data across scenarios.
-
What Makes Synthetic Data Effective in Image Segmentation
Dense scene composition and instance fidelity in synthetic diffusion images drive better segmentation performance; SENSE framework exploits this to improve models on Cityscapes, COCO, and ADE20K.
-
All in One: A Unified Synthetic Data Pipeline for Multimodal Video Understanding
A unified synthetic data generation pipeline produces unlimited annotated multimodal video data across multiple tasks, enabling models trained mostly on synthetic data to generalize effectively to real-world video understanding benchmarks.
-
Mutual Enhancement Between Global Tokens and Patch Tokens: From Theory to Practice
TaTok is a theoretically grounded adaptive tokenization method that uses global tokens and cumulative conditional entropy filtering to reduce redundancy while improving reconstruction quality over fixed-rate patch tokenization.