PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
3 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (3)
citing papers explorer
- Setting the Stage: Text-Driven Scene-Consistent Image Generation
  A new data pipeline using real photos, entity removal, and image-to-video models plus a cross-view attention loss enables text-driven generation of actors in reference scenes with improved alignment.
- TokenTrace: Multi-Concept Attribution through Watermarked Token Recovery
  TokenTrace watermarks diffusion generations by jointly perturbing prompt embeddings and latent noise, enabling query-driven recovery of multiple independent concepts from one image.
- PSR: Scaling Multi-Subject Personalized Image Generation with Pairwise Subject-Consistency Rewards
  A data-generation pipeline plus pairwise subject-consistency rewards in RL improve consistency and prompt adherence for multi-subject personalized image generation.