SDXL: Improving latent diffusion models for high-resolution image synthesis
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
- Golden RPG: Confidence-Adaptive Region-Aware Noise for Compositional Text-to-Image Generation
Golden RPG improves compositional text-to-image generation via a region-aware noise predictor with per-region FiLM adapters, injected cross-attention, and adaptive blending, yielding higher cross-region coherence on benchmarks while matching baselines on CLIP metrics.