DPOK: Reinforcement learning for fine-tuning text-to-image diffusion models. Advances in Neural Information Processing Systems, 36:79858–79885, 2023.
2 Pith papers cite this work.
years: 2025 (2 papers)
representative citing papers
- PSR: Scaling Multi-Subject Personalized Image Generation with Pairwise Subject-Consistency Rewards
  A data-generation pipeline plus pairwise subject-consistency rewards in RL improve consistency and prompt adherence for multi-subject personalized image generation.
- Designing Instance-Level Sampling Schedules via REINFORCE with James-Stein Shrinkage
  Instance-level sampling schedules optimized via REINFORCE with a James-Stein estimator improve text-to-image alignment and allow 5-step Flux generation to match deliberately distilled samplers.