Using human feedback to fine-tune diffusion models without any reward model

3 Pith papers cite this work. Polarity classification is still indexing.

citation summary

years: 2026 (2), 2025 (1)
verdicts: unverdicted (3)
roles: baseline (1)
polarities: baseline (1)

representative citing papers

Dual-Diffusional Generative Fashion Recommendation

cs.IR · 2026-05-17 · unverdicted · novelty 6.0

DualFashion introduces a dual-diffusion Transformer with image and text branches that generates both visual items and semantic descriptions for explainable personalized fashion recommendation.

VASR: Variance-Aware Systematic Resampling for Reward-Guided Diffusion

cs.AI · 2026-04-08 · unverdicted · novelty 6.0 · 2 refs

VASR separates continuation and residual variance in reward-guided diffusion SMC, using optimal mass allocation and systematic resampling to achieve up to 26% better FID scores and faster runtimes than prior SMC and MCTS methods.

BalancedDPO: Adaptive Multi-Metric Alignment

cs.CV · 2025-03-16 · unverdicted · novelty 4.0

BalancedDPO applies majority-vote consensus from multiple preference scorers and dynamic reference model updates within DPO to achieve multi-metric alignment for text-to-image diffusion models, reporting improved win rates on Pick-a-Pic, PartiPrompt, and HPD datasets across SD 1.5, 2.1, and SDXL.

citing papers explorer

Showing 3 of 3 citing papers.

  • Dual-Diffusional Generative Fashion Recommendation · cs.IR · 2026-05-17 · unverdicted · none · ref 48

  • VASR: Variance-Aware Systematic Resampling for Reward-Guided Diffusion · cs.AI · 2026-04-08 · unverdicted · none · ref 11 · 2 links

  • BalancedDPO: Adaptive Multi-Metric Alignment · cs.CV · 2025-03-16 · unverdicted · none · ref 24
