Pick-a-pic: An open dataset of user preferences for text-to-image generation

URL https://arxiv · 2023 · arXiv 2305.01569

Baseline reference. 50% of citing Pith papers use this work as a benchmark or comparison.

14 Pith papers citing it

Baseline 50% of classified citations

citation-role summary

dataset 3 · background 2 · other 2 · baseline 1

citation-polarity summary

use dataset 3 · background 2 · unclear 2 · baseline 1

representative citing papers

Functionalization via Structure Completion and Motion Rectification

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

Object functionalization is cast as neural graph completion over a functional graph of parts, contacts, and motions, followed by geometry realization that also rectifies erroneous motions, demonstrated on furniture with a new paired dataset.

Improved techniques for fine-tuning flow models via adjoint matching: a deterministic control pipeline

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

A new adjoint matching framework formulates flow model alignment as optimal control, enabling direct regression training and terminal-trajectory truncation for efficiency gains on models like SiT-XL and FLUX.

Oracle Noise: Faster Semantic Spherical Alignment for Interpretable Latent Optimization

cs.CV · 2026-04-26 · unverdicted · novelty 7.0

Oracle Noise optimizes diffusion model noise on a Riemannian hypersphere guided by key prompt words to preserve the Gaussian prior, eliminate norm inflation, and achieve faster semantic alignment than Euclidean methods.

$Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models

cs.CV · 2026-04-26 · unverdicted · novelty 7.0

Z²-Sampling implicitly realizes zero-cost zigzag trajectories for curvature-aware semantic alignment in diffusion models by reducing multi-step paths via operator dualities and temporal caching while synthesizing a directional derivative penalty.

Setting-Matched and Semantics-Scaled Benchmarking of One-Step Generative Models Against Multistep Diffusion and Flow Models

cs.CV · 2026-03-15 · unverdicted · novelty 7.0

Matched benchmarking reveals FID misleads in few-step regimes under CFG, prompting CLIP-scaled and PickScore-scaled FID and IS variants for better semantic evaluation of one-step image generators.

Delta Score Matters! Spatial Adaptive Multi Guidance in Diffusion Models

cs.CV · 2026-04-29 · unverdicted · novelty 6.0

SAMG uses spatially adaptive guidance scales derived from a geometric analysis of classifier-free guidance to resolve the detail-artifact dilemma in diffusion-based image and video generation.

MaskAttn-SDXL: Controllable Region-Level Text-To-Image Generation

cs.CV · 2025-09-18 · unverdicted · novelty 6.0

MaskAttn-SDXL adds token-conditioned spatial gating to SDXL cross-attention to sparsify irrelevant token-to-location bindings and improve region-level controllability without retraining or inference edits.

Directly Fine-Tuning Diffusion Models on Differentiable Rewards

cs.CV · 2023-09-29 · conditional · novelty 6.0

DRaFT fine-tunes diffusion models by differentiating through sampling to maximize rewards, outperforming RL baselines and improving aesthetics on Stable Diffusion 1.4.

SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

cs.CV · 2023-07-04 · conditional · novelty 6.0

SDXL improves upon prior Stable Diffusion versions through a larger UNet backbone, dual text encoders, novel conditioning, and a refinement model, producing higher-fidelity images competitive with black-box state-of-the-art generators.

Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

cs.CV · 2023-06-15 · conditional · novelty 6.0

HPD v2 is the largest human preference dataset for text-to-image generation, with 798k choices, and HPS v2 is the resulting CLIP-based scorer that better predicts human judgments and responds to model improvements.

Towards General Preference Alignment: Diffusion Models at Nash Equilibrium

cs.LG · 2026-05-06 · unverdicted · novelty 5.0

Diff.-NPO frames diffusion alignment as a self-play game reaching Nash equilibrium and reports better text-to-image results than prior DPO-style methods.

ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance

cs.CV · 2026-04-29 · unverdicted · novelty 5.0

ACPO uses anchor-based regularization with NR-IQA guidance to enable stable perceptual quality improvements in diffusion model fine-tuning.

Evaluating AI-Generated Images of Cultural Artifacts with Community-Informed Rubrics

cs.CY · 2026-04-02 · unverdicted · novelty 5.0 · 2 refs

Case studies with blind UK residents and people from Kerala and Tamil Nadu demonstrate that community input at the systematization stage produces culturally grounded definitions of appropriateness for text-to-image model outputs.

Alignment and Safety of Diffusion Models via Reinforcement Learning and Reward Modeling: A Survey

cs.CV · 2025-05-23 · accept · novelty 4.0

A literature survey that organizes diffusion model alignment methods along five axes (feedback source, reward form, optimization mechanism, distribution shift handling, and explicit safety constraints) and identifies open challenges for reliable deployment.

citing papers explorer

Showing 14 of 14 citing papers.

Functionalization via Structure Completion and Motion Rectification cs.CV · 2026-05-18 · unverdicted · none · ref 266
Improved techniques for fine-tuning flow models via adjoint matching: a deterministic control pipeline cs.AI · 2026-05-07 · unverdicted · none · ref 16
Oracle Noise: Faster Semantic Spherical Alignment for Interpretable Latent Optimization cs.CV · 2026-04-26 · unverdicted · none · ref 20
$Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models cs.CV · 2026-04-26 · unverdicted · none · ref 18
Setting-Matched and Semantics-Scaled Benchmarking of One-Step Generative Models Against Multistep Diffusion and Flow Models cs.CV · 2026-03-15 · unverdicted · none · ref 11
Delta Score Matters! Spatial Adaptive Multi Guidance in Diffusion Models cs.CV · 2026-04-29 · unverdicted · none · ref 12
MaskAttn-SDXL: Controllable Region-Level Text-To-Image Generation cs.CV · 2025-09-18 · unverdicted · none · ref 21
Directly Fine-Tuning Diffusion Models on Differentiable Rewards cs.CV · 2023-09-29 · conditional · none · ref 11
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis cs.CV · 2023-07-04 · conditional · none · ref 23
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis cs.CV · 2023-06-15 · conditional · none · ref 10
Towards General Preference Alignment: Diffusion Models at Nash Equilibrium cs.LG · 2026-05-06 · unverdicted · none · ref 14
ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance cs.CV · 2026-04-29 · unverdicted · none · ref 8
Evaluating AI-Generated Images of Cultural Artifacts with Community-Informed Rubrics cs.CY · 2026-04-02 · unverdicted · none · ref 68 · 2 links
Alignment and Safety of Diffusion Models via Reinforcement Learning and Reward Modeling: A Survey cs.CV · 2025-05-23 · accept · none · ref 8
