SurrogateSHAP: Training-Free Contributor Attribution for Text-to-Image (T2I) Models

Chanwoo Kim; Chris Lin; Mingyu Lu; Soham Gadgil; Su-In Lee

arxiv: 2601.22276 · v2 · pith:ZOBOXVEXnew · submitted 2026-01-29 · 💻 cs.LG · cs.CV


keywords: data, surrogateshap, across, attribution, contributors, model, models, computational


As Text-to-Image (T2I) diffusion models are increasingly used in real-world creative workflows, a principled framework for valuing contributors who provide a collection of data is essential for fair compensation and sustainable data marketplaces. While the Shapley value offers a theoretically grounded approach to attribution, it faces a dual computational bottleneck: (i) the prohibitive cost of exhaustive model retraining for each sampled subset of players (i.e., data contributors) and (ii) the combinatorial number of subsets needed to estimate marginal contributions due to contributor interactions. To this end, we propose SurrogateSHAP, a retraining-free framework that approximates the expensive retraining game through inference from a pretrained model. To further improve efficiency, we employ a gradient-boosted tree to approximate the utility function and derive Shapley values analytically from the tree-based model. We evaluate SurrogateSHAP across three diverse attribution tasks: (i) image quality for DDPM-CFG on CIFAR-20, (ii) aesthetics for Stable Diffusion on Post-Impressionist artworks, and (iii) product diversity for FLUX.1 on Fashion-Product data. Across settings, SurrogateSHAP outperforms prior methods while substantially reducing computational overhead, consistently identifying influential contributors across multiple utility metrics. Finally, we demonstrate that SurrogateSHAP effectively localizes data sources responsible for spurious correlations in clinical images, providing a scalable path toward auditing safety-critical generative models.
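To make the second bottleneck in the abstract concrete, the following is a minimal sketch of exact Shapley value computation over a small set of contributors. It is not the SurrogateSHAP method: the utility function `toy_utility`, the contributor names, and the scores in `BASE` are hypothetical stand-ins for a real metric (such as image quality) measured on a model built from a subset of contributors. The point is only that exact attribution needs a utility evaluation for every coalition, which is what a retraining-free proxy and a tree-based surrogate are meant to avoid.

```python
from itertools import combinations
from math import factorial


def exact_shapley(players, utility):
    """Exact Shapley values by enumerating every coalition of the other players."""
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when added to coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Hypothetical per-contributor scores (illustrative only, not from the paper).
BASE = {"A": 3.0, "B": 1.0, "C": 0.5}


def toy_utility(coalition):
    """Toy utility: additive scores plus an interaction bonus between A and B."""
    score = sum(BASE[c] for c in coalition)
    if "A" in coalition and "B" in coalition:
        score += 2.0  # contributor interaction term
    return score


phi = exact_shapley(list(BASE), toy_utility)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

With three contributors the loop touches only a handful of coalitions, but the count grows as 2^n, and in the retraining game each utility call would mean training a model. In this toy game the interaction bonus is split evenly between A and B, and the values sum to the utility of the full coalition.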


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SafeDiffusion-R1: Online Reward Steering for Safe Diffusion Post-Training
cs.CV 2026-05 unverdicted novelty 6.0

SafeDiffusion-R1 uses online GRPO with CLIP embedding steering to cut inappropriate content from 48.9% to 18.07% and nudity detections from 646 to 15 in diffusion models while raising GenEval scores from 42.08% to 47....