Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.
Stable bias: Evaluating societal representations in diffusion models.Advances in Neural Information Processing Systems, 36:56338–56351
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM
Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.