Child safety necessitates new approaches to AI safety

Neil Kale*, Rebecca Portnoff*, Pratiksha Thaker, Michael Simpson, Robertson Wang, Kevin Kuo, Chhavi Yadav, Virginia Smith · 2026

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM

cs.LG · 2026-04-28 · unverdicted · novelty 6.0

Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.

citing papers explorer


Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM

cs.LG · 2026-04-28 · unverdicted · none · ref 23
