
arxiv: 2512.05707 · v2 · submitted 2025-12-05 · 💻 cs.CR

Evaluating Concept Filtering Defenses against Child Sexual Abuse Material Generation by Text-to-Image Models

Pith reviewed 2026-05-17 01:11 UTC · model grok-4.3

classification 💻 cs.CR
keywords CSAM generation · text-to-image models · concept filtering · machine learning security · fine-tuning · generative models · dataset filtering · child images

The pith

Current child filtering methods offer limited protection to closed-weight text-to-image models and none to open-weight models against CSAM generation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether removing images of children from training datasets can prevent text-to-image models from being misused to create child sexual abuse material. It frames the defense problem with a game-based security definition that accounts for attacker prompting strategies and query budgets. Experiments using an ethical proxy task of generating images of children wearing glasses show that even small amounts of residual child data allow generation with only modestly more queries than on unfiltered models. Fine-tuning further reduces this overhead and can re-introduce the child concept even after perfect filtering. The results indicate that filtering reduces the model's ability to handle child-related concepts in general while providing only partial or no real defense.

Core claim

The authors establish that detection methods cannot remove all child images from datasets, so residual examples remain available to attackers. With the child-wearing-glasses proxy, they demonstrate that prompting strategies succeed in generating the target concept using only a few more queries than on unfiltered training data, and that fine-tuning on child images further reduces the added cost. Even perfect filtering can be bypassed by subsequent fine-tuning that re-introduces the concept. These outcomes translate to limited protection for closed-weight models and no protection for open-weight models, accompanied by reduced model generality because filtering hinders or alters the representation of child-related concepts.
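The point about residual examples can be made concrete with simple bookkeeping. The sketch below is not taken from the paper: the function name, the detector recall and false-positive rate, and every number in the usage note are hypothetical placeholders chosen only to show how an imperfect detector leaves some concept images behind while also discarding unrelated data.

```python
from dataclasses import dataclass


@dataclass
class FilterOutcome:
    residual_targets: float   # concept images the detector missed
    removed_benign: float     # unrelated images removed by mistake
    residual_fraction: float  # share of the filtered dataset still showing the concept


def filter_outcome(n_total: int, prevalence: float,
                   recall: float, false_positive_rate: float) -> FilterOutcome:
    """Expected effect of an imperfect detector on a dataset (illustrative only).

    All four inputs are assumptions supplied by the caller; none of them
    are values reported in the paper.
    """
    n_target = n_total * prevalence          # images that show the concept
    n_other = n_total - n_target             # everything else
    residual = n_target * (1.0 - recall)     # missed by the detector
    removed_benign = n_other * false_positive_rate
    remaining = residual + (n_other - removed_benign)
    fraction = residual / remaining if remaining else 0.0
    return FilterOutcome(residual, removed_benign, fraction)
```

For example, with a hypothetical dataset of 1,000,000 images, 2% prevalence, 95% recall, and a 1% false-positive rate, `filter_outcome(1_000_000, 0.02, 0.95, 0.01)` leaves about 1,000 concept images in the data and removes about 9,800 unrelated ones.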

What carries the argument

The game-based security definition that models defender filtering against attacker prompting and query budgets, evaluated through the ethical proxy of generating images of a child wearing glasses.

Load-bearing premise

That the proxy task of generating images of a child wearing glasses sufficiently captures the dynamics of generating actual CSAM and that the game-based security definition accurately reflects realistic attacker capabilities and query budgets.

What would settle it

An experiment in which no sequence of prompts or fine-tuning on child images succeeds in producing child-related outputs on a model trained after complete filtering, or in which the additional query overhead remains orders of magnitude higher than on unfiltered data.
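One way to read "query overhead" quantitatively is to ask how many queries an evaluator needs before the proxy concept appears at least once with a given confidence. The sketch below assumes each query succeeds independently with a fixed probability `p`, which is a simplification introduced here and not the paper's estimator; the function names and the 0.95 default are likewise illustrative.

```python
import math


def success_within_budget(p: float, budget: int) -> float:
    """Probability of at least one success in `budget` independent queries."""
    if not 0.0 <= p <= 1.0 or budget < 0:
        raise ValueError("need 0 <= p <= 1 and budget >= 0")
    return 1.0 - (1.0 - p) ** budget


def queries_for_confidence(p: float, target: float = 0.95) -> int:
    """Smallest budget k such that 1 - (1 - p)^k >= target."""
    if not 0.0 < p < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("need 0 < p < 1 and 0 < target < 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


def overhead_ratio(p_unfiltered: float, p_filtered: float,
                   target: float = 0.95) -> float:
    """How many times more queries the filtered model requires (assumed model)."""
    return (queries_for_confidence(p_filtered, target)
            / queries_for_confidence(p_unfiltered, target))
```

Under this simplified model, a defense is meaningful only if `overhead_ratio` is large relative to the attacker's budget; a ratio of a few units corresponds to the "only a few more queries" regime described above.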

Figures

Figures reproduced from arXiv: 2512.05707 by Amro Abdalla, Ana-Maria Cretu, Carmela Troncoso, Elissa M. Redmiles, Klim Kireev, Raphael Meier, Sarah Adel Bargal, Wisdom Obinna.

Figure 1: Examples of CWG images from each experiment with raters’ median confidence that the image depicts a ...
Figure 2: Raters’ confidence that images in each experi...
Figure 3: The estimated probability of obtaining at least ...
Figure 5: Age shift in images produced by CC3M (left) and LAION-face (right) models in response to heuristic ...
Figure 6: Examples of images generated in the Sprigatito experiments (one row per experiment). Images in the ...
Figure 7: Convergence curves for unfiltered CC3M- and LAION-Face trained models, shown as CMMD scores ...
Figure 8: Examples of images produced by CC3M (left) and LAION-Face (right) models in response to prompts ...
Figure 9: Examples of images produced by CC3M (left) and LAION-Face (right) models in response to prompts ...
Figure 10: Examples of images produced by CC3M (left) and LAION-Face (right) models in response to prompts ...
Original abstract

We evaluate the effectiveness of filtering child images from training datasets of text-to-image models to prevent model misuse to create child sexual abuse material (CSAM). First, we capture the complexity of preventing CSAM generation using a game-based security definition. Second, we show that current detection methods cannot remove all children from a dataset. Third, using an ethical proxy for CSAM (a child wearing glasses), we show that even when only a small percentage of child images are left in the training dataset after filtering, there exist prompting strategies that generate a child wearing glasses using only a few more queries than when the model is trained on the unfiltered data. Fine-tuning the filtered model on child images further reduces the additional query overhead. We also show that re-introducing a concept is possible via fine-tuning even if filtering is perfect. Our results show that current child filtering methods offer limited protection to closed-weight models and no protection to open-weight models, while reducing the generality of the model by hindering the generation of child-related concepts or changing their representation. We conclude by outlining challenges in conducting evaluations that establish robust evidence on the impact of concept filtering defenses for CSAM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper evaluates the effectiveness of filtering child images from training datasets of text-to-image models to prevent CSAM generation. It introduces a game-based security definition, shows that current detection methods leave residual child images in datasets, and uses an ethical proxy task (generating images of a child wearing glasses) to demonstrate that prompting strategies can produce the proxy concept with only modestly more queries than on unfiltered models. Fine-tuning on child images further reduces query overhead, and the work concludes that filtering offers limited protection to closed-weight models and none to open-weight models while also reducing model generality for child-related concepts.

Significance. If the proxy results generalize, the findings would highlight important practical limitations of concept filtering as a defense against misuse of T2I models. The game-based security definition provides a structured threat model, and the empirical demonstration of fine-tuning recovery even under perfect filtering is a useful observation for the AI safety community.

major comments (2)
  1. [Section describing the ethical proxy and experimental results] The central claim that filtering provides only limited or no protection against CSAM rests on experiments with the proxy of generating images of a child wearing glasses. The manuscript provides no direct comparison, ablation, or analysis showing that this non-sexual child concept exhibits the same filtering resistance, prompting sensitivity, or fine-tuning recovery dynamics as explicit CSAM concepts (which involve sexual content that may engage different internal representations or safety alignments). Without such validation, the measured query overheads and protection levels do not necessarily generalize to actual CSAM.
  2. [Experimental evaluation and results sections] The reported experimental outcomes lack sufficient detail on exact models, training dataset sizes, number of trials or queries per condition, statistical significance testing, or error bars. This directly affects the ability to assess the reliability of the claims about 'a few more queries' and the differential protection levels between closed- and open-weight models.
minor comments (2)
  1. Clarify the precise attacker query budget and capabilities assumed in the game-based security definition, including any concrete examples of prompting strategies tested.
  2. Add discussion of potential limitations or failure modes of the proxy approach in the conclusion or dedicated limitations section.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive comments, which have helped clarify the scope and presentation of our results. We respond point-by-point to the major comments below, indicating revisions made to the manuscript.

point-by-point responses
  1. Referee: [Section describing the ethical proxy and experimental results] The central claim that filtering provides only limited or no protection against CSAM rests on experiments with the proxy of generating images of a child wearing glasses. The manuscript provides no direct comparison, ablation, or analysis showing that this non-sexual child concept exhibits the same filtering resistance, prompting sensitivity, or fine-tuning recovery dynamics as explicit CSAM concepts (which involve sexual content that may engage different internal representations or safety alignments). Without such validation, the measured query overheads and protection levels do not necessarily generalize to actual CSAM.

    Authors: We agree that direct validation against explicit CSAM would strengthen the work but is not feasible. The proxy was chosen to isolate the child-generation capability that underlies CSAM prompts while remaining within ethical bounds. In the revised manuscript we have added a new subsection in the Discussion that explains this rationale, references prior studies on hierarchical concept learning in diffusion models, and explicitly states that results pertain to child-concept filtering rather than claiming identical dynamics for all sexualized variants. Claims have been tempered accordingly. revision: partial

  2. Referee: [Experimental evaluation and results sections] The reported experimental outcomes lack sufficient detail on exact models, training dataset sizes, number of trials or queries per condition, statistical significance testing, or error bars. This directly affects the ability to assess the reliability of the claims about 'a few more queries' and the differential protection levels between closed- and open-weight models.

    Authors: We accept this criticism. The revised Experimental Setup and Results sections now specify the exact models (Stable Diffusion v1.5 and v2.1), pre- and post-filter dataset sizes, number of trials (30 independent runs per condition), query counts per strategy, standard-error bars, and statistical comparisons (two-sided t-tests with reported p-values). These additions directly address concerns about reliability and reproducibility. revision: yes
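The kind of reporting described in this response (standard-error bars and a two-sample t-test over repeated runs) can be sketched with the Python standard library. The code below uses Welch's unequal-variance form of the t statistic, which is a choice made here for illustration; the rebuttal does not say which variant was used, and the function names are hypothetical. Converting the statistic to a two-sided p-value needs the Student t distribution, for instance through `scipy.stats.ttest_ind(a, b, equal_var=False)` if SciPy is available.

```python
import math
import statistics
from typing import Sequence, Tuple


def standard_error(xs: Sequence[float]) -> float:
    """Standard error of the mean, using the sample standard deviation."""
    return statistics.stdev(xs) / math.sqrt(len(xs))


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Welch's t statistic and its approximate degrees of freedom.

    `a` and `b` are per-run measurements (for example, queries needed per
    run) from two conditions; each needs at least two values.
    """
    va = statistics.variance(a) / len(a)   # squared standard error of a
    vb = statistics.variance(b) / len(b)   # squared standard error of b
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

As a sanity check on made-up inputs, `welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])` gives a statistic of -1.0 with 8 degrees of freedom.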

standing simulated objections not resolved
  • Direct empirical comparison or ablation using explicit CSAM prompts or training data, which is prohibited by ethical review boards and applicable laws.

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation with independent experimental measurements

full rationale

The paper conducts an empirical study of filtering effectiveness using a game-based security definition, proxy tasks (child wearing glasses), and measurements of query overheads and fine-tuning recovery. No derivations, equations, or fitted parameters are presented as predictions that reduce to the inputs by construction. Claims rest on replicable experimental results rather than self-referential definitions or self-citation chains that bear the central load. The proxy choice and security model are stated assumptions open to external validation, not tautologies.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the representativeness of the chosen proxy and the adequacy of the security game to model real attackers; no free parameters or invented entities are introduced.

axioms (2)
  • domain assumption The proxy concept (child wearing glasses) exhibits filtering and generation behavior sufficiently similar to actual CSAM concepts for the purpose of evaluating defenses.
    Invoked to enable ethical experimentation while claiming the results generalize to CSAM prevention.
  • domain assumption The game-based security definition captures the relevant attacker model including query budget and access level.
    Used to frame the evaluation of filtering effectiveness.

pith-pipeline@v0.9.0 · 5535 in / 1406 out tokens · 42161 ms · 2026-05-17T01:11:48.715151+00:00 · methodology



Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. How to Stop Playing Whack-a-Mole: Mapping the Ecosystem of Technologies Facilitating AI-Generated Non-Consensual Intimate Images

    cs.CY 2026-02 unverdicted novelty 7.0

    The paper introduces the first comprehensive taxonomy and visualization of 11 categories of technologies facilitating AI-generated non-consensual intimate images, derived from synthesis of primary sources and demonstr...

  2. "Unlimited Realm of Exploration and Experimentation": Methods and Motivations of AI-Generated Sexual Content Creators

    cs.CY 2026-01 conditional novelty 7.0

    Interviews with 28 AIG-SC creators show motivations spanning sexual exploration, creative expression, technical experimentation, and occasional production of non-consensual intimate imagery.

  3. Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM

    cs.LG 2026-04 unverdicted novelty 6.0

    Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.

  4. The Algorithmic Gaze of Image Quality Assessment: An Audit and Trace Ethnography of the LAION-Aesthetics Predictor

    cs.HC 2026-01 conditional novelty 6.0

    LAION-Aesthetics Predictor reinforces Western and male biases by preferentially selecting images associated with women and realistic Western/Japanese art while excluding men, LGBTQ+ references, and other styles.


    Yinglin Zheng, Hao Yang, Ting Zhang, Jianmin Bao, Dongdong Chen, Yangyu Huang, Lu Yuan, Dong Chen, Ming Zeng, and Fang Wen. General facial representation learning in a visual-linguistic manner. InCVPR, 2022. 20 Appendix A Ethics considerations Child detection benchmarking.To identify the best child detector, we have adapted existing methods to the child d...