CSA-Graphs: A Privacy-Preserving Structural Dataset for Child Sexual Abuse Research

Artur Barros; Camila Laranjeira; Carlos Caetano; Clara Ernesto; Jefersson A. dos Santos; Jo\~ao Macedo; Leo S. F. Ribeiro; Sandra Avila

arxiv: 2604.07132 · v1 · submitted 2026-04-08 · 💻 cs.CV · cs.AI· cs.LG

CSA-Graphs: A Privacy-Preserving Structural Dataset for Child Sexual Abuse Research

Carlos Caetano , Camila Laranjeira , Clara Ernesto , Artur Barros , Jo\~ao Macedo , Leo S. F. Ribeiro , Jefersson A. dos Santos , Sandra Avila This is my paper

Pith reviewed 2026-05-10 17:57 UTC · model grok-4.3

classification 💻 cs.CV cs.AIcs.LG

keywords CSAI classificationprivacy-preserving datasetscene graphsskeleton graphschild safetycomputer visiongraph-based representations

0 comments

The pith

A dataset of scene and skeleton graphs lets researchers classify child sexual abuse imagery without releasing any original images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Legal and ethical rules block the public release of real CSAI image datasets, which stops computer vision researchers from building and testing detection tools. The paper introduces CSA-Graphs as a replacement that supplies only structural graphs instead of pictures. One set of graphs records object relationships in each scene, while the other records human body poses. Experiments show that models can still identify CSAI from these graphs alone and that merging both types of graphs raises accuracy further.

Core claim

CSA-Graphs replaces original images with two graph modalities: scene graphs that describe relationships among objects and skeleton graphs that encode human poses. Both representations retain enough information for machine learning models to classify CSAI, and their combination improves performance over either modality used separately. This structural release removes all explicit visual content while supporting reproducible research on child safety applications.

What carries the argument

CSA-Graphs dataset of scene graphs for object relationships paired with skeleton graphs for human pose, used as input representations for CSAI classification models.

If this is right

Researchers can now train and evaluate CSAI classifiers using only publicly released graph data.
Combining scene graphs with skeleton graphs produces higher classification accuracy than either graph type alone.
Computer vision work on child safety tools can advance without violating rules against sharing CSAI images.
Graph-based representations become validated as practical substitutes for raw images in restricted visual domains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same graph substitution method could be tested on other restricted image tasks such as medical or violent content classification.
If the graphs truly block reconstruction, they could serve as a template for privacy standards when releasing other sensitive visual datasets.
New graph neural network designs could be developed and benchmarked directly on the released scene and skeleton structures.

Load-bearing premise

The scene and skeleton graphs preserve enough information to classify CSAI correctly while completely removing explicit visual content and preventing reconstruction of the original images.

What would settle it

A classifier trained only on CSA-Graphs data achieves no better than chance accuracy on a held-out test set of CSAI versus non-CSAI examples.

Figures

Figures reproduced from arXiv: 2604.07132 by Artur Barros, Camila Laranjeira, Carlos Caetano, Clara Ernesto, Jefersson A. dos Santos, Jo\~ao Macedo, Leo S. F. Ribeiro, Sandra Avila.

**Figure 2.** Figure 2: Most frequent scene graph elements in CSA-Graphs. [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗

**Figure 3.** Figure 3: Skeleton pose keypoint detection rate in CSA-Graphs. [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 4.** Figure 4: Qualitative examples illustrating complementary nature [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

**Figure 6.** Figure 6: Skeleton pose keypoint statistics in CSA-Graphs. [PITH_FULL_IMAGE:figures/full_fig_p012_6.png] view at source ↗

**Figure 5.** Figure 5: Examples of skeleton pose estimation limitations in [PITH_FULL_IMAGE:figures/full_fig_p012_5.png] view at source ↗

**Figure 7.** Figure 7: Confusion matrices of the baseline models evaluated in CSA-Graphs. [PITH_FULL_IMAGE:figures/full_fig_p013_7.png] view at source ↗

read the original abstract

Child Sexual Abuse Imagery (CSAI) classification is an important yet challenging problem for computer vision research due to the strict legal and ethical restrictions that prevent the public sharing of CSAI datasets. This limitation hinders reproducibility and slows progress in developing automated methods. In this work, we introduce CSA-Graphs, a privacy-preserving structural dataset. Instead of releasing the original images, we provide structural representations that remove explicit visual content while preserving contextual information. CSA-Graphs includes two complementary graph-based modalities: scene graphs describing object relationships and skeleton graphs encoding human pose. Experiments show that both representations retain useful information for classifying CSAI, and that combining them further improves performance. This dataset enables broader research on computer vision methods for child safety while respecting legal and ethical constraints.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces CSA-Graphs, a privacy-preserving structural dataset for Child Sexual Abuse Imagery (CSAI) classification research. Rather than releasing original images, the authors provide two graph-based modalities: scene graphs capturing object relationships and skeleton graphs encoding human pose. The central claim is that these representations retain discriminative information sufficient for CSAI classification, with experiments showing that each modality is useful on its own and that their combination yields further performance gains.

Significance. If the retained information and privacy properties hold, the dataset would address a major reproducibility barrier in a legally restricted domain by enabling structural analysis without explicit imagery. The dual-modality design is a reasonable attempt to capture both contextual and pose-based signals. However, the significance is limited by the absence of any rigorous privacy evaluation, which is central to the paper's premise.

major comments (2)

[Dataset Construction] The privacy-preserving claim rests on the assertion that scene and skeleton graphs remove explicit visual content with no feasible reconstruction path, yet the manuscript provides no adversarial robustness analysis, differential privacy bounds, or empirical tests against modern pose-to-image or graph-conditioned generative models. This is load-bearing for the dataset's stated purpose.
[Experiments] The experiments section asserts that both modalities retain useful information for CSAI classification and that combining them improves performance, but reports no concrete metrics (accuracy, F1, AUC), baselines, dataset sizes, train/test splits, or error bars. Without these details the performance claims cannot be evaluated.

minor comments (1)

[Abstract] The abstract would benefit from a single sentence stating the total number of graphs or source images to convey dataset scale.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important areas for strengthening the privacy claims and experimental reporting. We address each major comment below and will revise the manuscript to incorporate the suggestions.

read point-by-point responses

Referee: [Dataset Construction] The privacy-preserving claim rests on the assertion that scene and skeleton graphs remove explicit visual content with no feasible reconstruction path, yet the manuscript provides no adversarial robustness analysis, differential privacy bounds, or empirical tests against modern pose-to-image or graph-conditioned generative models. This is load-bearing for the dataset's stated purpose.

Authors: We agree that the privacy evaluation is central and that the current manuscript relies primarily on the inherent abstraction of the graph representations rather than formal analysis. Scene graphs and skeleton graphs discard pixel-level details, making direct image reconstruction infeasible without external generative models and additional assumptions about the original scene. However, we did not include adversarial robustness tests or differential privacy bounds. In the revised manuscript, we will add a dedicated privacy discussion subsection that (1) elaborates on why reconstruction from these modalities is limited, (2) references existing work on the challenges of graph-conditioned image synthesis, and (3) explicitly states the absence of empirical adversarial evaluation as a limitation while outlining directions for future work. This provides a more balanced and transparent treatment of the claim. revision: partial
Referee: [Experiments] The experiments section asserts that both modalities retain useful information for CSAI classification and that combining them improves performance, but reports no concrete metrics (accuracy, F1, AUC), baselines, dataset sizes, train/test splits, or error bars. Without these details the performance claims cannot be evaluated.

Authors: We acknowledge that the experimental reporting in the submitted version is insufficiently detailed, making the performance claims difficult to assess. The manuscript describes that both modalities are useful and that their combination yields gains, but does not present the supporting quantitative results. In the revision, we will expand the experiments section to include a table with concrete metrics (accuracy, F1, AUC), explicit baselines (including single-modality and random baselines), dataset sizes, train/test split ratios, and error bars computed over multiple runs. This will allow readers to fully evaluate the claims. revision: yes

Circularity Check

0 steps flagged

No circularity: dataset introduction with no derivation chain or fitted predictions

full rationale

The paper introduces CSA-Graphs as a structural dataset using scene graphs and skeleton graphs extracted from CSAI images. It reports classification experiments showing retained discriminative information but presents no equations, models, or first-principles derivations. Claims about privacy preservation follow directly from the graph construction process (removing pixel data) rather than any reduction to prior fitted parameters or self-citations. No load-bearing steps match the enumerated circularity patterns; the work is self-contained as an empirical dataset release.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Dataset creation paper with no mathematical derivations, fitted parameters, or new postulated entities. All content is based on the abstract alone.

pith-pipeline@v0.9.0 · 5462 in / 1110 out tokens · 37578 ms · 2026-05-10T17:57:00.789830+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages

[1]

Data-free class-incremental hand gesture recognition

Shubhra Aich, Jesus Ruiz-Santaquiteria, Zhenyu Lu, Prachi Garg, K J Joseph, Alvaro Fernandez Garcia, Vineeth N Bal- asubramanian, Kenrick Kin, Chengde Wan, Necati Cihan Camgoz, Shugao Ma, and Fernando De la Torre. Data-free class-incremental hand gesture recognition. In IEEE/CVF International Conference on Computer Vision (ICCV), 2023. 2, 5

work page 2023
[2]

Classification of static poses based on key point detection for application of incriminated im- age files

Schönbrodt Antonia. Classification of static poses based on key point detection for application of incriminated im- age files. In Working Conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow,

work page
[3]

Australian Government. Criminal code act 1995 (cth), divi- sion 474: Telecommunications offences.https://publ icdefenders.nsw.gov.au/documents/sentenc ing-tables-index/sexual-offences/s474- 22- cth- code- use- carriage- service- for- child-abuse-material.pdf, 1995. Accessed March 7, 2026. 1

work page 1995
[4]

dos Santos, and Sandra Avila

Artur Barros, Carlos Caetano, João Macedo, Jefersson A. dos Santos, and Sandra Avila. Attention over scene graphs: Indoor scene representations toward csai classification. In British Machine Vision Conference Workshops (BMVCW),

work page
[5]

How attentive are graph attention networks? In International Conference on Learning Representations (ICLR), 2022

Shaked Brody, Uri Alon, and Eran Yahav. How attentive are graph attention networks? In International Conference on Learning Representations (ICLR), 2022. 6

work page 2022
[6]

dos Santos, and Sandra Avila

Carlos Caetano, Leo Sampaio Ferraz Ribeiro, Camila Laran- jeira, Gabriel Oliveira dos Santos, Artur Barros, Caio Petrucci, Andreza Aparecida dos Santos, João Macedo, Gil Carvalho, Fabricio Benevenuto, Jefersson A. dos Santos, and Sandra Avila. Mastering scene understanding: Scene graphs to the rescue. In Conference on Graphics, Patterns and Images (SIBGRA...

work page 2024
[7]

Realtime multi-person 2d pose estimation using part affinity fields

Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Realtime multi-person 2d pose estimation using part affinity fields. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 2, 5

work page 2017
[8]

Xgboost: A scalable tree boosting system

Tianqi Chen and Carlos Guestrin. Xgboost: A scalable tree boosting system. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,

work page
[9]

Transformers-based few-shot learn- ing for scene classification in child sexual abuse imagery

Thamiris Coelho, Leo Ribeiro, João Macedo, Jefersson San- tos, and Sandra Avila. Transformers-based few-shot learn- ing for scene classification in child sexual abuse imagery. In Conference on Graphics, Patterns and Images, 2024. 1, 2, 3, 4

work page 2024
[10]

dos Santos, and Sandra Avila

Thamiris Coelho, Leo Sampaio Ferraz Ribeiro, João Macedo, Jefersson A. dos Santos, and Sandra Avila. Min- imizing risk through minimizing model-data interaction: A protocol for relying on proxy tasks when designing child sex- ual abuse imagery detection models. In ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2025. 1, 2, 3, 4

work page 2025
[11]

Laying foundations for ef- fective machine learning in law enforcement

Janis Dalins, Yuriy Tyshetskiy, Campbell Wilson, Mark J Carman, and Douglas Boudry. Laying foundations for ef- fective machine learning in law enforcement. majura–a la- belling schema for child exploitation materials. Digital Investigation, 2018. 3

work page 2018
[12]

Nudetective: A forensic tool to help combat child pornography through automatic nudity detection

Mateus de Castro Polastro and Pedro Monteiro da Silva Eleu- terio. Nudetective: A forensic tool to help combat child pornography through automatic nudity detection. In Workshops on Database and Expert Systems Applications,

work page
[13]

European Parliament and Council of the European Union. Directive 2011/93/eu on combating the sexual abuse and sex- ual exploitation of children and child pornography.https: //eur-lex.europa.eu/legal-content/EN/TXT /?uri=CELEX:32011L0093, 2011. Official Journal of the European Union, L 335, 17 December 2011. 1, 4

work page 2011
[14]

Pornography and child sexual abuse detection in image and video: A comparative evaluation

Abhishek Gangwar, Eduardo Fidalgo, Enrique Alegre, and Víctor González-Castro. Pornography and child sexual abuse detection in image and video: A comparative evaluation. In International Conference on Imaging for Crime Detection and Prevention (ICDP), 2017. 3

work page 2017
[15]

Attm-cnn: Attention and met- ric learning based cnn for pornography, age and child sexual abuse (csa) detection in images

Abhishek Gangwar, Víctor González-Castro, Enrique Ale- gre, and Eduardo Fidalgo. Attm-cnn: Attention and met- ric learning based cnn for pornography, age and child sexual abuse (csa) detection in images. Neurocomputing, 2021. 1, 2, 3

work page 2021
[16]

Susanna Greijer, Jaap Doek, and Interagency Working Group. Terminology guidelines for the protection of chil- dren from sexual exploitation and sexual abuse.https: //www.ecpat.org/wp-content/uploads/2016 /12/Terminology-guidelines_ENG.pdf, 2016. Adopted by the Interagency Working Group in Luxembourg. 1

work page 2016
[17]

Ex- ploiting scene graphs for human-object interaction detection

Tao He, Lianli Gao, Jingkuan Song, and Yuan-Fang Li. Ex- ploiting scene graphs for human-object interaction detection. IEEE/CVF International Conference on Computer Vision (ICCV), 2021. 2

work page 2021
[18]

Shamma, Michael S

Justin Johnson, Ranjay Krishna, Michael Stark, Li Jia Li, David A. Shamma, Michael S. Bernstein, and Fei Fei Li. Im- age retrieval using scene graphs. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2015. 4

work page 2015
[19]

Adaptive visual scene understanding: incremental scene graph gen- eration

Naitik Khandelwal, Xiao Liu, and Mengmi Zhang. Adaptive visual scene understanding: incremental scene graph gen- eration. In International Conference on Neural Information Processing Systems (Neurips), 2024. 2

work page 2024
[20]

The chal- lenges of identifying and classifying child sexual abuse ma- terial

Juliane A Kloess, Jessica Woodhams, Helen Whittle, Tim Grant, and Catherine E Hamilton-Giachritsis. The chal- lenges of identifying and classifying child sexual abuse ma- terial. Sexual Abuse, 2019. 4, 5

work page 2019
[21]

The challenges of identifying and classifying child sexual exploitation material: Moving to- wards a more ecologically valid pilot study with digital forensics analysts

Juliane A Kloess, Jessica Woodhams, and Catherine E Hamilton-Giachritsis. The challenges of identifying and classifying child sexual exploitation material: Moving to- wards a more ecologically valid pilot study with digital forensics analysts. Child Abuse & Neglect, 2021. 4

work page 2021
[22]

Visual genome: Connecting language and vision using crowdsourced dense image annotations

Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalan- tidis, Li-Jia Li, David A Shamma, et al. Visual genome: Connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision (IJCV), 2017. 4

work page 2017
[23]

Seeing without looking: Analysis pipeline for child sexual abuse datasets

Camila Laranjeira, Joã Macedo, Sandra Avila, and Jefersson dos Santos. Seeing without looking: Analysis pipeline for child sexual abuse datasets. InACM Conference on Fairness, Accountability, and Transparency (FAccT), 2022. 1, 3, 4

work page 2022
[24]

dos Santos

Camila Laranjeira, João Macedo, Sandra Avila, Fabrício Benevenuto, and Jefersson A. dos Santos. Human-centric perception for child sexual abuse imagery, 2026. 3, 4

work page 2026
[25]

Detecting child sexual abuse material: A com- prehensive survey

Hee-Eun Lee, Tatiana Ermakova, Vasilis Ververis, and Ben- jamin Fabian. Detecting child sexual abuse material: A com- prehensive survey. Forensic Science International: Digital Investigation, 34, 2020. 1

work page 2020
[26]

Baiyi Li, Edmond S. L. Ho, Hubert P. H. Shum, and He Wang. Two-person interaction augmentation with skeleton priors. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024. 2, 5

work page 2024
[27]

Scene graph generation: A comprehensive survey

Hongsheng Li, Guangming Zhu, Liang Zhang, You- liang Jiang, Yixuan Dang, Haoran Hou, Peiyi Shen, Xia Zhao, Syed Afaq Ali Shah, and Mohammed Ben- namoun. Scene graph generation: A comprehensive survey. Neurocomputing, 2024. 2

work page 2024
[28]

From pixels to graphs: Open-vocabulary scene graph generation with vision-language models

Rongjie Li, Songyang Zhang, Dahua Lin, Kai Chen, and Xuming He. From pixels to graphs: Open-vocabulary scene graph generation with vision-language models. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 4

work page 2024
[29]

Lawrence Zitnick

Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft coco: Common objects in context. In IEEE/CVF European Conference on Computer Vision (ECCV), 2014. 5

work page 2014
[30]

dos Santos

Joã Macedo, Filipe Costa, and Jefersson A. dos Santos. A benchmark methodology for child pornography detection. In Conference on Graphics, Patterns and Images (SIBGRAPI),

work page
[31]

João Macedo, Camila Laranjeira, Leo S. F. Ribeiro, Car- los Caetano, Fabricio Benevenuto, Sandra Avila, and Jefers- son A. dos Santos. Child sexual abuse datasets: A systematic review. Research Square, 2025. Preprint, Version 1. 1, 2, 3

work page 2025
[32]

Open sourcing a deep learning solution for detecting nsfw images.https://ya hooeng.tumblr.com/post/151148689421/op en- sourcing- a- deep- learning- solution- for, 2016

Jay Mahadeokar and Gerry Pesavento. Open sourcing a deep learning solution for detecting nsfw images.https://ya hooeng.tumblr.com/post/151148689421/op en- sourcing- a- deep- learning- solution- for, 2016. Accessed: 2025-01-12. 3

work page arXiv 2016
[33]

Photodna — fighting the harmful content problem.https://www.microsoft.com/en- us/photodna, 2017

Microsoft Inc. Photodna — fighting the harmful content problem.https://www.microsoft.com/en- us/photodna, 2017. 1

work page 2017
[34]

Cyber- tipline report 2024.https://www.missingkids.or g/gethelpnow/cybertipline/cybertiplineda ta, 2025

National Center for Missing & Exploited Children. Cyber- tipline report 2024.https://www.missingkids.or g/gethelpnow/cybertipline/cybertiplineda ta, 2025. Accessed March 6, 2026. 1

work page 2024
[35]

Pose2room: Understanding 3d scenes from human activities

Yinyu Nie, Angela Dai, Xiaoguang Han, and Matthias Nießner. Pose2room: Understanding 3d scenes from human activities. In IEEE/CVF European Conference on Computer Vision (ECCV), 2022. 2

work page 2022
[36]

Using expert-reviewed csam to train cnns and its anthropological analysis

Wojciech Oronowicz-Ja ´skowiak, Tomasz Kozłowski, Marta Pola´nska, Jerzy Wojciechowski, Piotr Wasilewski, Dominik ´Sl˛ ezak, and Mirosław Kowaluk. Using expert-reviewed csam to train cnns and its anthropological analysis. Journal of Forensic and Legal Medicine, 2024. 2, 3

work page 2024
[37]

icop: Live forensics to re- veal previously unknown criminal media on p2p networks

Claudia Peersman, Christian Schulze, Awais Rashid, Mar- garet Brennan, and Carl Fischer. icop: Live forensics to re- veal previously unknown criminal media on p2p networks. Digital Investigation, 2016. 2, 3

work page 2016
[38]

Lei nº 11.829, de 25 de novembro de 2008.http://www.planalto.gov.b r/ccivil_03/_Ato2007-2010/2008/Lei/L118 29.htm, 2008

Presidência da República do Brasil. Lei nº 11.829, de 25 de novembro de 2008.http://www.planalto.gov.b r/ccivil_03/_Ato2007-2010/2008/Lei/L118 29.htm, 2008. Accessed March 6, 2026. 1, 4

work page 2008
[39]

Action scene graphs for long- form understanding of egocentric videos

Ivan Rodin, Antonino Furnari, Kyle Min, Subarna Tripathi, and Giovanni Maria Farinella. Action scene graphs for long- form understanding of egocentric videos. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2

work page 2024
[40]

A deep learning framework for find- ing illicit images/videos of children

Jared Rondeau, Douglas Deslauriers, Thomas Howard III, and Marco Alvarez. A deep learning framework for find- ing illicit images/videos of children. Machine Vision and Applications, 2022. 1, 2, 3

work page 2022
[41]

Towards automatic detection of child pornogra- phy

Napa Sae-Bae, Xiaoxi Sun, Husrev T Sencar, and Nasir D Memon. Towards automatic detection of child pornogra- phy. In IEEE International Conference on Image Processing (ICIP), pages 5332–5336, 2014. 2

work page 2014
[42]

Yolo26: Key architectural enhancements and performance bench- marking for real-time object detection

Ranjan Sapkota, Rahul Harsha Cheppally, Ajay Sharda, and Manoj Karkee. Yolo26: Key architectural enhancements and performance benchmarking for real-time object detec- tion. arXiv 2509.25164, 2026.https://github.com /ultralytics/ultralytics. 5

work page arXiv 2026
[43]

Automatic detection of csa media by multi-modal feature fusion for law enforcement support

Christian Schulze, Dominik Henter, Damian Borth, and Andreas Dengel. Automatic detection of csa media by multi-modal feature fusion for law enforcement support. In International Conference on Multimedia Retrieval (ICMR),

work page
[44]

Porno- graphic content classification using deep-learning

André Tabone, Kenneth Camilleri, Alexandra Bonnici, Ste- fania Cristina, Reuben Farrugia, and Mark Borg. Porno- graphic content classification using deep-learning. In ACM Symposium on Document Engineering, 2021. 3

work page 2021
[45]

Automatic detection of child pornography using color visual words

Adrian Ulges and Armin Stahl. Automatic detection of child pornography using color visual words. In IEEE international conference on multimedia and expo, 2011. 2, 3

work page 2011
[46]

National strategy for child exploitation prevention and interdiction: Child sexual abuse material.https://www.justice.gov/d9/2 023-06/child_sexual_abuse_material_2.p df, 2023

United States Department of Justice. National strategy for child exploitation prevention and interdiction: Child sexual abuse material.https://www.justice.gov/d9/2 023-06/child_sexual_abuse_material_2.p df, 2023. Accessed March 6, 2026. 1, 4

work page 2023
[47]

Valois, João Macedo, Leo S.F

Pedro H.V . Valois, João Macedo, Leo S.F. Ribeiro, Jefers- son A. dos Santos, and Sandra Avila. Leveraging self- supervised learning for scene classification in child sex- ual abuse imagery. Forensic Science International: Digital Investigation, 2025. 1, 2, 3, 4

work page 2025
[48]

Leveraging deep neural networks to fight child pornography in the age of social media

Paulo Vitorino, Sandra Avila, Mauricio Perez, and Ander- son Rocha. Leveraging deep neural networks to fight child pornography in the age of social media. Journal of Visual Communication and Image Representation, 2018. 3

work page 2018
[49]

Heterogeneous Skeleton-Based Action Representation Learning

Hongsong Wang, Xiaoyan Ma, Jidong Kuang, and Jie Gui. Heterogeneous Skeleton-Based Action Representation Learning . In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. 5

work page 2025
[50]

On the detection of images containing child- pornographic material

Emilios Yiallourou, Rafaella Demetriou, and Andreas Lanitis. On the detection of images containing child- pornographic material. In IEEE International Conference on Telecommunications (ICT), 2017. 5

work page 2017
[51]

Deep learning-based human pose estimation: A survey

Ce Zheng, Wenhan Wu, Chen Chen, Taojiannan Yang, Sijie Zhu, Ju Shen, Nasser Kehtarnavaz, and Mubarak Shah. Deep learning-based human pose estimation: A survey. ACM Computing Surveys, 2023. 2 CSA-Graphs: A Privacy-Preserving Structural Dataset for Child Sexual Abuse Research Supplementary Material This supplementary material provides additional analy- ses ...

work page 2023

[1] [1]

Data-free class-incremental hand gesture recognition

Shubhra Aich, Jesus Ruiz-Santaquiteria, Zhenyu Lu, Prachi Garg, K J Joseph, Alvaro Fernandez Garcia, Vineeth N Bal- asubramanian, Kenrick Kin, Chengde Wan, Necati Cihan Camgoz, Shugao Ma, and Fernando De la Torre. Data-free class-incremental hand gesture recognition. In IEEE/CVF International Conference on Computer Vision (ICCV), 2023. 2, 5

work page 2023

[2] [2]

Classification of static poses based on key point detection for application of incriminated im- age files

Schönbrodt Antonia. Classification of static poses based on key point detection for application of incriminated im- age files. In Working Conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow,

work page

[3] [3]

Australian Government. Criminal code act 1995 (cth), divi- sion 474: Telecommunications offences.https://publ icdefenders.nsw.gov.au/documents/sentenc ing-tables-index/sexual-offences/s474- 22- cth- code- use- carriage- service- for- child-abuse-material.pdf, 1995. Accessed March 7, 2026. 1

work page 1995

[4] [4]

dos Santos, and Sandra Avila

Artur Barros, Carlos Caetano, João Macedo, Jefersson A. dos Santos, and Sandra Avila. Attention over scene graphs: Indoor scene representations toward csai classification. In British Machine Vision Conference Workshops (BMVCW),

work page

[5] [5]

How attentive are graph attention networks? In International Conference on Learning Representations (ICLR), 2022

Shaked Brody, Uri Alon, and Eran Yahav. How attentive are graph attention networks? In International Conference on Learning Representations (ICLR), 2022. 6

work page 2022

[6] [6]

dos Santos, and Sandra Avila

Carlos Caetano, Leo Sampaio Ferraz Ribeiro, Camila Laran- jeira, Gabriel Oliveira dos Santos, Artur Barros, Caio Petrucci, Andreza Aparecida dos Santos, João Macedo, Gil Carvalho, Fabricio Benevenuto, Jefersson A. dos Santos, and Sandra Avila. Mastering scene understanding: Scene graphs to the rescue. In Conference on Graphics, Patterns and Images (SIBGRA...

work page 2024

[7] [7]

Realtime multi-person 2d pose estimation using part affinity fields

Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Realtime multi-person 2d pose estimation using part affinity fields. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 2, 5

work page 2017

[8] [8]

Xgboost: A scalable tree boosting system

Tianqi Chen and Carlos Guestrin. Xgboost: A scalable tree boosting system. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,

work page

[9] [9]

Transformers-based few-shot learn- ing for scene classification in child sexual abuse imagery

Thamiris Coelho, Leo Ribeiro, João Macedo, Jefersson San- tos, and Sandra Avila. Transformers-based few-shot learn- ing for scene classification in child sexual abuse imagery. In Conference on Graphics, Patterns and Images, 2024. 1, 2, 3, 4

work page 2024

[10] [10]

dos Santos, and Sandra Avila

Thamiris Coelho, Leo Sampaio Ferraz Ribeiro, João Macedo, Jefersson A. dos Santos, and Sandra Avila. Min- imizing risk through minimizing model-data interaction: A protocol for relying on proxy tasks when designing child sex- ual abuse imagery detection models. In ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2025. 1, 2, 3, 4

work page 2025

[11] [11]

Laying foundations for ef- fective machine learning in law enforcement

Janis Dalins, Yuriy Tyshetskiy, Campbell Wilson, Mark J Carman, and Douglas Boudry. Laying foundations for ef- fective machine learning in law enforcement. majura–a la- belling schema for child exploitation materials. Digital Investigation, 2018. 3

work page 2018

[12] [12]

Nudetective: A forensic tool to help combat child pornography through automatic nudity detection

Mateus de Castro Polastro and Pedro Monteiro da Silva Eleu- terio. Nudetective: A forensic tool to help combat child pornography through automatic nudity detection. In Workshops on Database and Expert Systems Applications,

work page

[13] [13]

European Parliament and Council of the European Union. Directive 2011/93/eu on combating the sexual abuse and sex- ual exploitation of children and child pornography.https: //eur-lex.europa.eu/legal-content/EN/TXT /?uri=CELEX:32011L0093, 2011. Official Journal of the European Union, L 335, 17 December 2011. 1, 4

work page 2011

[14] [14]

Pornography and child sexual abuse detection in image and video: A comparative evaluation

Abhishek Gangwar, Eduardo Fidalgo, Enrique Alegre, and Víctor González-Castro. Pornography and child sexual abuse detection in image and video: A comparative evaluation. In International Conference on Imaging for Crime Detection and Prevention (ICDP), 2017. 3

work page 2017

[15] [15]

Attm-cnn: Attention and met- ric learning based cnn for pornography, age and child sexual abuse (csa) detection in images

Abhishek Gangwar, Víctor González-Castro, Enrique Ale- gre, and Eduardo Fidalgo. Attm-cnn: Attention and met- ric learning based cnn for pornography, age and child sexual abuse (csa) detection in images. Neurocomputing, 2021. 1, 2, 3

work page 2021

[16] [16]

Susanna Greijer, Jaap Doek, and Interagency Working Group. Terminology guidelines for the protection of chil- dren from sexual exploitation and sexual abuse.https: //www.ecpat.org/wp-content/uploads/2016 /12/Terminology-guidelines_ENG.pdf, 2016. Adopted by the Interagency Working Group in Luxembourg. 1

work page 2016

[17] [17]

Ex- ploiting scene graphs for human-object interaction detection

Tao He, Lianli Gao, Jingkuan Song, and Yuan-Fang Li. Ex- ploiting scene graphs for human-object interaction detection. IEEE/CVF International Conference on Computer Vision (ICCV), 2021. 2

work page 2021

[18] [18]

Shamma, Michael S

Justin Johnson, Ranjay Krishna, Michael Stark, Li Jia Li, David A. Shamma, Michael S. Bernstein, and Fei Fei Li. Im- age retrieval using scene graphs. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2015. 4

work page 2015

[19] [19]

Adaptive visual scene understanding: incremental scene graph gen- eration

Naitik Khandelwal, Xiao Liu, and Mengmi Zhang. Adaptive visual scene understanding: incremental scene graph gen- eration. In International Conference on Neural Information Processing Systems (Neurips), 2024. 2

work page 2024

[20] [20]

The chal- lenges of identifying and classifying child sexual abuse ma- terial

Juliane A Kloess, Jessica Woodhams, Helen Whittle, Tim Grant, and Catherine E Hamilton-Giachritsis. The chal- lenges of identifying and classifying child sexual abuse ma- terial. Sexual Abuse, 2019. 4, 5

work page 2019

[21] [21]

The challenges of identifying and classifying child sexual exploitation material: Moving to- wards a more ecologically valid pilot study with digital forensics analysts

Juliane A Kloess, Jessica Woodhams, and Catherine E Hamilton-Giachritsis. The challenges of identifying and classifying child sexual exploitation material: Moving to- wards a more ecologically valid pilot study with digital forensics analysts. Child Abuse & Neglect, 2021. 4

work page 2021

[22] [22]

Visual genome: Connecting language and vision using crowdsourced dense image annotations

Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalan- tidis, Li-Jia Li, David A Shamma, et al. Visual genome: Connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision (IJCV), 2017. 4

work page 2017

[23] [23]

Seeing without looking: Analysis pipeline for child sexual abuse datasets

Camila Laranjeira, Joã Macedo, Sandra Avila, and Jefersson dos Santos. Seeing without looking: Analysis pipeline for child sexual abuse datasets. InACM Conference on Fairness, Accountability, and Transparency (FAccT), 2022. 1, 3, 4

work page 2022

[24] [24]

dos Santos

Camila Laranjeira, João Macedo, Sandra Avila, Fabrício Benevenuto, and Jefersson A. dos Santos. Human-centric perception for child sexual abuse imagery, 2026. 3, 4

work page 2026

[25] [25]

Detecting child sexual abuse material: A com- prehensive survey

Hee-Eun Lee, Tatiana Ermakova, Vasilis Ververis, and Ben- jamin Fabian. Detecting child sexual abuse material: A com- prehensive survey. Forensic Science International: Digital Investigation, 34, 2020. 1

work page 2020

[26] [26]

Baiyi Li, Edmond S. L. Ho, Hubert P. H. Shum, and He Wang. Two-person interaction augmentation with skeleton priors. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024. 2, 5

work page 2024

[27] [27]

Scene graph generation: A comprehensive survey

Hongsheng Li, Guangming Zhu, Liang Zhang, You- liang Jiang, Yixuan Dang, Haoran Hou, Peiyi Shen, Xia Zhao, Syed Afaq Ali Shah, and Mohammed Ben- namoun. Scene graph generation: A comprehensive survey. Neurocomputing, 2024. 2

work page 2024

[28] [28]

From pixels to graphs: Open-vocabulary scene graph generation with vision-language models

Rongjie Li, Songyang Zhang, Dahua Lin, Kai Chen, and Xuming He. From pixels to graphs: Open-vocabulary scene graph generation with vision-language models. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 4

work page 2024

[29] [29]

Lawrence Zitnick

Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft coco: Common objects in context. In IEEE/CVF European Conference on Computer Vision (ECCV), 2014. 5

work page 2014

[30] [30]

dos Santos

Joã Macedo, Filipe Costa, and Jefersson A. dos Santos. A benchmark methodology for child pornography detection. In Conference on Graphics, Patterns and Images (SIBGRAPI),

work page

[31] [31]

João Macedo, Camila Laranjeira, Leo S. F. Ribeiro, Car- los Caetano, Fabricio Benevenuto, Sandra Avila, and Jefers- son A. dos Santos. Child sexual abuse datasets: A systematic review. Research Square, 2025. Preprint, Version 1. 1, 2, 3

work page 2025

[32] [32]

Open sourcing a deep learning solution for detecting nsfw images.https://ya hooeng.tumblr.com/post/151148689421/op en- sourcing- a- deep- learning- solution- for, 2016

Jay Mahadeokar and Gerry Pesavento. Open sourcing a deep learning solution for detecting nsfw images.https://ya hooeng.tumblr.com/post/151148689421/op en- sourcing- a- deep- learning- solution- for, 2016. Accessed: 2025-01-12. 3

work page arXiv 2016

[33] [33]

Photodna — fighting the harmful content problem.https://www.microsoft.com/en- us/photodna, 2017

Microsoft Inc. Photodna — fighting the harmful content problem.https://www.microsoft.com/en- us/photodna, 2017. 1

work page 2017

[34] [34]

Cyber- tipline report 2024.https://www.missingkids.or g/gethelpnow/cybertipline/cybertiplineda ta, 2025

National Center for Missing & Exploited Children. Cyber- tipline report 2024.https://www.missingkids.or g/gethelpnow/cybertipline/cybertiplineda ta, 2025. Accessed March 6, 2026. 1

work page 2024

[35] [35]

Pose2room: Understanding 3d scenes from human activities

Yinyu Nie, Angela Dai, Xiaoguang Han, and Matthias Nießner. Pose2room: Understanding 3d scenes from human activities. In IEEE/CVF European Conference on Computer Vision (ECCV), 2022. 2

work page 2022

[36] [36]

Using expert-reviewed csam to train cnns and its anthropological analysis

Wojciech Oronowicz-Ja ´skowiak, Tomasz Kozłowski, Marta Pola´nska, Jerzy Wojciechowski, Piotr Wasilewski, Dominik ´Sl˛ ezak, and Mirosław Kowaluk. Using expert-reviewed csam to train cnns and its anthropological analysis. Journal of Forensic and Legal Medicine, 2024. 2, 3

work page 2024

[37] [37]

icop: Live forensics to re- veal previously unknown criminal media on p2p networks

Claudia Peersman, Christian Schulze, Awais Rashid, Mar- garet Brennan, and Carl Fischer. icop: Live forensics to re- veal previously unknown criminal media on p2p networks. Digital Investigation, 2016. 2, 3

work page 2016

[38] [38]

Lei nº 11.829, de 25 de novembro de 2008.http://www.planalto.gov.b r/ccivil_03/_Ato2007-2010/2008/Lei/L118 29.htm, 2008

Presidência da República do Brasil. Lei nº 11.829, de 25 de novembro de 2008.http://www.planalto.gov.b r/ccivil_03/_Ato2007-2010/2008/Lei/L118 29.htm, 2008. Accessed March 6, 2026. 1, 4

work page 2008

[39] [39]

Action scene graphs for long- form understanding of egocentric videos

Ivan Rodin, Antonino Furnari, Kyle Min, Subarna Tripathi, and Giovanni Maria Farinella. Action scene graphs for long- form understanding of egocentric videos. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2

work page 2024

[40] [40]

A deep learning framework for find- ing illicit images/videos of children

Jared Rondeau, Douglas Deslauriers, Thomas Howard III, and Marco Alvarez. A deep learning framework for find- ing illicit images/videos of children. Machine Vision and Applications, 2022. 1, 2, 3

work page 2022

[41] [41]

Towards automatic detection of child pornogra- phy

Napa Sae-Bae, Xiaoxi Sun, Husrev T Sencar, and Nasir D Memon. Towards automatic detection of child pornogra- phy. In IEEE International Conference on Image Processing (ICIP), pages 5332–5336, 2014. 2

work page 2014

[42] [42]

Yolo26: Key architectural enhancements and performance bench- marking for real-time object detection

Ranjan Sapkota, Rahul Harsha Cheppally, Ajay Sharda, and Manoj Karkee. Yolo26: Key architectural enhancements and performance benchmarking for real-time object detec- tion. arXiv 2509.25164, 2026.https://github.com /ultralytics/ultralytics. 5

work page arXiv 2026

[43] [43]

Automatic detection of csa media by multi-modal feature fusion for law enforcement support

Christian Schulze, Dominik Henter, Damian Borth, and Andreas Dengel. Automatic detection of csa media by multi-modal feature fusion for law enforcement support. In International Conference on Multimedia Retrieval (ICMR),

work page

[44] [44]

Porno- graphic content classification using deep-learning

André Tabone, Kenneth Camilleri, Alexandra Bonnici, Ste- fania Cristina, Reuben Farrugia, and Mark Borg. Porno- graphic content classification using deep-learning. In ACM Symposium on Document Engineering, 2021. 3

work page 2021

[45] [45]

Automatic detection of child pornography using color visual words

Adrian Ulges and Armin Stahl. Automatic detection of child pornography using color visual words. In IEEE international conference on multimedia and expo, 2011. 2, 3

work page 2011

[46] [46]

National strategy for child exploitation prevention and interdiction: Child sexual abuse material.https://www.justice.gov/d9/2 023-06/child_sexual_abuse_material_2.p df, 2023

United States Department of Justice. National strategy for child exploitation prevention and interdiction: Child sexual abuse material.https://www.justice.gov/d9/2 023-06/child_sexual_abuse_material_2.p df, 2023. Accessed March 6, 2026. 1, 4

work page 2023

[47] [47]

Valois, João Macedo, Leo S.F

Pedro H.V . Valois, João Macedo, Leo S.F. Ribeiro, Jefers- son A. dos Santos, and Sandra Avila. Leveraging self- supervised learning for scene classification in child sex- ual abuse imagery. Forensic Science International: Digital Investigation, 2025. 1, 2, 3, 4

work page 2025

[48] [48]

Leveraging deep neural networks to fight child pornography in the age of social media

Paulo Vitorino, Sandra Avila, Mauricio Perez, and Ander- son Rocha. Leveraging deep neural networks to fight child pornography in the age of social media. Journal of Visual Communication and Image Representation, 2018. 3

work page 2018

[49] [49]

Heterogeneous Skeleton-Based Action Representation Learning

Hongsong Wang, Xiaoyan Ma, Jidong Kuang, and Jie Gui. Heterogeneous Skeleton-Based Action Representation Learning . In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. 5

work page 2025

[50] [50]

On the detection of images containing child- pornographic material

Emilios Yiallourou, Rafaella Demetriou, and Andreas Lanitis. On the detection of images containing child- pornographic material. In IEEE International Conference on Telecommunications (ICT), 2017. 5

work page 2017

[51] [51]

Deep learning-based human pose estimation: A survey

Ce Zheng, Wenhan Wu, Chen Chen, Taojiannan Yang, Sijie Zhu, Ju Shen, Nasser Kehtarnavaz, and Mubarak Shah. Deep learning-based human pose estimation: A survey. ACM Computing Surveys, 2023. 2 CSA-Graphs: A Privacy-Preserving Structural Dataset for Child Sexual Abuse Research Supplementary Material This supplementary material provides additional analy- ses ...

work page 2023