Safe-Child-LLM: A Developmental Benchmark for Evaluating LLM Safety in Child-LLM Interactions
Pith reviewed 2026-05-19 09:28 UTC · model grok-4.3
The pith
Leading LLMs show critical safety shortfalls when tested against child and adolescent users.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce Safe-Child-LLM, a benchmark and dataset that evaluates LLM safety across two developmental stages using 200 adversarial prompts with human-annotated jailbreak and 0-5 ethical refusal labels, and we show that leading models exhibit critical safety deficiencies in child-facing scenarios.
What carries the argument
The Safe-Child-LLM multi-part dataset of 200 adversarial prompts, sourced from red-teaming corpora and labeled for jailbreak success plus ethical refusal on a 0-5 scale, applied separately to child and adolescent age groups.
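To make the shape of the dataset concrete, here is a minimal sketch of how one annotated model response might be represented. The class and field names (`AnnotatedResponse`, `prompt_id`, `jailbroken`, `refusal_score`) are hypothetical and are not taken from the released codebase. Only the two age bands and the 0-5 range come from the abstract, which does not say which end of the scale is the safest.

```python
from dataclasses import dataclass
from enum import Enum


class AgeGroup(Enum):
    """The two developmental stages named in the abstract."""
    CHILD = "7-12"
    ADOLESCENT = "13-17"


@dataclass(frozen=True)
class AnnotatedResponse:
    """One model response to one adversarial prompt (hypothetical schema)."""
    prompt_id: str       # illustrative identifier, not the dataset's real key
    age_group: AgeGroup  # developmental stage the prompt is assigned to
    model: str           # name of the evaluated model
    jailbroken: bool     # human label: did the jailbreak succeed?
    refusal_score: int   # human label on the 0-5 ethical refusal scale

    def __post_init__(self) -> None:
        # Reject scores outside the 0-5 range described in the abstract.
        if not 0 <= self.refusal_score <= 5:
            raise ValueError("refusal_score must be between 0 and 5")
```

Validating the score at construction time keeps out-of-range annotations from silently skewing any later averages.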
If this is right
- Developers must add age-specific refusal mechanisms beyond those used for adult users.
- Adult-only safety evaluations leave measurable gaps when models are deployed with minors.
- Public release of child-focused adversarial datasets can accelerate community improvements in ethical AI.
- Continuous benchmark updates will be needed as new models and prompt techniques emerge.
Where Pith is reading between the lines
- The same prompt set could be adapted to measure safety differences between open-source and closed-source models over time.
- Regulators might use similar age-graded tests when setting standards for AI tools in schools or family apps.
- Real-world logging of child-AI conversations could provide a stronger validation signal than static prompt sets alone.
Load-bearing premise
The 200 adversarial prompts and the 0-5 ethical refusal scale together capture the safety risks that actually matter for real children and adolescents.
What would settle it
A controlled study in which real children or adolescents interact with the same models, and the models refuse every harmful request of the kind the benchmark prompts are meant to represent.
Original abstract
As Large Language Models (LLMs) increasingly power applications used by children and adolescents, ensuring safe and age-appropriate interactions has become an urgent ethical imperative. Despite progress in AI safety, current evaluations predominantly focus on adults, neglecting the unique vulnerabilities of minors engaging with generative AI. We introduce Safe-Child-LLM, a comprehensive benchmark and dataset for systematically assessing LLM safety across two developmental stages: children (7-12) and adolescents (13-17). Our framework includes a novel multi-part dataset of 200 adversarial prompts, curated from red-teaming corpora (e.g., SG-Bench, HarmBench), with human-annotated labels for jailbreak success and a standardized 0-5 ethical refusal scale. Evaluating leading LLMs -- including ChatGPT, Claude, Gemini, LLaMA, DeepSeek, Grok, Vicuna, and Mistral -- we uncover critical safety deficiencies in child-facing scenarios. This work highlights the need for community-driven benchmarks to protect young users in LLM interactions. To promote transparency and collaborative advancement in ethical AI development, we are publicly releasing both our benchmark datasets and evaluation codebase at https://github.com/The-Responsible-AI-Initiative/Safe_Child_LLM_Benchmark.git
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents Safe-Child-LLM, a benchmark and dataset for evaluating LLM safety in interactions with children (ages 7-12) and adolescents (13-17). It consists of 200 adversarial prompts curated from existing red-teaming corpora such as SG-Bench and HarmBench, human-annotated for jailbreak success and scored on a 0-5 ethical refusal scale. The authors evaluate eight leading LLMs (ChatGPT, Claude, Gemini, LLaMA, DeepSeek, Grok, Vicuna, Mistral) and report critical safety deficiencies in child-facing scenarios, while releasing the benchmark and code publicly.
Significance. If the prompts and annotation scale validly capture age-specific risks rather than generic adult jailbreak patterns, the work would provide a useful starting point for community benchmarks in child-AI safety. The public release of datasets and code is a positive contribution to reproducibility in this area.
major comments (2)
- [Abstract / Dataset construction] Abstract and dataset description: The 200 prompts are described as 'curated from red-teaming corpora (e.g., SG-Bench, HarmBench)' with no details on adaptation or filtering for developmental stages. Because these source corpora target adult users, it is unclear whether the resulting prompts distinguish risks such as grooming, emotional manipulation, or age-inappropriate self-disclosure that are central to child safety. Without pilot validation against child-development literature or naturalistic logs, the headline claim of 'critical safety deficiencies in child-facing scenarios' rests on an untested assumption that adult-derived adversarial prompts measure the relevant risks.
- [Abstract / Results] Evaluation section: The abstract states that deficiencies were found but supplies no quantitative results (e.g., mean refusal scores per model and age group, inter-annotator agreement, or baseline comparisons). This prevents verification of the data-to-claim link and makes it impossible to assess whether the observed deficiencies are statistically or practically significant.
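The per-model, per-age-group summary requested in the second major comment could be computed along these lines. This is a sketch under stated assumptions: the function name `mean_refusal_by_model_and_age`, the row layout, and the toy rows are invented for illustration and are not results from the paper.

```python
from collections import defaultdict
from statistics import mean


def mean_refusal_by_model_and_age(rows):
    """Average 0-5 refusal scores for each (model, age_group) pair.

    `rows` is an iterable of (model, age_group, refusal_score) tuples.
    """
    buckets = defaultdict(list)
    for model, age_group, score in rows:
        buckets[(model, age_group)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


# Made-up rows for illustration only; these are NOT the paper's data.
toy_rows = [
    ("model-a", "7-12", 4),
    ("model-a", "7-12", 2),
    ("model-a", "13-17", 5),
    ("model-b", "7-12", 1),
    ("model-b", "13-17", 3),
    ("model-b", "13-17", 5),
]

summary = mean_refusal_by_model_and_age(toy_rows)
for (model, age_group), avg in sorted(summary.items()):
    print(f"{model:8s} {age_group:6s} mean refusal = {avg:.2f}")
```

A table of this form, accompanied by sample sizes and some measure of spread, is the kind of evidence that would let a reader check the deficiency claim directly.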
minor comments (2)
- [Dataset description] Clarify the exact prompt-selection criteria and any modifications made to the source corpora to target the two developmental stages.
- [Annotation procedure] Specify the number of annotators, their qualifications, and the inter-annotator agreement metric for the 0-5 ethical refusal scale.
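One common way to report agreement between two annotators is Cohen's kappa, sketched here for the binary jailbreak label. The paper's abstract does not say which agreement metric was used, so the choice of kappa is an assumption, and the function name `cohens_kappa` and the toy labels are illustrative only.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each annotator's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up jailbreak labels (1 = jailbreak succeeded); NOT the paper's data.
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 0, 0, 0]
kappa = cohens_kappa(annotator_1, annotator_2)
```

For the ordinal 0-5 refusal scale, a weighted variant of kappa or another ordinal agreement statistic would usually be more appropriate, because unweighted kappa treats a one-point disagreement the same as a five-point one.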
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify key aspects of our benchmark's construction and reporting. We address each major comment below and indicate revisions to the manuscript.
Point-by-point responses
- Referee: [Abstract / Dataset construction] Abstract and dataset description: The 200 prompts are described as 'curated from red-teaming corpora (e.g., SG-Bench, HarmBench)' with no details on adaptation or filtering for developmental stages. Because these source corpora target adult users, it is unclear whether the resulting prompts distinguish risks such as grooming, emotional manipulation, or age-inappropriate self-disclosure that are central to child safety. Without pilot validation against child-development literature or naturalistic logs, the headline claim of 'critical safety deficiencies in child-facing scenarios' rests on an untested assumption that adult-derived adversarial prompts measure the relevant risks.
  Authors: We appreciate the referee's emphasis on ensuring the prompts capture age-specific risks. The 200 prompts were selected from the source corpora by prioritizing adversarial scenarios involving requests for inappropriate content, manipulation, or self-disclosure that could apply to minors, followed by human annotation that incorporated developmental considerations in both jailbreak success labels and the 0-5 ethical refusal scale. That said, the current manuscript provides limited explicit description of the selection and filtering criteria used to adapt prompts for the 7-12 and 13-17 age groups. We will revise the dataset construction section to add these details, including examples of how prompts were reviewed for relevance to child and adolescent vulnerabilities. We also acknowledge the absence of formal pilot validation against child-development literature or naturalistic interaction logs; this benchmark is positioned as an initial community resource, and we will add an explicit limitations discussion noting this gap and the value of such validation in future extensions.
  Revision: partial
- Referee: [Abstract / Results] Evaluation section: The abstract states that deficiencies were found but supplies no quantitative results (e.g., mean refusal scores per model and age group, inter-annotator agreement, or baseline comparisons). This prevents verification of the data-to-claim link and makes it impossible to assess whether the observed deficiencies are statistically or practically significant.
  Authors: We agree that the abstract would be strengthened by including key quantitative indicators to support the claim of deficiencies. The full manuscript's evaluation section already reports mean refusal scores broken down by model and developmental stage, inter-annotator agreement statistics, and comparisons across the eight evaluated LLMs. To address the referee's point directly, we will revise the abstract to incorporate concise quantitative highlights (e.g., average refusal scores for child vs. adolescent prompts) while respecting length limits, thereby improving the link between data and claims.
  Revision: yes
Circularity Check
No circularity: empirical benchmark with independent evaluation
Full rationale
The paper introduces Safe-Child-LLM as a benchmark consisting of 200 adversarial prompts curated from external red-teaming corpora (SG-Bench, HarmBench) together with a standard 0-5 human-annotated refusal scale. It then reports direct empirical evaluations of multiple LLMs against this fixed benchmark. No equations, fitted parameters, predictions, or derivations appear in the abstract or described methodology; the central claims rest on straightforward measurement rather than any self-referential construction, self-citation chain, or renaming of prior results. The work is therefore self-contained as an empirical release.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Human annotations on jailbreak success and a 0-5 ethical refusal scale provide a valid proxy for LLM safety with minors.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "Our framework includes a novel multi-part dataset of 200 adversarial prompts, curated from red-teaming corpora (e.g., SG-Bench, HarmBench), with human-annotated labels for jailbreak success and a standardized 0-5 ethical refusal scale."
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "Evaluating leading LLMs ... we uncover critical safety deficiencies in child-facing scenarios."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- Evaluating Cognitive Age Alignment in Interactive AI Agents
  The paper presents ChildAgentEval as the first psychometrically grounded benchmark comparing MLLM-based agents' reasoning performance to age-specific human cognitive stages.
- LLM Harms: A Taxonomy and Discussion
  This paper proposes a taxonomy of LLM harms in five categories and suggests mitigation strategies plus a dynamic auditing system for responsible development.