Towards SocratiCode: Designing a Generative AI-Based Programming Tutor for K-12 Students through a 4-Week Participatory Design Study

Anshul Bihani; Cassandra Lucas; Chun-Hua Tsai; Jaydeb Sarker; Mia Mohammad Imran; Rohini Kukka

arxiv: 2605.17857 · v1 · pith:JIKASBDWnew · submitted 2026-05-18 · 💻 cs.HC

Pith reviewed 2026-05-20 09:35 UTC · model grok-4.3

classification 💻 cs.HC

keywords generative AI · K-12 programming education · Socratic tutoring · participatory design · adaptive learning companion · Python for beginners · human-AI collaboration

The pith

Generative AI for K-12 programming works best as a Socratic questioner embedded in human-guided lessons rather than as a direct answer provider.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports on a four-week participatory design process with two K-12 students in which an AI tutoring system named SocratiCode was repeatedly revised based on learner feedback. It moved from open-ended tutorial generation to a more constrained dialogic style that uses guided questions, reflection prompts, misconception checks, incremental hints, and required pauses for student input. A sympathetic reader would care because the authors argue this Socratic form reduces the overwhelm that lengthy AI explanations can create for novices while still leveraging generative capabilities. The work positions the AI as a companion inside a broader human-led instructional setup rather than a standalone solution engine.

Core claim

Across the study iterations the system shifted toward dialogic support through guided questioning, reflection prompts, misconception checks, incremental hints, and mandatory pauses for learner input; preliminary observations indicate this change improved explanation clarity, supported problem-solving engagement, and better matched novice needs when combined with human guidance.

What carries the argument

SocratiCode is the evolving adaptive tutorial system whose refinement into a Socratic tutoring model supplies guided questions and learner-input pauses instead of full solutions.
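
As an illustration of the mechanism described above, the following minimal Python sketch shows one way incremental hints and a mandatory pause for learner input could be modeled. It is not SocratiCode's implementation, which relies on a prompted generative model; the class `HintLadder`, its methods, and the sample question and hints are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class HintLadder:
    """Hypothetical helper: releases one hint per learner turn, never a full solution."""

    question: str
    hints: list[str]
    _next_hint: int = field(default=0, init=False)

    def ask(self) -> str:
        """Open with a guiding question, then wait for the learner."""
        return self.question

    def respond(self, learner_input: str) -> str:
        """Release the next hint only after the learner has typed something."""
        if not learner_input.strip():
            # Mandatory pause: an empty turn does not unlock more help.
            return "Take your time. What do you think happens next?"
        if self._next_hint >= len(self.hints):
            return "You have seen every clue. Run your code and tell me what you see."
        hint = self.hints[self._next_hint]
        self._next_hint += 1
        return f"Hint {self._next_hint}: {hint}"


# Illustrative usage with made-up lesson content.
ladder = HintLadder(
    question="What do you expect print(2 + 3) to show?",
    hints=[
        "Look at the symbol between the two numbers.",
        "Python works out the math before it prints.",
    ],
)
print(ladder.ask())
print(ladder.respond(""))           # empty turn: no hint is released
print(ladder.respond("maybe 23?"))  # first hint is released
```

The design choice worth noticing is that help is gated on learner input rather than on time or on the tutor's own judgment, which is one simple reading of "mandatory pauses for learner input".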

Load-bearing premise

Feedback from only two K-12 students across four weeks can show reliable gains in clarity, engagement, and fit for a wider population of novice learners.

What would settle it

A controlled trial that assigns many more K-12 students to either the final Socratic version or a directive answer-giving version and measures differences in problem-solving success and reported confusion.
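
The trial design above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a protocol from the paper: the function names, the arm labels `socratic` and `directive`, and the use of a simple difference in success rates are all hypothetical choices made for this example.

```python
import random


def assign_arms(student_ids, seed=0):
    """Randomly split students into two arms of (nearly) equal size."""
    ids = list(student_ids)
    random.Random(seed).shuffle(ids)  # seeded so the assignment is reproducible
    half = len(ids) // 2
    return {"socratic": ids[:half], "directive": ids[half:]}


def success_rate(outcomes):
    """Fraction of solved tasks; outcomes is a list of True/False values."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def rate_difference(socratic_outcomes, directive_outcomes):
    """Socratic success rate minus directive success rate."""
    return success_rate(socratic_outcomes) - success_rate(directive_outcomes)


# Illustrative usage: 40 placeholder student IDs, no real study data.
arms = assign_arms(range(1, 41), seed=42)
print(len(arms["socratic"]), len(arms["directive"]))  # 20 20
```

A real study would also need a measure of reported confusion and an appropriate significance test; the sketch covers only random assignment and the raw comparison.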

Figures

Figures reproduced from arXiv: 2605.17857 by Anshul Bihani, Cassandra Lucas, Chun-Hua Tsai, Jaydeb Sarker, Mia Mohammad Imran, Rohini Kukka.

**Figure 1.** Experiment Pipeline of SocratiCode. Diagram labels: Daily Topic Exploration (Monday, Tuesday, Wednesday) and Feedback; Revision Topics (Thursday); Weekly Agile Style Group Meeting and Feedback (Friday); Update Prompt (If needed).

**Figure 2.** Feedback Loop.

Paper text extracted alongside the caption: … with conditions” with “This completes the lesson.” These revisions produced a more structured and dialogic tutoring flow. We provide the shortened prompt template used by the end of W4 below. Template Structure (Shortened). For Full Prompt [6]

1. Role & Audience: Act as a step-by-step tutorial guide for absolute beginners, using Python by default and clear analogies.
2. Learner Adaptation: As…
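
A numbered template like the one quoted in the Figure 2 text could be assembled from titled sections, as in the minimal Python sketch below. Only the "Role & Audience" wording comes from the source. The other two sections are hypothetical paraphrases of features named in the abstract, so their titles, wording, and numbering do not reproduce the authors' template, and the truncated "Learner Adaptation" item is deliberately left out. The helper `build_prompt` is likewise an invented name.

```python
def build_prompt(sections):
    """Join (title, instruction) pairs into a numbered prompt, one section per line."""
    lines = [
        f"{number}. {title}: {text}"
        for number, (title, text) in enumerate(sections, start=1)
    ]
    return "\n".join(lines)


SECTIONS = [
    # Quoted from the shortened template in the source.
    (
        "Role & Audience",
        "Act as a step-by-step tutorial guide for absolute beginners, "
        "using Python by default and clear analogies.",
    ),
    # Hypothetical wording: paraphrases of features named in the abstract.
    ("Guided Questioning", "Ask one guiding question before explaining anything."),
    ("Mandatory Pause", "Stop after each step and wait for the learner to reply."),
]

prompt = build_prompt(SECTIONS)
print(prompt)
```

Keeping each section as a separate entry makes it straightforward to revise a single instruction between weekly iterations without touching the rest of the prompt.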

Original abstract

Generative AI creates new opportunities for programming education, but many existing systems remain overly directive, producing lengthy explanations and premature solutions that can overwhelm K-12 novices. In this paper, we present a participatory design study of how an adaptive tutorial system, SocratiCode, evolved toward a Socratic tutoring model for beginner programming instruction. Drawing on weekly learner feedback, we iteratively refined the system over a four-week study with two K-12 students learning Python. Across iterations, the system shifted from flexible tutorial generation toward a more dialogic form of support characterized by guided questioning, reflection prompts, misconception checks, incremental hints, and mandatory pauses for learner input. Our preliminary observations suggest that this Socratic shift improved explanation clarity, supported problem-solving engagement, and better aligned instruction with novice learners' needs, especially when combined with human guidance. We argue that generative AI in K-12 programming education may be most effective not as an answer engine, but as a Socratic, adaptive learning companion embedded within a human-guided instructional framework.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Small participatory design study with two K-12 students shows how to add Socratic features to a generative AI Python tutor, but the evidence for real improvements stays thin.

The key takeaway is that this paper walks through a four-week iterative design process with two K-12 students that shifted SocratiCode from flexible tutorial generation toward guided questioning, reflection prompts, misconception checks, incremental hints, and forced pauses. The authors document weekly feedback sessions and the specific tweaks that followed, which gives a clear before-and-after picture of the design changes. That concrete trajectory is the main new piece here: prior work on Socratic tutoring and AI in education already exists but rarely shows this exact participatory path for beginner Python learners. The study does a decent job of illustrating how human guidance stayed in the loop and how the system responded to novice confusion in real time. Those details could help other designers think through similar refinements.

The main limitation is the sample. Two students over four weeks, with only qualitative observations and no pre/post measures, controls, or replication, make it difficult to attribute any gains in clarity or engagement to the Socratic elements themselves rather than to the short duration or the consistent adult oversight. The abstract is upfront that these are preliminary observations, which is good, but the broader claim that generative AI works best as an embedded Socratic companion still rests on very limited data.

This paper is mainly for HCI researchers and edtech designers who run small-scale participatory studies and want a worked example in K-12 programming tools. It is not positioned as a large-scale evaluation, so readers looking for robust learning outcomes or generalizable results will come away wanting more.

It deserves a serious referee because the design process is described in enough detail to be useful for feedback, and external reviewers could push on evaluation methods and scaling questions without the work being incoherent on its own terms. I would send it to peer review rather than desk reject, with the expectation that the authors expand the limitations section and outline clearer next steps for testing the approach with more students.

Referee Report

2 major / 2 minor

Summary. The paper reports on a 4-week participatory design study with two K-12 students in which the authors iteratively refined a generative-AI programming tutor (SocratiCode) from a flexible tutorial generator into a dialogic Socratic system that uses guided questioning, reflection prompts, misconception checks, incremental hints, and mandatory pauses. Preliminary qualitative observations from weekly learner feedback are presented as evidence that the Socratic shift improved explanation clarity, supported problem-solving engagement, and better aligned with novice needs when combined with human guidance; the authors conclude that generative AI in K-12 programming education is most effective as an embedded Socratic companion within a human-guided instructional framework.

Significance. If the reported benefits of the Socratic features can be replicated at scale, the work would supply useful design heuristics for AI tutors aimed at young beginners, particularly the value of mandatory reflection pauses and incremental scaffolding over direct answer generation. At present the contribution remains exploratory and design-oriented rather than a validated pedagogical result.

major comments (2)

[Abstract and §4] Abstract and §4 (Results/Observations): the claims that the Socratic shift 'improved explanation clarity, supported problem-solving engagement, and better aligned instruction with novice learners' needs' rest solely on qualitative observations from two participants; no quantitative pre/post learning or engagement metrics, no control condition, and no systematic error analysis are reported, leaving attribution to the dialogic features insecure.
[Discussion] Discussion section: the broader argument that generative AI 'may be most effective not as an answer engine, but as a Socratic, adaptive learning companion embedded within a human-guided instructional framework' extrapolates from iterative refinements driven by feedback from only two learners over four weeks; the manuscript provides no evidence that the observed changes are driven by the Socratic elements themselves rather than learner-specific factors, consistent human guidance, or study duration.

minor comments (2)

[Methods] Methods: provide the exact system prompts or prompt-engineering changes applied at each weekly iteration so that the design trajectory can be reproduced or extended by other researchers.
[Figures] Figures: ensure any diagrams showing the evolution of the tutor interface across the four weeks are explicitly labeled with iteration number and the specific Socratic features introduced at each step.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which has helped us better frame the exploratory scope of this participatory design study. We respond to each major comment below and describe the changes incorporated into the revised manuscript.

Referee: [Abstract and §4] the claims that the Socratic shift 'improved explanation clarity, supported problem-solving engagement, and better aligned instruction with novice learners' needs' rest solely on qualitative observations from two participants; no quantitative pre/post learning or engagement metrics, no control condition, and no systematic error analysis are reported, leaving attribution to the dialogic features insecure.

Authors: We agree that the reported observations are qualitative, drawn from only two participants, and lack quantitative pre/post measures, a control condition, or systematic error analysis. This is consistent with the participatory design methodology of the study, which prioritized iterative refinement based on weekly learner feedback rather than controlled experimentation. In the revised manuscript we have updated the abstract and Section 4 to qualify all statements as preliminary observations from the design process. We now explicitly note the absence of quantitative metrics and control conditions, avoid causal language regarding attribution to the dialogic features, and add a forward-looking statement calling for larger-scale studies with such measures to validate the observed patterns.

revision: yes

Referee: [Discussion] the broader argument that generative AI 'may be most effective not as an answer engine, but as a Socratic, adaptive learning companion embedded within a human-guided instructional framework' extrapolates from iterative refinements driven by feedback from only two learners over four weeks; the manuscript provides no evidence that the observed changes are driven by the Socratic elements themselves rather than learner-specific factors, consistent human guidance, or study duration.

Authors: We accept that the small sample and study duration limit the strength of broader claims and that alternative explanations (learner-specific factors, human guidance, or simply the passage of time) cannot be ruled out from the available data. We have revised the Discussion section to present the argument as a set of design heuristics emerging from this case rather than a generalizable conclusion. We have added explicit discussion of potential confounds, including the role of consistent human guidance, and inserted a new Limitations subsection that directly addresses the small participant count, the four-week timeframe, and the inability to isolate the contribution of the Socratic elements from other study variables.

revision: yes

Circularity Check

0 steps flagged

No significant circularity in qualitative participatory design study

The paper describes a 4-week participatory design process with two K-12 students in which SocratiCode was iteratively refined based on direct weekly feedback. All claims about improved clarity, engagement, and alignment with novice needs are presented as preliminary observations drawn from that feedback and the resulting design changes. No equations, fitted parameters, predictions, uniqueness theorems, or self-citation chains appear; the work contains no derivations that could reduce outputs to inputs by construction. The study is therefore self-contained against external benchmarks and receives a circularity score of 0.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the assumption that small-scale qualitative feedback from two participants can validly inform general design principles for AI tutoring effectiveness.

axioms (1)

domain assumption: Weekly feedback from a small number of K-12 learners in participatory sessions accurately identifies effective tutoring strategies for novice programmers.
The study uses this feedback to drive iterative shifts from tutorial generation to Socratic dialogue.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 1 internal anchor

[1] Erfan Al-Hossami, Razvan Bunescu, Ryan Teehan, Laurel Powell, Khyati Mahajan, and Mohsen Dorodchi. 2023. Socratic questioning of novice debuggers: A benchmark dataset and preliminary evaluations. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023). 709–726

[2] Ohud Abdullah Alasmari, Jeremy Singer, and Mireilla Bikanga Ada. 2023. Do current online coding tutorial systems address novice programmer difficulties? In Proceedings of the 15th International Conference on Education Technology and Computers. 242–248

[3] Mohammed Amin Almaiah, Raghad Alfaisal, Said A Salloum, Fahima Hajjej, et al. 2022. Examining the impact of artificial intelligence and social and computer anxiety in e-learning settings: Students’ perceptions at the university level. Electronics 11, 22 (2022), 3662

[4] Zeyad Alshaikh, Lasang Tamang, and Vasile Rus. 2020. A Socratic tutor for source code comprehension. In International conference on artificial intelligence in education. Springer, 15–19

[5] Zeyad Alshaikh, Lasang Jimba Tamang, and Vasile Rus. 2020. Experiments with a socratic intelligent tutoring system for source code understanding. In The Thirty-Third International Florida Artificial Intelligence Research Society Conference (FLAIRS-32)

[6] Anonymous Anonymous. 2026. Replication Package for SocratiCode for K-12 Students Study. doi:10.5281/zenodo.20018098

[7] Samuel Boguslawski, Rowan Deer, and Mark G Dawson. 2025. Programming education and learner motivation in the age of generative AI: student and educator perspectives. Information and Learning Sciences (2025)

[8] Michelle Brachman, Siya Kunde, Sarah Miller, Ana Fucs, Samantha Dempsey, Jamie Jabbour, and Werner Geyer. 2025. Building Appropriate Mental Models: What Users Know and Want to Know about an Agentic AI Chatbot. In Proceedings of the 30th International Conference on Intelligent User Interfaces. 247–264

[9] Peter Brusilovsky and Eva Millán. 2007. User models for adaptive hypermedia and adaptive educational systems. In The adaptive web: methods and strategies of web personalization. Springer, 3–53

[10] Kathy Charmaz. 2006. Constructing grounded theory: A practical guide through qualitative analysis. Sage

[11] Rudrajit Choudhuri, Ambareesh Ramakrishnan, Amreeta Chatterjee, Bianca Trinkenreich, et al. 2025. Insights from the Frontline: GenAI Utilization Among Software Engineering Students. IEEE Xplore (2025), 1–12

[12] Paul Denny, David H Smith IV, Max Fowler, James Prather, Brett A Becker, and Juho Leinonen. 2024. Explaining code with a purpose: An integrated approach for developing code comprehension and prompting skills. In Proceedings of the 2024 on Innovation and Technology in Computer Science Education V. 1. 283–289

[13] Sidney D’mello and Art Graesser. 2013. AutoTutor and affective AutoTutor: Learning by talking with cognitively and emotionally intelligent computers that talk back. ACM Transactions on Interactive Intelligent Systems (TiiS) (2013)

[14] Ian Drosos, Jack Williams, Advait Sarkar, Nicholas Wilson, Sean Rintel, and Payod Panda. 2025. Dynamic Prompt Middleware: Contextual Prompt Refinement Controls for Comprehension Tasks. In Proceedings of the 4th Annual Symposium on Human-Computer Interaction for Work. 1–23

[15] Encyclopaedia Britannica. 2026. Socratic method. https://www.britannica.com/topic/Socratic-method Last updated March 13, 2026. Accessed April 15, 2026

[16] Guangrui Fan, Dandan Liu, Rui Zhang, and Lihu Pan. 2025. The impact of AI-assisted pair programming on student motivation, programming anxiety, collaborative learning, and programming performance: a comparative study with traditional pair programming and individual approaches. International Journal of STEM Education 12, 1 (2025), 16

[17] Department for Education. 2025. Generative artificial intelligence (AI) in education. Technical Report. Department for Education, UK. Updated 12 August 2025

[18] Michail Giannakos, Roger Azevedo, et al. 2025. The promise and challenges of generative AI in education. Behaviour & Information Technology (2025)

[19] Shuchi Grover and Roy Pea. 2013. Computational thinking in K–12: A review of the state of the field. Educational researcher 42, 1 (2013), 38–43

[20] Xingjian Gu and Barbara J Ericson. 2025. AI literacy in K-12 and higher education in the wake of generative AI: An integrative review. In Proceedings of the 2025 ACM Conference on International Computing Education Research V. 1. 125–140

[21] Joint Task Force on Computing Curricula, Association for Computing Machinery (ACM) and IEEE Computer Society. 2013. Computer Science Curricula 2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Science. ACM Press and IEEE Computer Society Press, New York, NY, USA

[22] Caitlin Kelleher and Randy Pausch. 2005. Lowering the barriers to programming: A taxonomy of programming environments and languages for novice programmers. ACM computing surveys (CSUR) 37, 2 (2005), 83–137

[23] Caitlin Kelleher, Randy Pausch, and Sara Kiesler. 2007. Storytelling alice motivates middle school girls to learn computer programming. In Proceedings of the SIGCHI conference on Human factors in computing systems. 1455–1464

[24] Eric Klopfer, Justin Reich, Hal Abelson, and Cynthia Breazeal. 2024. Generative AI and K-12 education: An MIT perspective. (2024)

[25] Uday Mittal, Siva Sai, Vinay Chamola, et al. 2024. A comprehensive review on generative AI for education. IEEE Access (2024)

[26] Susanne Narciss and Ecenaz Alemdag. 2025. Learning from errors and failure in educational contexts: New insights and future directions for research and practice. British Journal of Educational Psychology 95, 1 (2025), 197–218

[27] Sydney Nguyen, Hannah McLean Babe, Yangtian Zi, Arjun Guha, Carolyn Jane Anderson, and Molly Q Feldman. 2024. How Beginning Programmers and Code LLMs (Mis)read Each Other. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI ’24) (2024), 1–26

[28] Annemarie Sullivan Palincsar and Ann L Brown. 1984. Reciprocal teaching of comprehension-fostering and comprehension-monitoring activities. Cognition and instruction 1, 2 (1984), 117–175

[29] Jiyeon Park and Sam Choo. 2025. Generative AI prompt engineering for educators: Practical strategies. Journal of Special Education Technology 40, 3 (2025), 411–417

[30] Christian Rahe and Walid Maalej. 2025. How Do Programming Students Use Generative AI? Proceedings of the ACM on Software Engineering FSE (2025)

[31] Brian J Reiser. 2018. Scaffolding complex learning: The mechanisms of structuring and problematizing student work. In Scaffolding. Psychology Press, 273–304

[32] Sangho Suh, Jian Zhao, and Edith Law. 2022. Codetoon: Story ideation, auto comic generation, and structure mapping for code-driven storytelling. In Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology

[33] Osman Tasdelen and Daniel Bodemer. 2025. Generative AI in the classroom: Effects of context-personalized learning material and tasks on motivation and performance. International Journal of Artificial Intelligence in Education (2025)

[34] Selin Urhan and Selay Arkun Kocadere. 2024. Problem-Solving Through Pair-Programming: The Mediational Role of ChatGPT. In 2024 5th International Conference in Electronic Engineering, Information Technology & Education. IEEE

[35] Jules White, Quchen Fu, Sam Hays, Michael Sandborn, Carlos Olea, Henry Gilbert, Ashraf Elnashar, et al. 2023. A prompt pattern catalog to enhance prompt engineering with chatgpt. arXiv preprint arXiv:2302.11382 (2023)

[36] Leon E Winslow. 1996. Programming pedagogy—a psychological overview. ACM Sigcse Bulletin 28, 3 (1996), 17–22

[37] Yangtian Zi, Luisa Li, Arjun Guha, Carolyn Jane Anderson, and Molly Q Feldman. 2025. “I Would Have Written My Code Differently”: Beginners Struggle to Understand LLM-Generated Code. Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering (FSE Companion ’25) (2025)