
arxiv: 2605.30539 · v1 · pith:YH3L5VATnew · submitted 2026-05-28 · 💻 cs.MA

A Theory-Guided LLM Pedagogical Agent for STEM+C Scaffolding Without Over-Reliance

Pith reviewed 2026-06-28 23:41 UTC · model grok-4.3

classification 💻 cs.MA
keywords LLM pedagogical agent, STEM+C scaffolding, over-reliance, cognitive offloading, Evidence-Decision-Feedback, Social Cognitive Theory, multimodal agent, high school study

The pith

A theory-grounded multimodal LLM agent supports student confidence in STEM+C without causing dependence.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops Copa, a multi-agent multimodal LLM pedagogical agent for STEM+C learning. Copa uses the Evidence-Decision-Feedback framework to ground its actions in Social Cognitive Theory and Social Constructivism, aiming for adaptive dialogic support that promotes sense-making. A study with 33 high school dyads shows the agent helps students build confidence and verbalize understanding without leading to over-reliance or dependence. This addresses concerns that LLM agents in education might encourage cognitive offloading. The findings suggest theory-guided design can make classroom AI integration more beneficial by amplifying student reasoning.

Core claim

Copa demonstrates that an agentic, multi-agent, multimodal Collaborative Peer Agent, built on the Evidence-Decision-Feedback framework, can support students' confidence building and ability to verbalize conceptual understanding without causing dependence while providing adaptive feedback personalized to learners and interpretable with respect to their multimodal input data.

What carries the argument

The Evidence-Decision-Feedback (EDF) framework that structures the agent's interactions to promote sense-making through adaptive, dialogic support rather than answer-seeking.

If this is right

  • Students build confidence in their STEM+C abilities through guided interactions.
  • Learners improve their ability to verbalize conceptual understanding.
  • The agent delivers adaptive feedback based on students' multimodal inputs without fostering dependence.
  • Such theory-guided agents offer a path for AI in classrooms that enhances rather than replaces student reasoning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The EDF approach could be adapted for other AI tutoring systems to reduce over-reliance risks.
  • Testing the agent in different subject areas might reveal its broader applicability.
  • Long-term studies could check if students maintain independent problem-solving skills after using the agent.
  • Combining the framework with additional data sources like eye-tracking could refine personalization further.

Load-bearing premise

That interactions grounded in the learning theories via the EDF framework sufficiently prevent cognitive offloading and over-reliance as measured in the study.

What would settle it

A replication study finding that students using Copa show increased dependence on the agent or reduced ability to solve problems independently compared to a control group.

Figures

Figures reproduced from arXiv: 2605.30539 by Angela Eeds, Ashwin T S, Clayton Cohn, Gautam Biswas, Hanchen David Wang, Meiyi Ma, Menton Deweese, Naveeduddin Mohammed, Pamela J. Osborn Popp, Rebekah Stanton, Ryan Li, Shakeera Walker, Shruti Jain, Siyuan Guo, Surya Rayala, Umesh Timalsina.

Figure 1: Evidence-Decision-Feedback (EDF) framework (…
Figure 2: Copa, pictured within the C2STEM learning environment.
Figure 3: Copa and its four sub-agents. Colors correspond to the EDF modules presented in Figure …
Figure 4: Correlation between success rate and mas…
Figure 5: Correlation between student requests for …
Figure 6: Student dialogue state confidence across mastery deciles.
  … tasks and interact with Copa, they should become more confident in their ability to build, test, and refine these models as their problem solving improves. To evaluate RQ4, we used dialogue state groupings (i.e., more-confident versus less-confident) as a proxy for confidence, correlating it with mastery. Specifically, we examined shifts from less-con…
Figure 7: Comparison of the two dyads across the three sessions. (A) Agent utterances per day. (B) Pre/post normalized learning gains, with dots indicating individual students. (C) Proportions of the four problem-solving strategies by day. (D) Aggregate productive versus unproductive strategy proportions by day.
  … could produce tangible progress, they became more willing to treat its feedback as a resource for subsequ…
Original abstract

LLM pedagogical agents are proliferating, yet recent findings have raised questions about their adherence to established theories of learning and, by extension, their educational value. Concerns regarding cognitive offloading, over-reliance, and "gaming" behaviors persist and remain largely unaddressed. In response, we developed Copa, an agentic, multi-agent, multimodal Collaborative Peer Agent for STEM+C learning. Copa is built on top of the Evidence-Decision-Feedback (EDF) framework, grounding its interactions in Social Cognitive Theory and Social Constructivism and promoting sense-making through adaptive, dialogic support rather than answer-seeking. In an authentic high school computational-modeling study (n=33 dyads), we demonstrate that Copa (1) supports students' confidence building and ability to verbalize conceptual understanding without causing dependence; and (2) provides adaptive feedback personalized to learners that is interpretable with respect to students' multimodal input data. These findings position theory-guided, multimodal LLM agents as a promising path toward classroom AI integration that amplifies students' reasoning rather than replacing it.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Copa, an agentic multi-agent multimodal Collaborative Peer Agent for STEM+C learning built on the Evidence-Decision-Feedback (EDF) framework, which grounds interactions in Social Cognitive Theory and Social Constructivism to promote sense-making through adaptive dialogic support. Based on an authentic high school computational-modeling study with n=33 dyads, the paper claims that Copa supports students' confidence building and ability to verbalize conceptual understanding without causing dependence, and provides adaptive feedback personalized to learners that is interpretable with respect to students' multimodal input data.

Significance. If the reported findings from the user study hold under rigorous scrutiny, this work could be significant in advancing the integration of LLM-based pedagogical agents in education. By explicitly grounding the agent in established learning theories via the EDF framework, it offers a potential solution to concerns about cognitive offloading and over-reliance, positioning theory-guided multimodal agents as a viable approach for amplifying student reasoning in classroom settings.

major comments (2)
  1. [Abstract] The abstract asserts empirical results from the 33-dyad study demonstrating that Copa supports confidence without dependence, but supplies no data, statistical methods, error bars, baseline comparisons, or exclusion criteria. This makes it impossible to determine whether the data support the claims as stated regarding the prevention of over-reliance.
  2. [User Study Description] The claim that the EDF framework prevents over-reliance is load-bearing for the central contribution, yet the manuscript provides no specifics on how dependence was operationalized or measured (e.g., independent post-test accuracy, usage frequency without prompts, or comparison to a no-agent control), leaving the isolation of the framework's effect from potential confounds unaddressed.
minor comments (1)
  1. The abstract could benefit from a brief mention of the specific STEM+C topic or computational modeling task used in the study to provide context for the claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight opportunities to strengthen the clarity and transparency of our empirical claims. We address each major point below and indicate where revisions will be made.

Point-by-point responses
  1. Referee: [Abstract] The abstract asserts empirical results from the 33-dyad study demonstrating that Copa supports confidence without dependence, but supplies no data, statistical methods, error bars, baseline comparisons, or exclusion criteria. This makes it impossible to determine whether the data support the claims as stated regarding the prevention of over-reliance.

    Authors: We agree that the abstract, due to length constraints, presents high-level claims without the supporting statistical details. The full manuscript contains these elements in the Results section (including pre/post confidence measures, verbalization coding, dependence proxies via post-test accuracy and interaction logs, and baseline comparisons). To improve accessibility, we will revise the abstract to briefly note the study design (within-subjects dyad comparison with multimodal logging) and key statistical outcomes while preserving conciseness. revision: yes

  2. Referee: [User Study Description] The claim that the EDF framework prevents over-reliance is load-bearing for the central contribution, yet the manuscript provides no specifics on how dependence was operationalized or measured (e.g., independent post-test accuracy, usage frequency without prompts, or comparison to a no-agent control), leaving the isolation of the framework's effect from potential confounds unaddressed.

    Authors: The referee correctly identifies that the current manuscript text does not explicitly detail the operationalization of dependence in the User Study Description. Dependence was assessed via (1) independent post-test accuracy on modeling tasks without agent access, (2) frequency of agent queries versus self-initiated actions in logs, and (3) comparison against a no-agent control condition within the dyad design. We will expand this section with these metrics, exclusion criteria, and statistical methods to allow readers to evaluate isolation of the EDF effect from confounds. revision: yes
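Two of the dependence proxies named in the second response (post-test accuracy without agent access, and the frequency of agent queries versus self-initiated actions in logs) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the event labels `agent_query` and `self_action`, the per-dyad log layout, and the numbers, which are invented and are not results from the study. The sketch computes each dyad's share of agent queries and correlates it with post-test accuracy; it is not the authors' analysis pipeline.

```python
from collections import Counter
from math import sqrt

# Hypothetical event labels; the paper's actual log schema is not given here.
AGENT_QUERY = "agent_query"
SELF_ACTION = "self_action"


def agent_query_share(events):
    """Fraction of logged events that are requests to the agent."""
    counts = Counter(events)
    total = counts[AGENT_QUERY] + counts[SELF_ACTION]
    if total == 0:
        return 0.0
    return counts[AGENT_QUERY] / total


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(vx * vy)


# Made-up illustrative data, NOT results from the study.
logs = {
    "dyad_a": [SELF_ACTION, SELF_ACTION, AGENT_QUERY, SELF_ACTION],
    "dyad_b": [AGENT_QUERY, AGENT_QUERY, SELF_ACTION, AGENT_QUERY],
    "dyad_c": [SELF_ACTION, AGENT_QUERY, SELF_ACTION, AGENT_QUERY],
}
post_test = {"dyad_a": 0.9, "dyad_b": 0.5, "dyad_c": 0.7}

shares = {dyad: agent_query_share(events) for dyad, events in logs.items()}
r = pearson([shares[d] for d in logs], [post_test[d] for d in logs])
print(shares, round(r, 3))
```

Under this reading, a strongly negative correlation between query share and unaided post-test accuracy would be one warning sign of over-reliance, while a flat or positive one would be consistent with the paper's claim. Either way it is only a proxy: a correlation across dyads cannot by itself separate the EDF framework's effect from confounds such as prior knowledge.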

Circularity Check

0 steps flagged

No significant circularity; empirical study self-contained

Full rationale

The paper reports empirical outcomes from an n=33 dyad high-school study on the Copa agent. No equations, fitted parameters, predictions, or derivations are present. Claims rest on observed student behaviors and feedback interpretability rather than any reduction to self-defined inputs or self-citation chains. The theory grounding (Social Cognitive Theory, Social Constructivism, EDF framework) is presented as design rationale, not as a mathematical premise that loops back to the results. This is the normal case of a non-circular empirical report.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the premise that established learning theories, when encoded into the EDF framework, will produce the reported benefits; no free parameters, new entities, or ad-hoc axioms are introduced in the abstract.

axioms (1)
  • domain assumption: Social Cognitive Theory and Social Constructivism provide effective grounding for designing LLM interactions that promote sense-making rather than answer-seeking.
    Invoked in the abstract to justify the EDF framework and dialogic support.


