Why teaching resists automation in an AI-inundated era: Human judgment, non-modular work, and the limits of delegation

Songhee Han

arxiv: 2604.07285 · v1 · submitted 2026-04-08 · 💻 cs.CL · cs.CY

Pith reviewed 2026-05-10 17:43 UTC · model grok-4.3

classification 💻 cs.CL cs.CY

keywords: teaching automation · AI in education · human judgment · limits of delegation · non-modular work · professional judgment · educational technology · AI and cognition

The pith

Teaching resists full automation because it depends on interpretive judgment and emergent understandings of human cognition that AI cannot exhaustively model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that teaching cannot be treated as a set of separable, procedural tasks ready for delegation to AI. While systems like large language models can handle bounded activities such as information retrieval, the relational and contextual elements of instruction derive their value from ongoing human interpretation that resists complete specification. This holds because learning involves unpredictable interactions of motivation, behavior, and social dynamics that emerge in real time rather than following fixed rules. A reader should care because the claim implies that efforts to replace teachers with technology will leave gaps in accountability and adaptability that matter for actual educational outcomes. The conclusion follows directly from viewing teaching as professional work grounded in unspecifiable aspects of human cognition.

Core claim

Teaching and learning are shaped by human cognition, behavior, motivation, and social interaction in ways that cannot be fully specified, predicted, or exhaustively modeled. Tasks that may appear separable in principle derive their instructional value in practice from ongoing contextual interpretation across learners, situations, and relationships. As long as educational practice relies on emergent understanding of human cognition and learning, teaching remains a form of professional work that resists automation.

What carries the argument

The non-modular character of instructional work, in which tasks gain meaning only through interpretive judgment tied to specific learner contexts and relationships.

If this is right

AI can improve access to information and support selected activities but leaves the core need for human judgment and relational accountability intact.
Instructional value in practice often comes from elements that cannot be isolated without losing their effectiveness.
Claims that teaching will be largely automated overstate the separability of its components.
Professional teaching will continue to require human oversight where learning involves emergent and context-specific factors.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar limits on automation may apply to other fields that center on real-time human interpretation, such as clinical care or legal advocacy.
Training programs for educators could shift emphasis toward cultivating judgment skills rather than assuming technology will handle core decisions.
Policy decisions about AI in schools should prioritize tools that augment rather than attempt to substitute for the relational side of instruction.
If future AI could simulate emergent cognition at scale, the paper's central distinction would require re-examination.

Load-bearing premise

The interpretive, relational, and judgment-based elements of teaching cannot be fully specified, predicted, or exhaustively modeled by AI systems.

What would settle it

A controlled, long-term trial in which an AI system handles every aspect of classroom instruction for multiple groups of students, including real-time adaptation to unexpected motivations and social dynamics, with no human teacher present and with measurable learning outcomes matching or exceeding those under human-led teaching.

Original abstract

Debates about artificial intelligence (AI) in education often portray teaching as a modular and procedural job that can increasingly be automated or delegated to technology. This brief communication paper argues that such claims depend on treating teaching as more separable than it is in practice. Drawing on recent literature and empirical studies of large language models and retrieval-augmented generation systems, I argue that although AI can support some bounded functions, instructional work remains difficult to automate in meaningful ways because it is inherently interpretive, relational, and grounded in professional judgment. More fundamentally, teaching and learning are shaped by human cognition, behavior, motivation, and social interaction in ways that cannot be fully specified, predicted, or exhaustively modeled. Tasks that may appear separable in principle derive their instructional value in practice from ongoing contextual interpretation across learners, situations, and relationships. As long as educational practice relies on emergent understanding of human cognition and learning, teaching remains a form of professional work that resists automation. AI may improve access to information and support selected instructional activities, but it does not remove the need for human judgment and relational accountability that effective teaching requires.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This position paper restates that teaching resists automation due to relational judgment and emergent human cognition but adds no new evidence or technical bounds.

The main takeaway is that this is a position paper arguing teaching resists full automation because it involves interpretive and relational judgment tied to human cognition that AI like LLMs cannot exhaustively model. It pulls from recent studies but offers no original data or formal results.

The paper does a good job of clearly stating why some tasks that seem delegable still require human oversight in practice. It connects this to the limits of current AI systems in handling context and emergence, which is a fair synthesis of familiar concerns in the field.

Where it is softer is on proving the limits. The claim depends on the idea that certain elements of teaching derive from emergent understanding that cannot be specified or predicted by AI, but there is no technical argument here showing why retrieval or transformer models hit a hard wall on this. It also skips detailed discussion of existing adaptive tools that approximate personalization. This leaves the conclusion resting more on assumption than demonstration, though the cited literature may back parts of it.

This kind of short piece is useful for readers in education policy or AI ethics who need a concise overview of why full delegation might not work. It won't shift technical researchers much but could inform discussions on teacher roles.

I would bring it to a reading group for discussion on the assumptions. It deserves peer review as a communication to get feedback on strengthening the engagement with counterexamples. I'd recommend accepting it for review.

Referee Report

2 major / 1 minor

Summary. The paper is a brief conceptual communication arguing that teaching resists meaningful automation by AI (including LLMs and RAG systems) because instructional work is inherently interpretive, relational, and dependent on professional judgment grounded in emergent human cognition, behavior, motivation, and social interaction. It claims that tasks appearing separable in principle derive their value from ongoing contextual interpretation that cannot be fully specified, predicted, or exhaustively modeled, so AI can support bounded functions but does not eliminate the need for human judgment and relational accountability.

Significance. If the central claim holds, the paper would contribute to policy and design discussions in AI-augmented education by cautioning against over-delegation. However, as a purely argumentative piece without new data, derivations, or technical bounds, its significance is modest and primarily rhetorical; it synthesizes existing literature on LLM limitations but does not advance falsifiable predictions or machine-checked results.

major comments (2)

[Abstract / central argument] The load-bearing premise (stated in the abstract and developed throughout) that human cognition and learning 'cannot be fully specified, predicted, or exhaustively modeled' by LLMs or RAG is asserted via reference to empirical studies but receives no technical characterization, information-theoretic bound, or architectural argument showing why transformer-based systems with retrieval cannot approximate the required judgment. This assumption directly supports the resistance-to-automation conclusion yet is not demonstrated within the manuscript.
[Abstract / central argument] The manuscript does not engage counterexamples such as existing adaptive tutoring systems that already perform relational personalization and contextual judgment at scale; without addressing these cases, the claim that instructional value 'derives from ongoing contextual interpretation' remains vulnerable to the objection that such interpretation is already being partially delegated.

minor comments (1)

[Abstract] The abstract and text refer to 'recent literature and empirical studies' without naming key citations in the provided summary; adding 2-3 specific references to LLM/RAG limitation studies would improve traceability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for these constructive comments, which help clarify the argumentative structure of our brief conceptual paper. We address each major point below, indicating where revisions will be made to strengthen engagement with the literature and potential objections.

Referee: The load-bearing premise (stated in the abstract and developed throughout) that human cognition and learning 'cannot be fully specified, predicted, or exhaustively modeled' by LLMs or RAG is asserted via reference to empirical studies but receives no technical characterization, information-theoretic bound, or architectural argument showing why transformer-based systems with retrieval cannot approximate the required judgment. This assumption directly supports the resistance-to-automation conclusion yet is not demonstrated within the manuscript.

Authors: We acknowledge that the manuscript provides no new technical characterization, information-theoretic bound, or architectural analysis of transformer limitations. As a short conceptual communication, it instead synthesizes and applies existing empirical findings from the LLM literature (on context windows, hallucination patterns, and the difficulty of modeling relational and motivational dynamics) to argue that full specification of teaching judgment is not feasible in practice. We will revise to more explicitly summarize the cited empirical studies and to state clearly that deriving formal bounds lies outside the paper's non-technical scope; the contribution remains the application of these limits to instructional work rather than a proof of impossibility.

revision: partial

Referee: The manuscript does not engage counterexamples such as existing adaptive tutoring systems that already perform relational personalization and contextual judgment at scale; without addressing these cases, the claim that instructional value 'derives from ongoing contextual interpretation' remains vulnerable to the objection that such interpretation is already being partially delegated.

Authors: We agree this engagement is needed. Adaptive tutoring systems achieve personalization within constrained domains but depend on human-specified knowledge models, pre-defined learning objectives, and ongoing teacher oversight for motivation, emotional context, and unanticipated social factors. We will add a concise paragraph in the discussion section that directly addresses these systems as examples of bounded delegation, showing how they still leave core interpretive and relational elements to human judgment and thereby supporting rather than undermining the central argument.

revision: yes

Circularity Check

0 steps flagged

No circularity: conceptual argument draws on external literature without self-referential reduction

The paper advances a philosophical claim that teaching resists automation due to its interpretive, relational, and judgment-based nature rooted in emergent human cognition that cannot be exhaustively modeled by current AI. This rests on citations to external empirical studies of LLMs and RAG systems rather than any internal derivation, equation, or self-definition. No fitted parameters, predictions by construction, or load-bearing self-citations appear; the argument is self-contained against external benchmarks and does not reduce its conclusion to its own premises by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on domain assumptions about the non-modular character of teaching and the inherent limits of current AI in modeling human cognition and social interaction; no free parameters or invented entities are introduced.

axioms (2)

domain assumption: Teaching derives its instructional value from ongoing contextual interpretation across learners, situations, and relationships that cannot be fully specified in advance.
This premise underpins the conclusion that automation is limited even for tasks that appear separable.

domain assumption: Human cognition, behavior, motivation, and social interaction in learning cannot be exhaustively modeled or predicted by AI systems.
Invoked to explain why relational and judgmental aspects of teaching resist delegation.

pith-pipeline@v0.9.0 · 5495 in / 1303 out tokens · 85317 ms · 2026-05-10T17:43:10.135182+00:00

