arxiv: 2604.22237 · v1 · submitted 2026-04-24 · 💻 cs.CL · cs.AI

Tell Me Why: Designing an Explainable LLM-based Dialogue System for Student Problem Behavior Diagnosis

Pith reviewed 2026-05-08 11:55 UTC · model grok-4.3

classification 💻 cs.CL cs.AI
keywords explainable AI · LLM dialogue systems · student behavior diagnosis · educational technology · teacher trust · hierarchical attribution · intervention planning

The pith

An LLM dialogue system that explains its student behavior diagnoses by surfacing the supporting dialogue evidence earned higher reported trust from pre-service teachers in a preliminary study.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Teachers must pull together many details to identify student problem behaviors and choose interventions, yet black-box LLM suggestions offer little insight into their logic. This paper builds a fine-tuned LLM dialogue system and adds a hierarchical attribution method drawn from explainable AI to locate the specific conversation evidence behind each recommendation and turn that evidence into plain-language explanations. Technical tests show the method finds supporting evidence more accurately than baseline approaches. A preliminary study with 22 pre-service teachers found that participants given these explanations reported higher trust in the system than those who received only the recommendations.

Core claim

The authors demonstrate that a fine-tuned LLM-based dialogue system for diagnosing student problem behaviors can be augmented with a hierarchical attribution method to identify relevant dialogue evidence for each recommendation and generate natural-language explanations from that evidence. This produces stronger performance on evidence identification tasks than baseline methods and leads to measurably higher trust ratings from pre-service teachers in a small user study.

What carries the argument

Hierarchical attribution method, which traces each system recommendation back to specific parts of the dialogue to produce evidence-based natural-language explanations.
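
To make the idea of tracing a recommendation back to parts of the dialogue concrete, here is a minimal sketch of one coarse-to-fine scheme. It is an illustration under stated assumptions, not the authors' method: the page does not describe how their hierarchy is built, so the two levels (turns, then sentences), the leave-one-out ablation, and the names `toy_score` and `hierarchical_attribution` are all hypothetical. The scoring function is a keyword counter standing in for whatever confidence signal a fine-tuned LLM would provide.

```python
from typing import Callable, List, Tuple

Dialogue = List[List[str]]  # a dialogue is a list of turns; a turn is a list of sentences


def toy_score(dialogue: Dialogue) -> float:
    """Hypothetical stand-in for the model's confidence in a recommendation.

    It only counts cue words; a real system would query the LLM instead.
    """
    cues = ("hits", "shouts", "refuses")
    text = " ".join(s for turn in dialogue for s in turn).lower()
    return float(sum(text.count(c) for c in cues))


def hierarchical_attribution(
    dialogue: Dialogue,
    score: Callable[[Dialogue], float],
    top_k_turns: int = 1,
) -> List[Tuple[int, int, float]]:
    """Return (turn_idx, sentence_idx, score_drop), largest drop first."""
    base = score(dialogue)

    # Level 1: remove each whole turn and record how far the score falls.
    turn_drops = []
    for t in range(len(dialogue)):
        ablated = dialogue[:t] + dialogue[t + 1:]
        turn_drops.append((base - score(ablated), t))
    turn_drops.sort(key=lambda pair: (-pair[0], pair[1]))
    selected = [t for _, t in turn_drops[:top_k_turns]]

    # Level 2: only inside the selected turns, remove one sentence at a time.
    results = []
    for t in selected:
        for s in range(len(dialogue[t])):
            reduced_turn = dialogue[t][:s] + dialogue[t][s + 1:]
            ablated = dialogue[:t] + [reduced_turn] + dialogue[t + 1:]
            results.append((t, s, base - score(ablated)))
    results.sort(key=lambda r: (-r[2], r[0], r[1]))
    return results


dialogue = [
    ["He is in third grade.", "He likes football."],
    ["He hits classmates during recess.", "It happens most days."],
    ["His parents recently moved."],
]
evidence = hierarchical_attribution(dialogue, toy_score)
```

In this toy dialogue the top-ranked entry points at the first sentence of the second turn, the only one containing a cue word. The appeal of a two-level search is cost: sentence-level ablations are run only inside the few turns that already mattered, which keeps the number of model calls manageable on long conversations.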

If this is right

  • The system can identify behavioral categories and suggest interventions while also showing teachers the exact dialogue turns that support those suggestions.
  • Teachers who receive the explanations report higher trust, which the authors link to greater potential use of the tool in practice.
  • The hierarchical method outperforms standard attribution baselines at recovering supporting evidence from multi-turn conversations.
  • Explanations are generated automatically from the model's outputs without requiring separate training for the explanation component.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar evidence-tracing techniques could be tested in other teacher-facing tools such as lesson planning or progress monitoring where transparency matters for adoption.
  • If the explanations prove faithful over time, they might serve as training examples that help new teachers internalize diagnostic patterns.
  • A follow-up study measuring whether higher trust actually changes the interventions teachers choose in live classroom scenarios would test the practical payoff.
  • The same hierarchical approach might address trust barriers when LLMs are used for other high-stakes professional decisions outside education.

Load-bearing premise

That the hierarchical attribution method produces faithful accounts of the LLM's actual reasoning rather than plausible but invented justifications, and that higher reported trust will translate into better real-world teaching decisions.

What would settle it

A study in which the attributions fail to match the model's internal attention patterns on the same inputs, or a larger trial showing no difference in teachers' actual diagnostic accuracy or intervention quality despite the added explanations.

Figures

Figures reproduced from arXiv: 2604.22237 by Deliang Wang, Penghe Chen, Yu Lu, Zhilin Fan.

Figure 1: Overview of the explainable diagnostic dialogue system.
Figure 2: System interface showing (1) the dialogue history, (2) recommended …
Original abstract

Diagnosing student problem behaviors requires teachers to synthesize multifaceted information, identify behavioral categories, and plan intervention strategies. Although fine-tuned large language models (LLMs) can support this process through multi-turn dialogue, they rarely explain why a strategy is recommended, limiting transparency and teachers' trust. To address this issue, we present an explainable dialogue system built on a fine-tuned LLM. The system uses a hierarchical attribution method based on explainable AI (xAI) to identify dialogue evidence for each recommendation and generate a natural-language explanation based on that evidence. In technical evaluation, the method outperformed baseline approaches in identifying supporting evidence. In a preliminary user study with 22 pre-service teachers, participants who received explanations reported higher trust in the system. These findings suggest a promising direction for improving LLM explainability in educational dialogue systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 0 minor

Summary. The manuscript presents an LLM-based multi-turn dialogue system for diagnosing student problem behaviors, augmented by a hierarchical attribution method drawn from xAI to extract dialogue evidence and generate natural-language explanations for each recommendation. It reports that the attribution method outperforms baseline approaches at identifying supporting evidence in technical evaluation, and that a preliminary user study with 22 pre-service teachers found higher self-reported trust when explanations were shown.

Significance. If the reported gains are shown to rest on faithful attributions and to translate into improved diagnostic accuracy or intervention quality, the work would offer a concrete, deployable approach to increasing transparency in educational dialogue systems. The combination of fine-tuned LLMs with hierarchical attribution is a timely application of xAI techniques to a domain where teacher trust is critical; the preliminary user-study evidence of trust gains is a useful starting point, though the absence of outcome measures limits immediate claims of practical impact.

major comments (3)
  1. [Abstract] The technical evaluation asserts outperformance in identifying supporting evidence, yet the abstract supplies no quantitative metrics, statistical tests, baseline implementation details, or sample sizes. These omissions make it impossible to evaluate whether the superiority claim is robust or merely suggestive.
  2. [User Study] The user study (n=22) measures only Likert-scale trust and does not assess whether participants who received explanations produced more accurate behavior diagnoses or higher-quality intervention plans relative to an expert gold standard. This gap directly weakens the inference that higher trust improves real-world teaching decisions.
  3. No standard faithfulness checks for the hierarchical attribution method—such as correlation of attribution scores with model internals (attention or gradients), ablation of evidence spans, or human ratings of explanation fidelity—are reported. Without these, it remains unclear whether the generated natural-language explanations reflect the LLM’s actual reasoning or constitute post-hoc rationalizations.
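
One of the checks named in comment 3, ablation of evidence spans, can be sketched in a few lines. The sketch below is a hedged illustration rather than anything reported in the paper: it assumes evidence is attributed at the level of dialogue turns, and `toy_score` and `deletion_gap` are hypothetical names, with the scorer again a keyword counter in place of a real model. The idea is that deleting the attributed turns should lower the model's confidence more than deleting the same number of randomly chosen turns.

```python
import random
from typing import Callable, List, Sequence


def toy_score(turns: Sequence[str]) -> float:
    """Hypothetical stand-in for the model's confidence in a recommendation."""
    cues = ("hits", "shouts", "refuses")
    text = " ".join(turns).lower()
    return float(sum(text.count(c) for c in cues))


def deletion_gap(
    turns: List[str],
    evidence_idx: List[int],
    score: Callable[[Sequence[str]], float],
    n_random: int = 200,
    seed: int = 0,
) -> float:
    """Score drop from deleting the attributed turns, minus the mean drop
    from deleting the same number of randomly chosen turns."""
    base = score(turns)
    attributed = set(evidence_idx)
    kept = [t for i, t in enumerate(turns) if i not in attributed]
    evidence_drop = base - score(kept)

    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    random_drops = []
    for _ in range(n_random):
        removed = set(rng.sample(range(len(turns)), len(attributed)))
        kept = [t for i, t in enumerate(turns) if i not in removed]
        random_drops.append(base - score(kept))
    return evidence_drop - sum(random_drops) / len(random_drops)


turns = [
    "He is in third grade.",
    "He hits classmates during recess.",
    "He shouts when corrected.",
    "His parents recently moved.",
]
gap_for_cue_turns = deletion_gap(turns, [1, 2], toy_score)
gap_for_other_turns = deletion_gap(turns, [0, 3], toy_score)
```

A clearly positive gap suggests the attributed turns matter to the model more than arbitrary ones; a gap near zero or below would be a warning sign that the explanation is a post-hoc rationalization. A check like this says nothing about whether the natural-language wording of the explanation is accurate, which would still need human ratings.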

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major point below, indicating revisions where appropriate to strengthen the manuscript while maintaining its focus on a preliminary study.

  1. Referee: [Abstract] The technical evaluation asserts outperformance in identifying supporting evidence, yet the abstract supplies no quantitative metrics, statistical tests, baseline implementation details, or sample sizes. These omissions make it impossible to evaluate whether the superiority claim is robust or merely suggestive.

    Authors: We agree that the abstract would benefit from greater specificity. The full manuscript reports quantitative results (including precision, recall, and F1 for evidence identification), baseline details, sample sizes, and statistical comparisons in the technical evaluation section. We will revise the abstract to incorporate key metrics, statistical test outcomes, and implementation details to better substantiate the outperformance claim. revision: yes

  2. Referee: [User Study] The user study (n=22) measures only Likert-scale trust and does not assess whether participants who received explanations produced more accurate behavior diagnoses or higher-quality intervention plans relative to an expert gold standard. This gap directly weakens the inference that higher trust improves real-world teaching decisions.

    Authors: We acknowledge this limitation. The study is explicitly framed as preliminary and measures self-reported trust as an initial indicator of explanation utility. It does not include outcome measures such as diagnostic accuracy or intervention quality against a gold standard. We will revise the discussion and limitations sections to clearly state this scope, avoid overclaiming practical impact, and outline plans for future studies that incorporate expert-rated outcomes. revision: partial

  3. Referee: [—] No standard faithfulness checks for the hierarchical attribution method—such as correlation of attribution scores with model internals (attention or gradients), ablation of evidence spans, or human ratings of explanation fidelity—are reported. Without these, it remains unclear whether the generated natural-language explanations reflect the LLM’s actual reasoning or constitute post-hoc rationalizations.

    Authors: The technical evaluation provides evidence of the attribution method's utility through superior performance on evidence identification tasks, which serves as a task-aligned proxy for faithfulness. However, we did not include correlations with internal model signals, ablation studies, or dedicated human fidelity ratings. We will add a dedicated subsection discussing the method's grounding in xAI techniques, report any available human ratings of explanation quality from the user study, and explicitly note the absence of certain internal checks as a limitation with suggestions for future validation. revision: partial
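
The first response mentions precision, recall, and F1 for evidence identification. As a reminder of what those numbers measure, here is a minimal sketch that scores a set of predicted evidence turns against a gold set. The set-of-turn-indices representation and the function name `evidence_prf` are assumptions made for illustration; the page does not say at what granularity the paper scores evidence.

```python
from typing import Set, Tuple


def evidence_prf(predicted: Set[int], gold: Set[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 of predicted evidence turns against gold turns."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: the method flags turns 1, 2, 5; annotators marked 1, 2, 3, 4.
p, r, f1 = evidence_prf({1, 2, 5}, {1, 2, 3, 4})
```

Here two of the three flagged turns are correct (precision 2/3) and two of the four gold turns are found (recall 1/2), giving an F1 of 4/7. Reporting all three matters because an attribution method can inflate recall simply by flagging most of the dialogue.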

Circularity Check

0 steps flagged

No circularity: empirical claims rest on external evaluations and user study

The paper describes construction of an LLM-based dialogue system augmented with a hierarchical attribution method for generating explanations. Its central claims (outperformance in identifying supporting evidence; higher trust in user study) are supported by technical comparisons against baselines and a preliminary study with 22 participants measuring self-reported trust. No equations, derivations, or first-principles arguments are present that reduce to self-definition, fitted inputs renamed as predictions, or load-bearing self-citations. The evaluations are independent measurements against external baselines and human subjects rather than tautological restatements of the system's design choices.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim depends on standard assumptions about LLM fine-tuning and the validity of attribution methods in xAI; no free parameters or new entities are introduced in the abstract.

axioms (2)
  • domain assumption Fine-tuned LLMs can synthesize multifaceted information to diagnose student problem behaviors through multi-turn dialogue
    Invoked as the foundation for the dialogue system
  • domain assumption Hierarchical attribution from xAI can reliably surface dialogue evidence that supports model recommendations
    Core premise enabling the generation of natural-language explanations

pith-pipeline@v0.9.0 · 5441 in / 1341 out tokens · 52310 ms · 2026-05-08T11:55:10.538416+00:00 · methodology
