
arxiv: 2605.17468 · v2 · pith:EO55UQASnew · submitted 2026-05-17 · 💻 cs.HC · cs.AI

An Interpretable Closed-Loop Intelligent Tutoring System for Multimodal Affective Feedback in Asynchronous Presentation Training

Pith reviewed 2026-05-25 05:47 UTC · model grok-4.3

classification 💻 cs.HC cs.AI
keywords intelligent tutoring system · multimodal feedback · presentation training · behaviorally anchored rating scale · closed-loop ITS · XGBoost · asynchronous learning · affective computing

The pith

A closed-loop intelligent tutoring system using multimodal analysis and three-layer feedback produces significant gains on seven presentation skill dimensions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an interpretable ITS for on-camera oral presentation training that processes facial, vocal, textual, and oculomotor features through an XGBoost model to generate scores on a seven-dimensional Behaviorally Anchored Rating Scale. A three-layer architecture then converts those scores into audience-perceived expressive diagnostics and retrieval-augmented coaching advice to guide deliberate practice. The model was trained on 10,360 MOOC video segments and reached scoring performance comparable to expert raters. In a 30-day pre-post study of 204 adult learners, all seven dimensions showed statistically significant improvement, and higher practice frequency predicted better posttest results after baseline and demographic controls. The work illustrates a concrete path for turning multimodal analytics into observable behavioral change for performance competencies at scale.

Core claim

The system operationalizes a seven-dimensional BARS and implements a three-layer interpretable feedback architecture that connects rubric-aligned multimodal scoring, audience-perceived expressive diagnostics, and retrieval-augmented conversational coaching. Trained on 10,360 MOOC video segments, the XGBoost backbone achieves rubric-aligned scoring with R² of 0.48-0.61, Spearman's rho of 0.69-0.78, and MAE of 0.43-0.57. In the pre-post validation study with 204 adult learners over a 30-day window, participants showed significant improvements across all seven BARS dimensions (Cohen's d = 0.39-0.90), and practice frequency remained strongly associated with posttest performance after controlling for baseline scores and demographics.
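
The three agreement metrics quoted above answer different questions, which a small worked example may make concrete. The sketch below computes R², MAE, and Spearman's rho with the Python standard library only. The `expert` and `model` lists are made-up ratings for illustration, not data from the paper, and the helper names are assumptions of this sketch rather than anything the authors describe.

```python
from statistics import mean

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error, in rating-scale units."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expert ratings and model predictions for one BARS dimension.
expert = [3.0, 4.0, 2.5, 5.0, 3.5, 4.5]
model = [3.2, 3.8, 3.0, 4.6, 3.4, 4.1]

r2 = r_squared(expert, model)
err = mae(expert, model)
rho = spearman_rho(expert, model)
print(f"R^2={r2:.3f}  MAE={err:.3f}  rho={rho:.3f}")
```

In this toy data the model orders the six clips exactly as the expert does, so the rank correlation is at its maximum while R² and MAE still register the gap in absolute values. That is one reason to read rank agreement and error magnitude side by side.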

What carries the argument

The three-layer interpretable feedback architecture that maps rubric-aligned multimodal scores to audience-perceived expressive diagnostics and retrieval-augmented conversational coaching.
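
To make the data flow through the three layers easier to picture, here is a deliberately tiny sketch. It is not the authors' implementation: the dimension names, the threshold, the `COACHING_NOTES` corpus, and the word-overlap retrieval are all hypothetical stand-ins chosen for illustration. Layer 1 is represented only by a dictionary of scores that a trained regressor would produce.

```python
from dataclasses import dataclass

@dataclass
class Diagnostic:
    dimension: str
    score: float
    message: str

# Hypothetical coaching snippets standing in for a retrieval corpus.
COACHING_NOTES = [
    "eye contact: look at the camera lens when starting each key point",
    "vocal variety: vary pitch and pause before important phrases",
    "structure: state the takeaway first, then give supporting detail",
]

def diagnose(scores, threshold=3.0):
    """Layer 2 stand-in: flag dimensions scored below the threshold, weakest first."""
    low = sorted((s, d) for d, s in scores.items() if s < threshold)
    return [
        Diagnostic(d, s, f"{d} may read as weak to viewers (score {s:.1f})")
        for s, d in low
    ]

def retrieve(diagnostic, notes=COACHING_NOTES):
    """Layer 3 stand-in: pick the note sharing the most words with the dimension name."""
    words = set(diagnostic.dimension.lower().split())
    return max(
        notes,
        key=lambda n: len(words & set(n.lower().replace(":", " ").split())),
    )

# Layer 1 stand-in: hypothetical rubric scores for three made-up dimensions.
scores = {"eye contact": 2.4, "vocal variety": 3.6, "structure": 2.9}

feedback = [(d.message, retrieve(d)) for d in diagnose(scores)]
for message, note in feedback:
    print(message, "->", note)
```

The point of the sketch is traceability: every coaching note can be followed back to a flagged dimension and its score. A real system would replace the word-overlap lookup with an embedding-based retriever and a language model.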

If this is right

  • Participants demonstrated significant improvements across all seven BARS dimensions with Cohen's d ranging from 0.39 to 0.90.
  • Practice frequency showed a strong positive association with posttest performance after controlling for baseline scores and demographics.
  • The XGBoost model reached rubric-aligned scoring performance levels comparable to expert ratings on the held-out MOOC segments.
  • The integrated feedback architecture supports deliberate practice for performance-based competencies at scale.
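
For readers unfamiliar with the effect sizes in the first bullet, the sketch below shows two common ways to compute Cohen's d for pre-post data. The text here does not say which convention the paper uses, so both are shown as assumptions. The `pre` and `post` ratings are invented for illustration.

```python
from statistics import mean, stdev

def cohens_d_pooled(pre, post):
    """Mean gain divided by the pooled SD of pretest and posttest (equal n)."""
    pooled_sd = ((stdev(pre) ** 2 + stdev(post) ** 2) / 2) ** 0.5
    return (mean(post) - mean(pre)) / pooled_sd

def cohens_d_z(pre, post):
    """Mean gain divided by the SD of the per-learner gains."""
    gains = [b - a for a, b in zip(pre, post)]
    return mean(gains) / stdev(gains)

# Hypothetical ratings for six learners on one BARS dimension.
pre = [2.0, 3.0, 4.0, 3.0, 2.0, 4.0]
post = [3.0, 3.0, 5.0, 4.0, 2.0, 5.0]

d_pooled = cohens_d_pooled(pre, post)
d_z = cohens_d_z(pre, post)
print(f"d (pooled SD) = {d_pooled:.2f}, d_z (SD of gains) = {d_z:.2f}")
```

The two conventions can differ substantially on the same data, because the SD of gains shrinks when pretest and posttest are strongly correlated. A reported range such as 0.39 to 0.90 is therefore easier to interpret once the convention is known.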

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The traceable, rubric-linked feedback may increase learner acceptance compared with opaque scoring systems.
  • The same architecture could be tested on related performance skills such as job-interview responses or sales pitches.
  • Follow-up measurement after the 30-day window would show whether the gains transfer to live, unscripted presentations.

Load-bearing premise

The three-layer feedback architecture successfully converts rubric-aligned multimodal scores into audience-perceived expressive diagnostics that produce the observed behavioral improvements.

What would settle it

A randomized trial in which one group uses the full three-layer ITS while a matched group receives only recording practice without the diagnostic or coaching layers, then comparing pre-post changes on the seven BARS dimensions.
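
A minimal sketch of how such a comparison could be analyzed is to compute per-learner gain scores in each arm and compare them with Welch's t statistic. The numbers are hypothetical, and an actual trial would need a pre-registered analysis plan, far larger groups, and a treatment of all seven dimensions.

```python
from statistics import mean, variance

def gain_scores(pre, post):
    """Per-learner change from pretest to posttest."""
    return [b - a for a, b in zip(pre, post)]

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical ratings on one BARS dimension (not data from the paper).
its_pre = [2.0, 3.0, 2.5, 3.5, 3.0, 2.0]   # full three-layer ITS arm
its_post = [3.0, 3.5, 4.0, 4.5, 3.5, 3.5]
ctl_pre = [2.5, 3.0, 2.0, 3.5, 3.0, 2.5]   # recording-only practice arm
ctl_post = [3.0, 3.0, 2.5, 3.5, 3.5, 2.5]

its_gain = gain_scores(its_pre, its_post)
ctl_gain = gain_scores(ctl_pre, ctl_post)
t_stat, df = welch_t(its_gain, ctl_gain)
print(f"mean gain: ITS={mean(its_gain):.2f}, control={mean(ctl_gain):.2f}")
print(f"Welch t={t_stat:.2f}, df={df:.1f}")
```

Because both arms practice and are tested twice, repeated testing and task familiarity affect both gain distributions alike. The between-arm difference is what isolates the contribution of the diagnostic and coaching layers.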

Original abstract

This paper presents an interpretable closed-loop Intelligent Tutoring System (ITS) that supports feedback-guided practice for developing on-camera oral presentation skills at scale. The system operationalizes a seven-dimensional Behaviorally Anchored Rating Scale (BARS) and implements a three-layer interpretable feedback architecture that connects rubric-aligned multimodal scoring, audience-perceived expressive diagnostics, and retrieval-augmented conversational coaching to support deliberate practice. Built on an XGBoost backbone, the ITS maps multimodal inputs (facial, vocal, textual, and oculomotor features) into evidence-based feedback that can be traced back to observable performance cues. Trained on 10,360 Massive Open Online Course (MOOC) video segments, the system achieved rubric-aligned scoring with performance levels comparable to expert ratings (R2 = 0.48-0.61, Spearman's rho = 0.69-0.78, MAE = 0.43-0.57). In a pre-post validation study with 204 adult learners over a 30-day practice window, participants demonstrated significant improvements across all seven BARS dimensions (Cohen's d = 0.39-0.90), with practice frequency showing a strong positive association with posttest performance after controlling for baseline scores and demographics. The results demonstrate how multimodal analytic outputs can be systematically transformed into observable behavioral change through an integrated feedback architecture, advancing explainable and pedagogically grounded ITS design for performance-based competencies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript presents an interpretable closed-loop Intelligent Tutoring System (ITS) for asynchronous on-camera presentation training. It operationalizes a seven-dimensional Behaviorally Anchored Rating Scale (BARS) via a three-layer feedback architecture (rubric-aligned multimodal scoring with XGBoost, audience-perceived expressive diagnostics, and retrieval-augmented conversational coaching). The scoring model, trained on 10,360 MOOC video segments, maps facial, vocal, textual, and oculomotor features to BARS ratings with R² = 0.48-0.61, Spearman's rho = 0.69-0.78, and MAE = 0.43-0.57. A pre-post validation study with 204 adult learners over 30 days reports significant gains across all BARS dimensions (Cohen's d = 0.39-0.90) and a positive association between practice frequency and posttest performance after controlling for baselines and demographics.

Significance. If the causal attribution holds, the work supplies a concrete, traceable pipeline from multimodal analytics to behavioral change in a scalable ITS for performance skills. Credit is due for the held-out MOOC training set, the independent pre-post validation with reported effect sizes, and the explicit controls for baseline scores and demographics in the practice-frequency analysis.

major comments (1)
  1. [Validation study] Validation study (pre-post design with 204 learners): the reported BARS gains (d = 0.39-0.90) and the practice-frequency association are attributed to the three-layer feedback architecture, yet every participant receives the full closed-loop ITS. Without a control arm, the design cannot separate architecture-driven change from repeated testing, task familiarity, or self-selection into higher practice frequency, leaving the central claim that the architecture converts rubric-aligned scores into diagnostics that drive observable improvements untested.
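
The self-selection concern can be illustrated with a constructed example. In the sketch below, an unobserved trait (labelled `motivation`, a hypothetical variable) drives both how often a learner practices and how much they improve, while practice itself has no effect in the data-generating rule. An ordinary least-squares fit of posttest on baseline and practice frequency still returns a positive practice coefficient. All values and helper names are assumptions of this sketch, not the paper's data or model.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients via the normal equations (rows include a leading 1)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Constructed learners: motivation raises both practice count and posttest score.
rows, post = [], []
for baseline in (2.0, 3.0, 4.0):
    for motivation in (0, 1, 2, 3):
        practice = 5 + 3 * motivation             # sessions over the window
        rows.append([1.0, baseline, float(practice)])
        post.append(baseline + 0.4 * motivation)  # note: no practice term here

intercept, b_baseline, b_practice = ols(rows, post)
print(f"baseline coef={b_baseline:.3f}, practice coef={b_practice:.3f}")
```

Controlling for baseline does not remove the spurious coefficient, because the confounder is not among the covariates. This is the gap that randomizing the dose or adding a control arm would close.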

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback on the validation study design. We address the major comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Validation study] Validation study (pre-post design with 204 learners): the reported BARS gains (d = 0.39-0.90) and the practice-frequency association are attributed to the three-layer feedback architecture, yet every participant receives the full closed-loop ITS. Without a control arm, the design cannot separate architecture-driven change from repeated testing, task familiarity, or self-selection into higher practice frequency, leaving the central claim that the architecture converts rubric-aligned scores into diagnostics that drive observable improvements untested.

    Authors: We agree that the single-arm pre-post design limits causal attribution to the three-layer feedback architecture specifically. The reported gains and dose-response association (after controlling for baselines and demographics) demonstrate observable change in a real-world deployment, but cannot isolate effects from repeated testing, task familiarity, or self-selection. We will revise the manuscript to: (1) explicitly state this limitation in the Discussion, (2) temper language around causal claims to emphasize associations and improvements rather than architecture-driven causation, and (3) outline future randomized controlled trials as necessary next steps. This preserves the contribution of the held-out training data, effect sizes, and traceable pipeline while addressing the design gap.

    Revision: yes

Circularity Check

0 steps flagged

No circularity; scoring model and pre-post gains remain independent

Full rationale

The paper trains an XGBoost model on 10,360 MOOC segments and reports standard held-out metrics (R² = 0.48-0.61, rho = 0.69-0.78). It then presents a separate pre-post study (n = 204) measuring direct behavioral change via Cohen's d and regression on practice frequency. No equations, self-citations, or definitions reduce the reported skill gains to the fitted scoring parameters by construction. The three-layer feedback architecture is described but does not invoke load-bearing self-citations or rename known results as novel derivations. The central claims are therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that BARS dimensions validly represent audience perception and that the feedback architecture produces the measured gains; the XGBoost parameters are fitted to the training videos.

free parameters (1)
  • XGBoost model parameters
    Fitted to the 10,360 MOOC video segments to achieve the reported R² and MAE values.
axioms (1)
  • domain assumption: The seven-dimensional BARS accurately measures audience-perceived presentation quality
    The system operationalizes this scale as the target for multimodal scoring and feedback.

pith-pipeline@v0.9.0 · 5792 in / 1350 out tokens · 34353 ms · 2026-05-25T05:47:53.035569+00:00 · methodology

