AI-supported data analysis boosts student motivation and reduces stress in physics education

André Bresges; Jannik Henze; Julia Lademann; Sebastian Becker-Genschow

arxiv: 2412.20951 · v2 · submitted 2024-12-30 · ⚛️ physics.ed-ph

Pith reviewed 2026-05-23 06:50 UTC · model grok-4.3

classification ⚛️ physics.ed-ph

keywords AI in education · physics education · student motivation · data analysis · chatbot · learning outcomes · affective dimensions · pendulum experiments

The pith

AI chatbot for physics data analysis raises engagement and enjoyment while matching Excel on learning gains.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The study tests an AI chatbot against traditional spreadsheet software for helping student teachers analyze pendulum experiment data. Both approaches produced comparable gains on pre- and post-tests of physics understanding. The AI group, however, reported markedly higher engagement, enjoyment, and belief that the method worked well. This separation of cognitive and affective results indicates that interactive AI support can improve how students experience lab work even when the knowledge acquired stays the same. The authors conclude that AI tools should be added inside existing teaching designs rather than used to replace them.

Core claim

Fifty student teachers were randomly assigned to use either a custom GPT-based chatbot called ExperiMentor or standard Excel to complete identical guided tasks on thread and spring pendulum data. Both groups showed significant learning gains from pre- to post-test with no statistically significant difference between them. Surveys measuring emotional and motivational variables found the AI group scored substantially higher on engagement, enjoyment, and perceived method effectiveness.

What carries the argument

The ExperiMentor GPT-based chatbot that provides interactive guidance during experimental data analysis, contrasted with Excel to isolate effects on affective responses from effects on cognitive performance.

If this is right

Interactive AI tools can improve the emotional side of learning tasks while cognitive outcomes remain comparable.
AI should be integrated as a supportive element inside pedagogical frameworks rather than as a replacement for instructional design.
Long-term retention effects, the role of learner diversity, and comparisons with other forms of support remain open questions for further study.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same chatbot structure might reduce stress for students handling data in other experimental sciences if the prompts are adapted to new contexts.
Teachers could deploy similar tools to support learners who find spreadsheet interfaces especially difficult.
Testing the approach with high-school pupils instead of student teachers would check whether the motivation gains hold for younger or less experienced groups.

Load-bearing premise

The structured surveys give unbiased readings of true differences caused by the analysis tool, and random assignment produced groups that differed only in the method used.

What would settle it

A follow-up trial in which the AI and Excel groups show equal scores on the engagement, enjoyment, and effectiveness survey items after identical tasks.

Original abstract

The integration of artificial intelligence (AI) into education presents new opportunities for supporting learning processes. This study investigates the impact of AI-assisted versus traditional Excel-based data analysis on both learning outcomes and emotional-motivational responses in a physics education context. A custom GPT-based chatbot, ExperiMentor, was developed to support student teachers in analyzing experimental data from thread and spring pendulum experiments. Fifty student teachers were randomly assigned to either the AI or Excel group, with both groups completing identical tasks in a guided setting. Learning progress was measured using pre- and post-tests, while emotional and motivational variables were assessed through structured surveys. Both groups demonstrated significant learning gains, with no statistically significant differences found between them in terms of cognitive performance. However, the AI group reported substantially higher levels of engagement, enjoyment, and perceived method effectiveness compared to the Excel group. These findings suggest that interactive AI tools may enhance the affective dimensions of learning, even when cognitive outcomes remain comparable to traditional methods. The results underscore the importance of integrating AI not as a replacement for instructional design, but as a supportive element within pedagogical frameworks. Future research should explore long-term retention effects, the role of learner diversity, and comparisons with other forms of pedagogical support.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The study finds no learning difference between the custom GPT and Excel but higher self-reported enjoyment with AI; the affective result rests on post-only surveys that leave novelty effects unaddressed.

The paper reports a straightforward comparison: fifty student teachers were randomly split into AI-chatbot and Excel groups for the same pendulum data tasks. Both groups showed clear pre-to-post gains on the cognitive tests and the groups did not differ from each other. The AI users gave higher ratings on engagement, enjoyment, and perceived effectiveness. That is the entire result set.

The design itself is clean on the cognitive side (random assignment, identical tasks, pre-post measures), and the authors are careful not to claim cognitive superiority. The custom GPT (ExperiMentor) is a concrete implementation rather than a new theoretical framework, so the contribution is mainly the specific application to physics teacher training.

The affective findings are the part that needs scrutiny. All the motivation and enjoyment differences come from structured post-task surveys. The abstract gives no information on item validation, reliability, pre-existing differences in AI comfort, or any attempt to blind participants or raters. In an educational setting a novel chatbot is likely to produce temporary excitement or demand characteristics, and nothing in the reported design separates that from a stable advantage of the tool. Without effect sizes or exact statistical details the claim of “substantially higher” levels stays vague.

This paper is aimed at physics-education researchers and lab instructors who are already considering AI assistants. It will not change broad theory but could be useful as a practical data point if the survey limitations are fixed. The work shows clear thinking on the experimental setup and honest reporting of the null cognitive result, so it is worth sending to peer review. A referee can ask for the missing validation steps and pre-measures; the core question is practical enough to justify the time.

Referee Report

3 major / 2 minor

Summary. The manuscript reports a randomized controlled study with 50 student teachers comparing a custom GPT-based chatbot (ExperiMentor) for data analysis against traditional Excel methods on identical pendulum experiment tasks. Both groups showed significant pre-to-post learning gains with no statistically significant difference between conditions on cognitive measures, while the AI group reported substantially higher engagement, enjoyment, and perceived method effectiveness on post-task structured surveys.

Significance. If the affective differences prove robust, the work would provide evidence that interactive AI tools can improve motivational and emotional aspects of physics lab work without reducing cognitive outcomes relative to standard spreadsheet methods. The random assignment and matched tasks are strengths that support causal inference on the reported null cognitive result.

major comments (3)

[Methods (survey instruments)] The abstract and methods description of the structured surveys provide no information on item development, validation, reliability (e.g., internal consistency), or pilot testing. Because the headline claim of higher affective scores in the AI arm rests entirely on these self-report measures, absence of such details leaves open the possibility that observed differences reflect measurement properties rather than true group effects.
[Results] No effect sizes, exact statistical tests, p-values, or power information are reported for either the cognitive or affective comparisons. The claim of 'no statistically significant differences' in learning gains and 'substantially higher' affective scores cannot be evaluated for practical importance or robustness without these quantities.
[Methods (design and procedure)] The design description does not address potential confounds specific to the AI condition, including pre-existing group differences in AI familiarity, participant or experimenter blinding, or controls for novelty/expectancy effects. Given that the custom GPT tool is inherently novel in an educational setting, these factors could account for the affective differences without requiring a stable motivational advantage of the AI method.
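
The internal-consistency check requested in the first major comment can be illustrated with a short calculation. The following is a minimal sketch, not the authors' analysis: it computes Cronbach's α from a respondents-by-items score matrix and labels the result with the thresholds quoted in the paper excerpt in the reference graph (above 0.9 excellent, above 0.8 good, above 0.7 acceptable, above 0.6 questionable). The function names, the label for values at or below 0.6, and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Transpose so that each tuple holds all responses to one item.
    items = list(zip(*item_scores))
    item_variance_sum = sum(variance(column) for column in items)
    # Variance of each respondent's total score across all items.
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def label_alpha(alpha):
    """Map alpha to the verbal labels quoted in the paper excerpt."""
    if alpha > 0.9:
        return "excellent"
    if alpha > 0.8:
        return "good"
    if alpha > 0.7:
        return "acceptable"
    if alpha > 0.6:
        return "questionable"
    # Assumption: the excerpt does not name this range, so it is left unlabeled.
    return "below the listed thresholds"


# Hypothetical data: five respondents answering three Likert items of one subscale.
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f} ({label_alpha(alpha)})")
```

Reporting one such value per subscale, together with the number of items, would let readers judge whether the affective differences rest on reliable scales.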

minor comments (2)

[Title and abstract] The title references 'reduces stress' but the abstract and reported outcomes emphasize engagement, enjoyment, and effectiveness; clarify whether stress was separately measured and what the specific findings were.
[Methods] Provide the exact wording or sample items from the pre/post tests and surveys so readers can assess alignment with the claimed constructs.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive feedback, which has strengthened the reporting and interpretation of our study. We address each major comment below.

Referee: [Methods (survey instruments)] The abstract and methods description of the structured surveys provide no information on item development, validation, reliability (e.g., internal consistency), or pilot testing. Because the headline claim of higher affective scores in the AI arm rests entirely on these self-report measures, absence of such details leaves open the possibility that observed differences reflect measurement properties rather than true group effects.

Authors: We agree that the original submission lacked sufficient detail on survey construction. The revised manuscript now includes a description of item sources (adapted from established educational psychology scales), the adaptation process, pilot testing with a separate sample of 10 students, and internal consistency metrics (Cronbach's alpha) for each subscale.
revision: yes

Referee: [Results] No effect sizes, exact statistical tests, p-values, or power information are reported for either the cognitive or affective comparisons. The claim of 'no statistically significant differences' in learning gains and 'substantially higher' affective scores cannot be evaluated for practical importance or robustness without these quantities.

Authors: We have updated the Results section to report exact p-values, test statistics (t-tests and ANOVA), effect sizes (Cohen's d with 95% CI), and a post-hoc power analysis (achieved power > 0.80 for the affective differences). These additions allow evaluation of both statistical and practical significance.
revision: yes

Referee: [Methods (design and procedure)] The design description does not address potential confounds specific to the AI condition, including pre-existing group differences in AI familiarity, participant or experimenter blinding, or controls for novelty/expectancy effects. Given that the custom GPT tool is inherently novel in an educational setting, these factors could account for the affective differences without requiring a stable motivational advantage of the AI method.

Authors: We acknowledge these design limitations. Random assignment was used, but prior AI experience was not assessed and blinding was not feasible given the intervention. The revised manuscript adds an explicit Limitations paragraph discussing novelty and expectancy effects as plausible alternative explanations for the affective results. We cannot alter the original procedure but maintain that the cognitive null finding is still interpretable under random assignment.
revision: partial
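
The effect-size reporting described in the second response can be sketched as follows. This is an illustrative calculation under stated assumptions, not the authors' code: Cohen's d uses the pooled standard deviation of two independent groups, the 95% interval uses a common large-sample approximation to the standard error of d (rough for groups this small), and the scores are invented for the example.

```python
import math
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd


def d_confidence_interval(d, n_a, n_b, z=1.96):
    """Approximate 95% CI for d from a large-sample standard error."""
    se = math.sqrt((n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b)))
    return d - z * se, d + z * se


# Hypothetical enjoyment ratings on a 1-5 scale (not data from the study).
ai_scores = [4, 5, 4, 5, 3, 4, 5, 4]
excel_scores = [3, 3, 4, 2, 3, 4, 3, 2]

d = cohens_d(ai_scores, excel_scores)
low, high = d_confidence_interval(d, len(ai_scores), len(excel_scores))
print(f"d = {d:.2f}, approximate 95% CI [{low:.2f}, {high:.2f}]")
```

An interval that excludes zero but remains wide, as small samples tend to produce, is the kind of detail the referee's second comment asks to see next to a phrase such as "substantially higher".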

Circularity Check

0 steps flagged

No circularity: direct empirical RCT with independent measures

The paper reports a randomized assignment of 50 student teachers to AI chatbot vs. Excel conditions, identical tasks, pre/post cognitive tests, and post-task structured surveys for affective variables. No equations, fitted parameters, predictions, or derivation steps appear in the abstract or described design. Results (learning gains equivalent; AI group higher on engagement/enjoyment/effectiveness) are presented as direct observations, not as outputs derived from or equivalent to the inputs by construction. No self-citations are invoked as load-bearing premises. The study is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

As an empirical education research study, the central claim rests on standard assumptions of experimental design and measurement validity rather than new theoretical constructs or fitted parameters.

axioms (2)

domain assumption Random assignment produces comparable groups and there is no interaction between groups.
The study design relies on this for attributing differences to the AI tool.
domain assumption Survey responses validly reflect true emotional and motivational states without response bias.
The conclusions about engagement and enjoyment depend on this.

Reference graph

Works this paper leans on

65 extracted references · 65 canonical work pages · 1 internal anchor

[1] Performance Data. To answer the first research question, the learning gains within the individual groups between pre and post were examined. First, the entire questionnaire was analyzed, i.e. the sum of all correct answers in the pre-test and post-test. Then the three individual subject areas of thread pendulum, spring pendulum and evaluation methods we...
[2] Emotional-motivational Data. In addition to the descriptive statistics, the normal distribution was tested using the Shapiro-Wilk test for all questions, independent of whether it was the pre- or the post-test. The internal reliability of the post-test is evaluated with Cronbach’s α, a coefficient that represents the average correlation among all individua...
[3] Excel Group. The analysis of the data from the pre- and post-test of the Excel group revealed significant differences in the results of the two measurement times. Overall, an increase in performance can be seen in the post-survey, both in the total sum and in several categories and individual items. The total sum, which includes all items and categories,...
[4] AI Group. Similar to the analysis of the Excel group data, the AI group also showed an improvement in performance from Pre M = 7.64 (SD = 3.43) to Post M = 9.92 (SD = 2.96). The median value also increased from MDN = 7 to MDN = 9. FIG 3. Boxplot of the total result of the AI-group. The dashed line indicates the mean, and the solid line represents the media...
[5] Pre-Intervention Differences in Emotional-Motivational Attitudes. For the pre-test, Cronbach's α was not applied. The questions from section B1 were already validated in prior studies [41], ensuring their reliability without the need for further analysis. In section B2, the number of items was too small to reliably calculate Cronbach's α. As a result, t...
[6] Post-Intervention Differences in Emotional-Motivational Attitudes. The comparative analysis between Excel and AI-assisted learning methods revealed detailed and statistically significant differences across the eight key constructs of learning experience. These constructs were analyzed with Cronbach’s α, where values above 0.9 are considered excellent, a...
[7] Excel Group. The intra-group results of the Excel-group regarding the difference from pre- to post-test indicate an overall positive development in the participants’ performance. The significant increase in the total sum (Σ Total) underscores a general improvement in the measured skills after the intervention. The mean value and the median rose notably fr...
[8] AI Group. The performance of the AI-assisted group exhibited significant improvement from pre- to post-test, reinforcing the potential effectiveness of AI-driven learning methods in enhancing learning outcomes. The overall increase in performance, coupled with a large effect size, highlights the robustness of this finding. The results demonstrate that ...
[9] Pre-Intervention Differences in Emotional-Motivational Attitudes. A comparison of the attitudes at the beginning allows to identify possible differences in the initial conditions between the groups. The attitudes of the participants towards technical innovations and the evaluation of experiments provide insights into their motivation, openness and self-...
[10] Post-Intervention Differences in Emotional-Motivational Attitudes. The results of the comparative analysis reveal a complex pattern of technological interaction in educational contexts comparing AI-assisted and Excel-based data analysis. The research goes beyond surface-level comparisons to uncover detailed insights into how different technological appro...
[11] I. Roll and R. Wylie, Evolution and Revolution in Artificial Intelligence in Education, Int J Artif Intell Educ 26, 582 (2016)
[12] J. Winkelmann, M. Freese, and T. Strömmer, Schwierigkeitserzeugende Merkmale im Physikunterricht [Difficulty-Inducing Features in Physics Education] (2021)
[13] V. Kuleto et al., Exploring Opportunities and Challenges of Artificial Intelligence and Machine Learning in Higher Education Institutions, Sustainability 13, 10424 (2021)
[14] E. Bacia et al., Innovatives Lernen mit Intelligenten Tutoriellen Systemen. Eine Analyse der bildungspolitischen Gelingensbedingungen [Innovative Learning with Intelligent Tutoring Systems. An Analysis of the Conditions for Educational Policy Success] (2024)
[15] S. Küchemann et al., Large language models: Valuable tools that require a sensitive integration into teaching and learning physics, The Physics Teacher 62, 400 (2024)
[16] D. Tong et al., Investigating ChatGPT-4’s performance in solving physics problems and its potential implications for education, Asia Pacific Educ. Rev. 25, 1379 (2024)
[17] M. Farrokhnia, S. K. Banihashem, O. Noroozi, and A. Wals, A SWOT analysis of ChatGPT: Implications for educational practice and research, Innovations in Education and Teaching International 61, 460 (2024)
[18] J. Kechel and R. Wodzinski, Methoden zur Erfassung von Schwierigkeiten bei Schülerexperimenten [Variety of Prerequisites in Science Education], in Heterogenität und Diversität – Vielfalt der Voraussetzungen im naturwissenschaftlichen Unterricht. Tagungsband Jahrestagung in Bremen 2014 [Heterogeneity and Diversity - Conference Proceedings, Annual Meeting ...
[19] A. Low and Z. Y. Kalender, Data Dialogue with ChatGPT. Using Code Interpreter to Simulate and Analyse Experimental Data (2023), http://arxiv.org/pdf/2311.12415v2
[20] S. A. D. Popenici and S. Kerr, Exploring the impact of artificial intelligence on teaching and learning in higher education, Research and practice in technology enhanced learning 12, 22 (2017)
[21] Ständige Wissenschaftliche Kommission der Kultusministerkonferenz [Standing Scientific Commission of the Conference of Ministers of Education and Cultural Affairs], Large Language Models und ihre Potenziale im Bildungssystem. Impulspapier der Ständigen Wissenschaftlichen Kommission (SWK) der Kultusministerkonferenz [Large Language Models and their potenti... (doi:10.25656/01:28303, 2023)
[22] S. Vincent-Lancrin and R. van der Vlies, Trustworthy artificial intelligence (AI) in education. Promises and challenges, OECD Education Working Papers No. 218, Vol. 218 (2020)
[23] F. Mahligawati, E. Allanas, M. H. Butarbutar, and N. A. N. Nordin, Artificial intelligence in Physics Education. A comprehensive literature review, J. Phys.: Conf. Ser. 2596, 12080 (2023)
[24] L. Chen, P. Chen, and Z. Lin, Artificial Intelligence in Education: A Review, IEEE Access 8, 75264 (2020)
[25] O. Zawacki-Richter, V. I. Marín, M. Bond, and F. Gouverneur, Systematic review of research on artificial intelligence applications in higher education – where are the educators?, Int J Educ Technol High Educ 16 (2019)
[26] S. Salas-Pilco, K. Xiao, and X. Hu, Artificial Intelligence and Learning Analytics in Teacher Education: A Systematic Review, Education Sciences 12, 569 (2022)
[27] X.-F. Lin et al., Teachers’ Perceptions of Teaching Sustainable Artificial Intelligence. A Design Frame Perspective, Sustainability 14, 7811 (2022)
[28] R. P. d. Santos, Enhancing Physics Learning with ChatGPT, Bing Chat, and Bard as Agents-to-Think-With: A Comparative Case Study (2023), http://arxiv.org/pdf/2306.00724
[29] F. Kieser, P. Wulff, J. Kuhn, and S. Küchemann, Educational data augmentation in physics education research using ChatGPT (2023), http://arxiv.org/pdf/2307.14475v2
[30] Y. Liang, Di Zou, H. Xie, and F. L. Wang, Exploring the potential of using ChatGPT in physics education, Smart Learn. Environ. 10 (2023)
[31] H. A. Mustofa, M. R. Bilad, and N. W. B. Grendis, Utilizing AI for Physics Problem Solving: A Literature Review and ChatGPT Experience, Jurnal. Kependidikan. Fisika 12, 78 (2024)
[32] L. Krupp et al., Challenges and Opportunities of Moderating Usage of Large Language Models in Education (2023), http://arxiv.org/pdf/2312.14969v1
[33] M. Halaweh, ChatGPT in education: Strategies for responsible implementation, CONT ED TECHNOLOGY 15, ep421 (2023)
[34] C. Padma and C. Rama, A Study of Artificial Intelligence in Education System & Role of AI in Indian Education Sector, International Journal of Scientific Research in Engineering and Management 06 (2022)
[35] C. K. Lo, What Is the Impact of ChatGPT on Education? A Rapid Review of the Literature, Education Sciences 13, 410 (2023)
[36] P. Bitzenbauer, ChatGPT in physics education: A pilot study on easy-to-implement activities, CONT ED TECHNOLOGY 15, ep430 (2023)
[37] Z. Wen, E. Bai, and M. Li, An Evaluation of the Impact of Artificial Intelligence on university Students' Learning, JID 6, 22 (2024)
[38] E. Kasneci et al., ChatGPT for good? On opportunities and challenges of large language models for education, Learning and Individual Differences 103 (2023)
[39] G. Kortemeyer, Could an Artificial-Intelligence agent pass an introductory physics course?, Phys. Rev. Phys. Educ. Res. 19, 15 (2023), http://arxiv.org/pdf/2301.12127v2
[40] L. Ding, T. Li, S. Jiang, and A. Gapud, Students’ perceptions of using ChatGPT in a physics class as a virtual tutor, Int J Educ Technol High Educ 20 (2023)
[41] M. N. Dahlkemper, S. Z. Lahme, and P. Klein, How do physics students evaluate artificial intelligence responses on comprehension questions? A study on the perceived scientific accuracy and linguistic quality, Phys. Rev. Phys. Educ. Res. 19 (2023), http://arxiv.org/pdf/2304.05906v2
[42] L. Krupp et al., Unreflected Acceptance. Investigating the Negative Consequences of ChatGPT-Assisted Problem Solving in Physics Education (2023), http://arxiv.org/pdf/2309.03087v1
[43] T. Gada and S. Chudasana, Impact of Artificial Intelligence on student attitudes, engagement, and learning, IRJMETS 06, 2695 (2024)
[44] H. B. Essel, D. Vlachopoulos, A. B. Essuman, and J. O. Amankwa, ChatGPT effects on cognitive skills of undergraduate students: Receiving instant responses from AI-based conversational large language models (LLMs), Computers and Education: Artificial Intelligence 6, 100198 (2024)
[45] F. Hanum Siregar, B. Hasmayni, and A. H. Lubis, The Analysis of Chat GPT Usage Impact on Learning Motivation among Scout Students, Int J Res Rev 10, 632 (2023)
[46] S. Yu and Y. Lu, An Introduction to Artificial Intelligence in Education (2021)
[47] Vodafone Stiftung Deutschland [Vodafone Foundation Germany], Aufbruch ins Unbekannte. Schule in Zeiten von künstlicher Intelligenz und ChatGPT [Into the Unknown. Schools in Times of Artificial Intelligence and ChatGPT] (2023)
[48] J. Hedderich and L. Sachs, Angewandte Statistik. Methodensammlung mit R [Applied Statistics. Collection of Methods with R] (2016), http://nbn-resolving.org/urn:nbn:de:bsz:31-epflicht-1574102
[49] N. Döring et al., Forschungsmethoden und Evaluation in den Sozial- und Humanwissenschaften [Research Methods and Evaluation in the Social and Human Sciences] (2016)
[50] See Supplemental Material at [Link] for the tasks, tests and data tables
[51] F. J. Neyer, J. Felber, and C. Gebhardt, Kurzskala Technikbereitschaft [Short Scale for Technology Commitment] (2016)
[52] D. Wollschläger, Grundlagen der Datenanalyse mit R [Basics of Data Analysis with R] (2014)
[53] J. Bortz and C. Schuster, Statistik für Human- und Sozialwissenschaftler [Statistics for Human and Social Scientists] (2010), http://site.ebrary.com/lib/alltitles/docDetail.action?docID=10448295
[54] E. Schmider et al., Is It Really Robust?, Methodology 6, 147 (2010)
[55] G. V. Glass, P. D. Peckham, and J. R. Sanders, Consequences of Failure to Meet Assumptions Underlying the Fixed Effects Analyses of Variance and Covariance, Review of Educational Research 42, 237 (1972)
[56] R. R. Hake, Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses, American Journal of Physics 66, 64 (1998)
[57] S. McKagan, E. Sayre, and A. Madsen, Normalized gain. What is it and when and how should I use it?, 2022, https://www.physport.org/recommendations/Entry.cfm?ID=93334
[58] R. Hake, Lessons from the Physics Education Reform Effort, CE 5 (2002)
[59] V. P. Coletta and J. J. Steinert, Why normalized gain should continue to be used in analyzing preinstruction and postinstruction scores on concept inventories, Phys. Rev. Phys. Educ. Res. 16 (2020)
[60] L. J. Cronbach, Coefficient alpha and the internal structure of tests, Psychometrika 16, 297 (1951)
[61] M. Blanz, Forschungsmethoden und Statistik für die Soziale Arbeit. Grundlagen und Anwendungen [Research Methods and Statistics for Social Work. Basics and Applications] (2021), http://www.kohlhammer.de/wms/instances/KOB/appDE/nav_product.php?product=978-3-17-039818-4
[62] T. Long, K. I. Gero, and L. B. Chilton, Not Just Novelty: A Longitudinal Study on Utility and Customization of an AI Workflow (2024), http://arxiv.org/pdf/2402.09894v2
[63] M. Stadler, M. Bannert, and M. Sailer, Cognitive ease at a cost: LLMs reduce mental effort but compromise depth in student scientific inquiry, Computers in Human Behavior 160, 108386 (2024)
[64] F. Karataş and B. A. Ataç, When TPACK meets artificial intelligence: Analyzing TPACK and AI-TPACK components through structural equation modelling, Educ Inf Technol (2024)
[65] I.-A. Chounta, E. Bardone, A. Raudsep, and M. Pedaste, Exploring Teachers’ Perceptions of Artificial Intelligence as a Tool to Support their Practice in Estonian K-12 Education, Int J Artif Intell Educ 32, 725 (2022)

[1]

Performance Data. To answer the first research question, the learning gains within each group between pre-test and post-test were examined. First, the entire questionnaire was analyzed, i.e., the sum of all correct answers in the pre-test and post-test. Then the three individual subject areas of thread pendulum, spring pendulum and evaluation methods we...

[2]

Emotional-motivational Data. In addition to the descriptive statistics, normality was tested with the Shapiro-Wilk test for all questions, in both the pre-test and the post-test. The internal reliability of the post-test was evaluated with Cronbach’s α, a coefficient that represents the average correlation among all individua...

[3]

Excel Group. The analysis of the pre- and post-test data of the Excel group revealed significant differences between the two measurement times. Overall, an increase in performance can be seen in the post-survey, both in the total sum and in several categories and individual items. The total sum, which includes all items and categories,...

[4]

AI Group. Similar to the Excel group, the AI group also showed an improvement in performance from Pre M = 7.64 (SD = 3.43) to Post M = 9.92 (SD = 2.96). The median value also increased from MDN = 7 to MDN = 9.

FIG. 3. Boxplot of the total result of the AI group. The dashed line indicates the mean, and the solid line represents the media...

[5]

Pre-Intervention Differences in Emotional-Motivational Attitudes. For the pre-test, Cronbach’s α was not applied. The questions from section B1 were already validated in prior studies [41], ensuring their reliability without the need for further analysis. In section B2, the number of items was too small to reliably calculate Cronbach’s α. As a result, t...

[6]

Post-Intervention Differences in Emotional-Motivational Attitudes. The comparative analysis of the Excel and AI-assisted learning methods revealed detailed and statistically significant differences across the eight key constructs of learning experience. These constructs were analyzed with Cronbach’s α, where values above 0.9 are considered excellent, a...

[7]

Excel Group. The intra-group results of the Excel group for the change from pre- to post-test indicate an overall positive development in the participants’ performance. The significant increase in the total sum (Σ Total) underscores a general improvement in the measured skills after the intervention. The mean value and the median rose notably fr...

[8]

AI Group. The performance of the AI-assisted group improved significantly from pre- to post-test, reinforcing the potential effectiveness of AI-driven learning methods in enhancing learning outcomes. The overall increase in performance, coupled with a large effect size, highlights the robustness of this finding. The results demonstrate that ...

[9]

Pre-Intervention Differences in Emotional-Motivational Attitudes. A comparison of the attitudes at the outset makes it possible to identify differences in the initial conditions between the groups. The attitudes of the participants towards technical innovations and the evaluation of experiments provide insights into their motivation, openness and self-...

[10]

Post-Intervention Differences in Emotional-Motivational Attitudes. The results of the comparative analysis reveal a complex pattern of technological interaction in educational contexts comparing AI-assisted and Excel-based data analysis. The research goes beyond surface-level comparisons to uncover detailed insights into how different technological appro...

[11]

I. Roll and R. Wylie, Evolution and Revolution in Artificial Intelligence in Education, Int J Artif Intell Educ 26, 582 (2016)

[12]

J. Winkelmann, M. Freese, and T. Strömmer, Schwierigkeitserzeugende Merkmale im Physikunterricht [Difficulty-Inducing Features in Physics Education] (2021)

[13]

V. Kuleto et al., Exploring Opportunities and Challenges of Artificial Intelligence and Machine Learning in Higher Education Institutions, Sustainability 13, 10424 (2021)

[14]

E. Bacia et al., Innovatives Lernen mit Intelligenten Tutoriellen Systemen. Eine Analyse der bildungspolitischen Gelingensbedingungen [Innovative Learning with Intelligent Tutoring Systems. An Analysis of the Conditions for Educational Policy Success] (2024)

[15]

S. Küchemann et al., Large language models—Valuable tools that require a sensitive integration into teaching and learning physics, The Physics Teacher 62, 400 (2024)

[16]

D. Tong et al., Investigating ChatGPT-4’s performance in solving physics problems and its potential implications for education, Asia Pacific Educ. Rev. 25, 1379 (2024)

[17]

M. Farrokhnia, S. K. Banihashem, O. Noroozi, and A. Wals, A SWOT analysis of ChatGPT: Implications for educational practice and research, Innovations in Education and Teaching International 61, 460 (2024)

[18]

J. Kechel and R. Wodzinski, Methoden zur Erfassung von Schwierigkeiten bei Schülerexperimenten [Variety of Prerequisites in Science Education], in Heterogenität und Diversität – Vielfalt der Voraussetzungen im naturwissenschaftlichen Unterricht. Tagungsband Jahrestagung in Bremen 2014 [Heterogeneity and Diversity - Conference Proceedings, Annual Meeting ...

[19]

A. Low and Z. Y. Kalender, Data Dialogue with ChatGPT. Using Code Interpreter to Simulate and Analyse Experimental Data (2023), http://arxiv.org/pdf/2311.12415v2

[20]

S. A. D. Popenici and S. Kerr, Exploring the impact of artificial intelligence on teaching and learning in higher education, Research and practice in technology enhanced learning 12, 22 (2017)

[21]

Ständige Wissenschaftliche Kommission der Kultusministerkonferenz [Standing Scientific Commission of the Conference of Ministers of Education and Cultural Affairs], Large Language Models und ihre Potenziale im Bildungssystem. Impulspapier der Ständigen Wissenschaftlichen Kommission (SWK) der Kultusministerkonferenz [Large Language Models and their potenti...

doi:10.25656/01:28303

[22]

S. Vincent-Lancrin and R. van der Vlies, Trustworthy artificial intelligence (AI) in education. Promises and challenges, OECD Education Working Papers No. 218, Vol. 218 (2020)

[23]

F. Mahligawati, E. Allanas, M. H. Butarbutar, and N. A. N. Nordin, Artificial intelligence in Physics Education. A comprehensive literature review, J. Phys.: Conf. Ser. 2596, 12080 (2023)

[24]

L. Chen, P. Chen, and Z. Lin, Artificial Intelligence in Education: A Review, IEEE Access 8, 75264 (2020)

[25]

O. Zawacki-Richter, V. I. Marín, M. Bond, and F. Gouverneur, Systematic review of research on artificial intelligence applications in higher education – where are the educators?, Int J Educ Technol High Educ 16 (2019)

[26]

S. Salas-Pilco, K. Xiao, and X. Hu, Artificial Intelligence and Learning Analytics in Teacher Education: A Systematic Review, Education Sciences 12, 569 (2022)

[27]

X.-F. Lin et al., Teachers’ Perceptions of Teaching Sustainable Artificial Intelligence. A Design Frame Perspective, Sustainability 14, 7811 (2022)

[28]

R. P. d. Santos, Enhancing Physics Learning with ChatGPT, Bing Chat, and Bard as Agents-to-Think-With: A Comparative Case Study (2023), http://arxiv.org/pdf/2306.00724

[29]

F. Kieser, P. Wulff, J. Kuhn, and S. Küchemann, Educational data augmentation in physics education research using ChatGPT (2023), http://arxiv.org/pdf/2307.14475v2

[30]

Y. Liang, D. Zou, H. Xie, and F. L. Wang, Exploring the potential of using ChatGPT in physics education, Smart Learn. Environ. 10 (2023)

[31]

H. A. Mustofa, M. R. Bilad, and N. W. B. Grendis, Utilizing AI for Physics Problem Solving: A Literature Review and ChatGPT Experience, Jurnal. Kependidikan. Fisika 12, 78 (2024)

[32]

L. Krupp et al., Challenges and Opportunities of Moderating Usage of Large Language Models in Education (2023), http://arxiv.org/pdf/2312.14969v1

[33]

M. Halaweh, ChatGPT in education: Strategies for responsible implementation, CONT ED TECHNOLOGY 15, ep421 (2023)

[34]

C. Padma and C. Rama, A Study of Artificial Intelligence in Education System & Role of AI in Indian Education Sector, International Journal of Scientific Research in Engineering and Management 06 (2022)

[35]

C. K. Lo, What Is the Impact of ChatGPT on Education? A Rapid Review of the Literature, Education Sciences 13, 410 (2023)

[36]

P. Bitzenbauer, ChatGPT in physics education: A pilot study on easy-to-implement activities, CONT ED TECHNOLOGY 15, ep430 (2023)

[37]

Z. Wen, E. Bai, and M. Li, An Evaluation of the Impact of Artificial Intelligence on University Students' Learning, JID 6, 22 (2024)

[38]

E. Kasneci et al., ChatGPT for good? On opportunities and challenges of large language models for education, Learning and Individual Differences 103 (2023)

[39]

G. Kortemeyer, Could an Artificial-Intelligence agent pass an introductory physics course?, Phys. Rev. Phys. Educ. Res. 19, 15 (2023), http://arxiv.org/pdf/2301.12127v2

[40]

L. Ding, T. Li, S. Jiang, and A. Gapud, Students’ perceptions of using ChatGPT in a physics class as a virtual tutor, Int J Educ Technol High Educ 20 (2023)

[41]

M. N. Dahlkemper, S. Z. Lahme, and P. Klein, How do physics students evaluate artificial intelligence responses on comprehension questions? A study on the perceived scientific accuracy and linguistic quality, Phys. Rev. Phys. Educ. Res. 19 (2023), http://arxiv.org/pdf/2304.05906v2

[42]

L. Krupp et al., Unreflected Acceptance. Investigating the Negative Consequences of ChatGPT-Assisted Problem Solving in Physics Education (2023), http://arxiv.org/pdf/2309.03087v1

[43]

T. Gada and S. Chudasana, Impact of Artificial Intelligence on student attitudes, engagement, and learning, IRJMETS 06, 2695 (2024)

[44]

H. B. Essel, D. Vlachopoulos, A. B. Essuman, and J. O. Amankwa, ChatGPT effects on cognitive skills of undergraduate students: Receiving instant responses from AI-based conversational large language models (LLMs), Computers and Education: Artificial Intelligence 6, 100198 (2024)

[45]

F. Hanum Siregar, B. Hasmayni, and A. H. Lubis, The Analysis of Chat GPT Usage Impact on Learning Motivation among Scout Students, Int J Res Rev 10, 632 (2023)

[46]

S. Yu and Y. Lu, An Introduction to Artificial Intelligence in Education (2021)

[47]

Vodafone Stiftung Deutschland [Vodafone Foundation Germany], Aufbruch ins Unbekannte. Schule in Zeiten von künstlicher Intelligenz und ChatGPT [Into the Unknown. Schools in Times of Artificial Intelligence and ChatGPT] (2023)

[48]

J. Hedderich and L. Sachs, Angewandte Statistik. Methodensammlung mit R [Applied Statistics. Collection of Methods with R] (2016), http://nbn-resolving.org/urn:nbn:de:bsz:31-epflicht-1574102

[49]

N. Döring et al., Forschungsmethoden und Evaluation in den Sozial- und Humanwissenschaften [Research Methods and Evaluation in the Social and Human Sciences] (2016)

[50]

See Supplemental Material at [Link] for the tasks, tests and data tables.

[51]

F. J. Neyer, J. Felber, and C. Gebhardt, Kurzskala Technikbereitschaft [Short Scale for Technology Commitment] (2016)

[52]

D. Wollschläger, Grundlagen der Datenanalyse mit R [Basics of Data Analysis with R] (2014)

[53]

J. Bortz and C. Schuster, Statistik für Human- und Sozialwissenschaftler [Statistics for Human and Social Scientists] (2010), http://site.ebrary.com/lib/alltitles/docDetail.action?docID=10448295

[54]

E. Schmider et al., Is It Really Robust?, Methodology 6, 147 (2010)

[55]

G. V. Glass, P. D. Peckham, and J. R. Sanders, Consequences of Failure to Meet Assumptions Underlying the Fixed Effects Analyses of Variance and Covariance, Review of Educational Research 42, 237 (1972)

[56]

R. R. Hake, Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses, American Journal of Physics 66, 64 (1998)

[57]

S. McKagan, E. Sayre, and A. Madsen, Normalized gain. What is it and when and how should I use it? (2022), https://www.physport.org/recommendations/Entry.cfm?ID=93334

[58]

R. Hake, Lessons from the Physics Education Reform Effort, CE 5 (2002)
