AI in the Wild: A Large Scale Analysis of Authentic Interactions of College Students with Generative AI
Pith reviewed 2026-06-30 01:55 UTC · model grok-4.3
The pith
Student generative AI use in college courses follows a small number of recurring patterns with course-specific variations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Our analysis reveals that student-AI interaction is highly structured. Across courses, interactions concentrate in a small number of recurring patterns rather than exhibiting highly idiosyncratic use. At the same time, systematic differences emerge across courses, giving rise to distinct interaction profiles associated with different forms of academic work.
What carries the argument
Instruction-guided annotation of student turns along cognitive intent and interaction context dimensions, applied to a dataset of over 15,000 interaction units from voluntary AI use in coursework.
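As a rough illustration of the two-dimensional annotation (a sketch, not the authors' schema), each student turn can be held as a record with one label per dimension. The three interaction-context values paraphrase the abstract. The cognitive-intent labels used below ("explain", "verify"), the course names, and the names `AnnotatedTurn`, `InteractionContext`, and `pattern_counts` are hypothetical placeholders.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class InteractionContext(Enum):
    # Values paraphrase the abstract; the identifiers are assumptions.
    TASK_OR_DOMAIN = "task_or_domain"
    OWN_WORK = "own_work"
    PRIOR_AI_OUTPUT = "prior_ai_output"


@dataclass(frozen=True)
class AnnotatedTurn:
    course: str
    cognitive_intent: str  # hypothetical label; the paper's categories are not listed here
    context: InteractionContext


def pattern_counts(turns):
    """Count how often each (cognitive intent, interaction context) pair occurs."""
    return Counter((t.cognitive_intent, t.context) for t in turns)


# Toy data only, made up for illustration.
turns = [
    AnnotatedTurn("course_a", "explain", InteractionContext.TASK_OR_DOMAIN),
    AnnotatedTurn("course_a", "explain", InteractionContext.TASK_OR_DOMAIN),
    AnnotatedTurn("course_b", "verify", InteractionContext.OWN_WORK),
]
counts = pattern_counts(turns)
```

Treating a pattern as the pair of both labels is one plausible reading of "recurring patterns"; the paper may define patterns differently.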
If this is right
- Student-AI interactions are structured around recurring patterns across different courses.
- Distinct interaction profiles emerge that correspond to different forms of academic work.
- Large-scale authentic data provides a more representative view than prior controlled or small-scale studies.
- Patterns can be used to understand engagement with GenAI in learning contexts.
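One way to make "distinct interaction profiles" concrete is to compare per-course pattern distributions with a distance measure. The sketch below uses total variation distance, which is an assumption chosen for simplicity and not a method the paper is known to use. The function names, course names, and pattern labels are hypothetical.

```python
from collections import Counter


def profile(labels):
    """Turn a list of pattern labels into a relative-frequency distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


def total_variation(p, q):
    """Total variation distance: 0 for identical profiles, 1 for disjoint ones."""
    patterns = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in patterns)


# Toy data only; pattern names and course names are made up.
course_a = profile(["explain", "explain", "explain", "verify"])
course_b = profile(["explain", "verify", "verify", "verify"])
distance = total_variation(course_a, course_b)
```

A distance near 0 would suggest two courses share a profile, while larger values would point to course-specific use.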
Where Pith is reading between the lines
- Tailored AI assistants could be developed based on common interaction patterns to better support students.
- Educators might design assignments considering typical AI usage profiles for their course type.
- The findings could inform guidelines for responsible AI integration in higher education.
Load-bearing premise
The instruction-guided annotation scheme accurately and consistently captures cognitive intent and interaction context across diverse courses without substantial coder bias or misclassification.
What would settle it
A replication study finding that interaction patterns are highly varied and idiosyncratic across courses, or that annotation categories do not reliably distinguish patterns, would challenge the claim of structured interactions.
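A replication could quantify "concentrated" versus "idiosyncratic" use with simple summary statistics. The sketch below computes the share of interactions covered by the k most frequent patterns and a normalized Shannon entropy. Both measures are assumptions for illustration, not statistics reported in the paper, and the pattern names and counts are invented.

```python
import math
from collections import Counter


def top_k_share(counts, k):
    """Fraction of all interaction units covered by the k most frequent patterns."""
    total = sum(counts.values())
    return sum(n for _, n in counts.most_common(k)) / total


def normalized_entropy(counts):
    """Shannon entropy divided by its maximum; 1.0 means patterns are used uniformly."""
    total = sum(counts.values())
    probs = [n / total for n in counts.values() if n > 0]
    if len(probs) < 2:
        return 0.0  # a single pattern carries no uncertainty
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(probs))


# Toy counts only; pattern names p1..p4 are hypothetical.
counts = Counter({"p1": 70, "p2": 20, "p3": 5, "p4": 5})
share = top_k_share(counts, 2)
spread = normalized_entropy(counts)
```

A high top-k share together with a low normalized entropy would be consistent with the structured-use claim; values near uniform would count against it.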
Original abstract
Generative AI tools (GenAI) are increasingly used by students during coursework, yet empirical understanding of how students engage with these systems in authentic learning contexts remains limited. Existing studies have largely relied on controlled settings, single-domain analyses, or small-scale qualitative data, leaving open how student-AI interaction unfolds across courses and forms of academic work. We present a large-scale analysis of naturally occurring student-AI interactions collected from undergraduate students across multiple university courses and academic domains. The dataset comprises over 15,000 student-AI interaction units drawn from voluntary use of generative AI during real coursework. To characterize these interactions, we analyze each student turn along two complementary dimensions, cognitive intent and interaction context, capturing whether requests are directed toward the task or domain, the student's own work, or prior AI output. Using instruction-guided annotation applied at scale, we examine how these interaction patterns are distributed overall and how they vary across courses. Our analysis reveals that student-AI interaction is highly structured. Across courses, interactions concentrate in a small number of recurring patterns rather than exhibiting highly idiosyncratic use. At the same time, systematic differences emerge across courses, giving rise to distinct interaction profiles associated with different forms of academic work.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to present a large-scale analysis of over 15,000 authentic student-AI interaction units collected from voluntary undergraduate use across multiple courses and domains. Using instruction-guided annotation on two dimensions (cognitive intent and interaction context), it reports that interactions concentrate in a small number of recurring patterns rather than idiosyncratic use, while systematic differences across courses produce distinct interaction profiles tied to forms of academic work.
Significance. If the annotation scheme is shown to be reliable, the work provides one of the first large-scale empirical descriptions of real-world student-GenAI interactions in coursework, moving beyond controlled or small-scale studies. The scale of the dataset and focus on natural use across domains are clear strengths that could inform both educational practice and tool design.
major comments (1)
- [Methods / annotation procedure] The central claims that interactions concentrate in recurring patterns and exhibit course-specific profiles (abstract and results sections) depend entirely on the instruction-guided annotation of cognitive intent and interaction context being accurate and consistent. The manuscript reports no inter-annotator agreement statistics, no validation results on a held-out set, and no robustness checks for coder bias or domain-specific misclassification, leaving open that the reported concentrations and differences could be annotation artifacts.
minor comments (1)
- [Abstract] The abstract should briefly note the voluntary nature of participation to contextualize the scope of the findings.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of the work's scale and potential contributions, as well as for the constructive feedback on annotation reliability. We address the major comment below and will revise the manuscript to incorporate the requested details.
- Referee: [Methods / annotation procedure] The central claims that interactions concentrate in recurring patterns and exhibit course-specific profiles (abstract and results sections) depend entirely on the instruction-guided annotation of cognitive intent and interaction context being accurate and consistent. The manuscript reports no inter-annotator agreement statistics, no validation results on a held-out set, and no robustness checks for coder bias or domain-specific misclassification, leaving open that the reported concentrations and differences could be annotation artifacts.
Authors: We agree that the absence of inter-annotator agreement statistics, held-out validation, and robustness checks is a limitation in the submitted manuscript. The annotation procedure is described only at a high level, which leaves the reliability of the cognitive intent and interaction context labels open to question and could undermine the reported concentrations and course differences. In the revised version we will add a dedicated subsection to the Methods that reports: (1) the number of annotators and their training, (2) inter-annotator agreement computed on an overlapping sample (e.g., Cohen’s or Fleiss’ kappa together with raw agreement), (3) any held-out validation or adjudication process, and (4) domain-specific consistency checks. We will also add a brief limitations paragraph discussing residual coder bias. These additions will directly support the central claims and address the possibility of annotation artifacts.
Revision: yes
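For reference, Cohen's kappa for two annotators corrects raw agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the chance agreement implied by each annotator's label frequencies. The sketch below is a minimal implementation under that standard definition. The function name `cohens_kappa` and the toy labels are illustrative, and nothing here reflects the authors' actual validation procedure.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy labels only; "x" and "y" stand in for annotation categories.
annotator_1 = ["x", "x", "y", "y"]
annotator_2 = ["x", "x", "y", "x"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy case raw agreement is 0.75 but chance agreement is 0.5, so kappa is lower than raw agreement, which is why reporting both, as the response proposes, is informative.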
Circularity Check
No circularity: purely empirical descriptive analysis of interaction logs
The paper conducts a large-scale empirical study of over 15,000 student-AI interaction units collected from real coursework. It applies instruction-guided annotation to classify turns along cognitive intent and interaction context dimensions, then reports observed distributions and course-specific profiles. No equations, fitted parameters, predictions, derivations, or self-citations appear in the provided text. The central claims rest directly on the collected data and annotations rather than reducing to prior results by construction. This is self-contained empirical description with no load-bearing self-referential steps.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: instruction-guided annotation applied at scale produces consistent classifications of cognitive intent and interaction context.