AI in the Wild: A Large Scale Analysis of Authentic Interactions of College Students with Generative AI

Ido Roll; Ofra Amir; Taelin Karidi

arxiv: 2606.29442 · v1 · pith:MSPG4Z4Enew · submitted 2026-06-28 · 💻 cs.CY

Pith reviewed 2026-06-30 01:55 UTC · model grok-4.3

classification 💻 cs.CY

keywords: generative AI, student-AI interactions, higher education, cognitive intent, interaction patterns, academic domains, large-scale analysis, authentic use

The pith

Student generative AI use in college courses follows a small number of recurring patterns with course-specific variations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines over 15,000 real student interactions with generative AI collected from multiple university courses. It characterizes each interaction by cognitive intent and context, such as whether requests target the task, the student's work, or prior AI responses. The analysis shows that these interactions cluster into a limited set of common patterns rather than varying widely. Systematic differences between courses point to distinct profiles tied to specific academic tasks. Understanding these patterns matters for designing AI tools and educational practices that align with how students actually work.

Core claim

Our analysis reveals that student-AI interaction is highly structured. Across courses, interactions concentrate in a small number of recurring patterns rather than exhibiting highly idiosyncratic use. At the same time, systematic differences emerge across courses, giving rise to distinct interaction profiles associated with different forms of academic work.

What carries the argument

Instruction-guided annotation of student turns along cognitive intent and interaction context dimensions, applied to a dataset of over 15,000 interaction units from voluntary AI use in coursework.

If this is right

Student-AI interactions are structured around recurring patterns across different courses.
Distinct interaction profiles emerge that correspond to different forms of academic work.
Large-scale authentic data provides a more representative view than prior controlled or small-scale studies.
Patterns can be used to understand engagement with GenAI in learning contexts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Tailored AI assistants could be developed based on common interaction patterns to better support students.
Educators might design assignments considering typical AI usage profiles for their course type.
The findings could inform guidelines for responsible AI integration in higher education.

Load-bearing premise

The instruction-guided annotation scheme accurately and consistently captures cognitive intent and interaction context across diverse courses without substantial coder bias or misclassification.

What would settle it

A replication study finding that interaction patterns are highly varied and idiosyncratic across courses, or that annotation categories do not reliably distinguish patterns, would challenge the claim of structured interactions.
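One way a replication could make "concentrated" and "course-specific" measurable is sketched below. The metrics are illustrative choices, not taken from the paper: top-k share and normalized Shannon entropy for concentration, and Jensen-Shannon divergence for the gap between two course profiles. All function names and toy numbers are hypothetical.

```python
import math

def top_k_share(dist, k):
    """Fraction of total mass held by the k largest cells of a distribution."""
    values = sorted(dist.values(), reverse=True)
    return sum(values[:k]) / sum(values)

def normalized_entropy(dist):
    """Shannon entropy divided by its maximum, log(number of cells).
    0 means all mass sits in one cell; 1 means a uniform spread."""
    if len(dist) < 2:
        return 0.0
    total = sum(dist.values())
    probs = [v / total for v in dist.values() if v > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(dist))

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    total_p, total_q = sum(p.values()), sum(q.values())
    result = 0.0
    for key in set(p) | set(q):
        a = p.get(key, 0.0) / total_p
        b = q.get(key, 0.0) / total_q
        m = (a + b) / 2
        if a > 0:
            result += 0.5 * a * math.log2(a / m)
        if b > 0:
            result += 0.5 * b * math.log2(b / m)
    return result

# Hypothetical course profiles over (context, intent) cells, in percent.
course_a = {("task", "understand"): 50, ("task", "apply"): 30,
            ("own_work", "evaluate"): 10, ("ai_output", "apply"): 10}
course_b = {("task", "understand"): 25, ("task", "apply"): 25,
            ("own_work", "evaluate"): 25, ("ai_output", "apply"): 25}

share_a = top_k_share(course_a, 2)
entropy_b = normalized_entropy(course_b)
gap = js_divergence(course_a, course_b)
```

Under this sketch, a high top-k share with low normalized entropy would support the "small number of recurring patterns" reading, while near-uniform distributions or near-zero divergence between courses would cut against the paper's two claims respectively.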

Figures

Figures reproduced from arXiv: 2606.29442 by Ido Roll, Ofra Amir, Taelin Karidi.

**Figure 1.** Overall interaction structure. Joint distribution of interaction context and Bloom-level cognitive intent aggregated across all courses. Values represent conversation-normalized percentages, such that each submitted chat log contributes equally, and sum to 100%.

**Figure 2.** Interaction structure across courses. Joint distribution of Bloom-level cognitive intent and interaction context across courses. Values are normalized by conversation, such that each conversation contributes equally regardless of length.
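The conversation normalization described in the captions can be sketched as follows. This is a minimal reading of the caption text, not the authors' code: each conversation spreads a weight of 1 across the (context, intent) cells of its turns, and the weights are then averaged over conversations. The function name, tuple layout, and label strings are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def conversation_normalized(turns):
    """turns: iterable of (conversation_id, context, intent) tuples.
    Returns {(context, intent): percentage}; every conversation carries
    the same total weight and the percentages sum to 100."""
    per_conversation = defaultdict(Counter)
    for conv_id, context, intent in turns:
        per_conversation[conv_id][(context, intent)] += 1

    weights = defaultdict(float)
    for cells in per_conversation.values():
        n_turns = sum(cells.values())
        for cell, count in cells.items():
            # Each conversation distributes a weight of 1 over its cells.
            weights[cell] += count / n_turns

    n_conversations = len(per_conversation)
    return {cell: 100.0 * w / n_conversations for cell, w in weights.items()}

# Hypothetical toy data: one four-turn conversation and one single-turn one.
turns = [
    ("c1", "task", "understand"),
    ("c1", "task", "understand"),
    ("c1", "task", "understand"),
    ("c1", "own_work", "evaluate"),
    ("c2", "ai_output", "apply"),
]
dist = conversation_normalized(turns)
```

In the toy data the single-turn conversation accounts for half of the total mass, which is the point of the normalization: a long chat log cannot dominate the distribution simply by containing more turns.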

Original abstract

Generative AI tools (GenAI) are increasingly used by students during coursework, yet empirical understanding of how students engage with these systems in authentic learning contexts remains limited. Existing studies have largely relied on controlled settings, single-domain analyses, or small-scale qualitative data, leaving open how student-AI interaction unfolds across courses and forms of academic work. We present a large-scale analysis of naturally occurring student-AI interactions collected from undergraduate students across multiple university courses and academic domains. The dataset comprises over 15,000 student-AI interaction units drawn from voluntary use of generative AI during real coursework. To characterize these interactions, we analyze each student turn along two complementary dimensions, cognitive intent and interaction context, capturing whether requests are directed toward the task or domain, the student's own work, or prior AI output. Using instruction-guided annotation applied at scale, we examine how these interaction patterns are distributed overall and how they vary across courses. Our analysis reveals that student-AI interaction is highly structured. Across courses, interactions concentrate in a small number of recurring patterns rather than exhibiting highly idiosyncratic use. At the same time, systematic differences emerge across courses, giving rise to distinct interaction profiles associated with different forms of academic work.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

New 15k+ dataset of real student-GenAI chats across courses is the main addition, but missing checks on annotation consistency leave the reported patterns open to question.

The paper's real contribution is the scale of the data: over 15,000 voluntary student-AI interaction units pulled from actual undergraduate courses in multiple domains. That moves past the small or controlled studies that dominate the area and gives a first look at how use varies by course type.

They code turns on two axes, cognitive intent and interaction context, then show that most activity clusters into a handful of recurring patterns while still differing systematically across courses. The dataset itself and the two-dimensional scheme are the concrete new pieces.

The main weakness is the labeling step. The abstract describes instruction-guided annotation applied at scale but gives no inter-annotator agreement numbers, no validation results, and no robustness checks. Without those, the claim that interactions are highly structured rather than idiosyncratic rests on untested coder consistency. Voluntary participation adds another possible bias that is not addressed.

This is straightforward descriptive work with no fitted models or derivations. The methods section will need to show that the coding scheme holds up across domains before the pattern distributions can be taken as reliable.

Researchers working on AI tools for education or on policy around student use would find the dataset description useful. It is worth sending to referees because the scale is new and the topic is timely, even though the current write-up leaves the central empirical claims under-supported.

Referee Report

1 major / 1 minor

Summary. The paper claims to present a large-scale analysis of over 15,000 authentic student-AI interaction units collected from voluntary undergraduate use across multiple courses and domains. Using instruction-guided annotation on two dimensions (cognitive intent and interaction context), it reports that interactions concentrate in a small number of recurring patterns rather than idiosyncratic use, while systematic differences across courses produce distinct interaction profiles tied to forms of academic work.

Significance. If the annotation scheme is shown to be reliable, the work provides one of the first large-scale empirical descriptions of real-world student-GenAI interactions in coursework, moving beyond controlled or small-scale studies. The scale of the dataset and focus on natural use across domains are clear strengths that could inform both educational practice and tool design.

major comments (1)

[Methods / annotation procedure] The central claims that interactions concentrate in recurring patterns and exhibit course-specific profiles (abstract and results sections) depend entirely on the instruction-guided annotation of cognitive intent and interaction context being accurate and consistent. The manuscript reports no inter-annotator agreement statistics, no validation results on a held-out set, and no robustness checks for coder bias or domain-specific misclassification, leaving open that the reported concentrations and differences could be annotation artifacts.

minor comments (1)

[Abstract] The abstract should briefly note the voluntary nature of participation to contextualize the scope of the findings.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment of the work's scale and potential contributions, as well as for the constructive feedback on annotation reliability. We address the major comment below and will revise the manuscript to incorporate the requested details.

Referee: [Methods / annotation procedure] The central claims that interactions concentrate in recurring patterns and exhibit course-specific profiles (abstract and results sections) depend entirely on the instruction-guided annotation of cognitive intent and interaction context being accurate and consistent. The manuscript reports no inter-annotator agreement statistics, no validation results on a held-out set, and no robustness checks for coder bias or domain-specific misclassification, leaving open that the reported concentrations and differences could be annotation artifacts.

Authors: We agree that the absence of inter-annotator agreement statistics, held-out validation, and robustness checks is a limitation in the submitted manuscript. The annotation procedure is described only at a high level, which leaves the reliability of the cognitive intent and interaction context labels open to question and could undermine the reported concentrations and course differences. In the revised version we will add a dedicated subsection to the Methods that reports: (1) the number of annotators and their training, (2) inter-annotator agreement computed on an overlapping sample (e.g., Cohen’s or Fleiss’ kappa together with raw agreement), (3) any held-out validation or adjudication process, and (4) domain-specific consistency checks. We will also add a brief limitations paragraph discussing residual coder bias. These additions will directly support the central claims and address the possibility of annotation artifacts.

Revision: yes
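For readers unfamiliar with the agreement statistic named in the rebuttal, here is a minimal sketch of Cohen's kappa for two annotators labeling the same turns, reported together with raw agreement. It is a textbook formulation, not the authors' procedure; the function name and the toy labels are hypothetical, and a setting with more than two annotators would call for Fleiss' kappa instead.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Return (raw_agreement, kappa) for two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return observed, 1.0
    return observed, (observed - expected) / (1.0 - expected)

# Hypothetical cognitive-intent labels from two annotators on four turns.
annotator_1 = ["apply", "apply", "create", "create"]
annotator_2 = ["apply", "apply", "create", "apply"]
raw, kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy case raw agreement is 0.75 while kappa is 0.5, which illustrates why the referee asks for a chance-corrected statistic rather than raw agreement alone.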

Circularity Check

0 steps flagged

No circularity: purely empirical descriptive analysis of interaction logs

The paper conducts a large-scale empirical study of over 15,000 student-AI interaction units collected from real coursework. It applies instruction-guided annotation to classify turns along cognitive intent and interaction context dimensions, then reports observed distributions and course-specific profiles. No equations, fitted parameters, predictions, derivations, or self-citations appear in the provided text. The central claims rest directly on the collected data and annotations rather than reducing to prior results by construction. This is self-contained empirical description with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The analysis depends on the assumption that voluntary submissions represent typical student behavior and that the annotation instructions produce reliable labels; no free parameters or invented entities are introduced.

axioms (1)

domain assumption: Instruction-guided annotation applied at scale produces consistent classifications of cognitive intent and interaction context.
Central to the method described in the abstract for characterizing the 15k interactions.

Reference graph

Works this paper leans on

21 extracted references · 3 canonical work pages

[1] Aleven, V., Stahl, E., Schworm, S., Fischer, F., Wallace, R.: Help-seeking and help design in interactive learning environments. Review of Educational Research 73(3), 277–320 (2003)

[2] Ammari, T., Chen, M., Zaman, S., Garimella, K.: How students (really) use chatgpt: Uncovering experiences among undergraduate students. arXiv preprint arXiv:2505.24126 (2025)

[3] Anderson, L.W., Krathwohl, D.R.: A taxonomy for learning, teaching, and assessing: A revision of bloom’s taxonomy of educational objectives. Longman, New York (2001)

[4] Bilal, D., He, J., Liu, J.: Guest editorial: Ai in education: transforming teaching and learning. Information and Learning Sciences 126(1/2), 1–7 (2025)

[5] Calderon, N., Reichart, R., Dror, R.: The alternative annotator test for LLM-as-a-judge: How to statistically justify replacing human annotators with LLMs. In: Che, W., Nabende, J., Shutova, E., Pilehvar, M.T. (eds.) Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 16051–16081. Associa... doi:10.18653/v1/2025.acl-long.782 (2025)

[6] Chi, M.T., Boucher, N.S.: Applying the icap framework to improve classroom learning. In their own words: What scholars and teachers want you to know about why and how to apply the science of learning in your academic setting pp. 94–110 (2023)

[7] Črček, N., Patekar, J.: Writing with ai: University students’ use of chatgpt. Journal of Language and Education 9(4 (36)), 128–138 (2023)

[8] Drive, S., Roll, I.: Towards a task-agnostic assessment of self-regulated learning in modeling activities. In: International Conference on Artificial Intelligence in Education. pp. 487–500. Springer (2025)

[9] Graesser, A.C.: Conversations with autotutor help students learn. International Journal of Artificial Intelligence in Education 26(1), 124–132 (2016)

[10] Haindl, P., Weinberger, G.: Students’ experiences of using chatgpt in an undergraduate programming course. IEEE Access 12, 43519–43529 (2024)

[11] Han, J., Yoo, H., Myung, J., Kim, M., Lee, T.Y., Ahn, S.Y., Oh, A.: RECIPE4U: Student-ChatGPT interaction dataset in EFL writing education. In: Calzolari, N., Kan, M.Y., Hoste, V., Lenci, A., Sakti, S., Xue, N. (eds.) Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). ... (2024)

[12] Han, J., Yoo, H., Myung, J., Kim, M., Lee, T.Y., Ahn, S.Y., Oh, A., Answer, A.N.: Exploring student-chatgpt dialogue in efl writing education. In: Thirty-seventh conference on neural information processing systems, neural information processing systems foundation (2023)

[13] Hao, Z., Qin, F., Jiang, J., Cao, J., Yu, J., Liu, Z., Zhang, Y.: Ai as learning partners: Students’ interactions and perceptions in a simulated classroom with multiple llm-powered agents. In: Proceedings of the 19th International Conference of the Learning Sciences-ICLS 2025, pp. 1789–1793. International Society of the Learning Sciences (2025)

[14] Higher Education Policy Institute, Kortext: Student generative ai survey (hepi & kortext report). Report (2024)

[15] Karabenick, S.A., Newman, R.S.: Help seeking in academic settings: Goals, groups, and contexts. In: The Handbook of Competence and Motivation. Guilford Press (2006)

[16] Klein-Avraham, I., Savir, R., Atias, O., Roll, I., Baram-Tsabari, A.: Measuring different types and domains of ai knowledge: Developing and validating a performance-based scale. Computers & Education p. 105573 (2026)

[17] Levine, S., Beck, S.W., Mah, C., Phalen, L., Pittman, J.: How do students use chatgpt as a writing support? Journal of Adolescent & Adult Literacy 68(5), 445–457 (2025)

[18] Roll, I., Barhak-Rabinowitz, M.: Measuring self-regulated learning using feedback and resources. Innovating assessments to measure and support complex skills pp. 159–171 (2023)

[19] Tan, Z., Li, D., Wang, S., Beigi, A., Jiang, B., Bhattacharjee, A., Karami, M., Li, J., Cheng, L., Liu, H.: Large language models for data annotation and synthesis: A survey. arXiv preprint arXiv:2402.13446 (2024)

[20] VanLehn, K.: The relative effectiveness of human tutoring, intelligent tutoring systems, and other tutoring systems. Educational Psychologist 46(4), 197–221 (2011)

[21] Wang, J., Fan, W.: The effect of chatgpt on students’ learning performance, learning perception, and higher-order thinking: insights from a meta-analysis. Humanities and Social Sciences Communications 12(1), 1–21 (2025)
