Students' Perception Accuracy of Partners' AI Use and its Relation to Collaboration Performance

Laura Graf; Oleksandra Poquet; Ramona Beinstingel; Stephan Krusche

arxiv: 2606.23237 · v1 · pith:EHCGJAJGnew · submitted 2026-06-22 · 💻 cs.HC · cs.CY


Pith reviewed 2026-06-26 07:26 UTC · model grok-4.3

classification: cs.HC, cs.CY

keywords: AI perception, collaborative programming, team performance, misalignment, student teams, software engineering education, perception accuracy


The pith

Greater misalignment in students' beliefs about partners' AI use early in projects links to lower final scores, especially in weaker teams.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates how accurately students perceive their partners' use of generative AI in collaborative programming assignments. It establishes that mismatched beliefs about each other's AI use at the project's start correlate with poorer team performance on the final project. This association is particularly pronounced among teams that had lower programming skills beforehand. The study also shows that such misalignments do not reliably resolve through scheduled in-person pair programming, pointing to a potential need for additional transparency measures in educational collaborations.

Core claim

In a three-wave longitudinal study of 103 student pairs in an introductory software engineering course, greater misalignment between partners' beliefs about each other's AI use early in the project was associated with lower final project scores. This association is strongest in teams with lower prior programming performance. The misalignment does not consistently decrease through face-to-face pair-programming sessions.

What carries the argument

A perception-misalignment measure, derived from differences in partners' self-reported beliefs about each other's AI use, serves as the predictor of team project scores.
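
To make the measure concrete, here is a minimal sketch of one way such a score could be computed. The paper's exact survey items and formula are not given in this summary, so everything below is an assumption: the 1-5 scale, the `Report` fields, and the choice to average the two partners' absolute perception errors are hypothetical and may differ from the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """One student's survey answers (hypothetical 1-5 Likert scale)."""
    own_use: int         # self-reported own AI use
    partner_belief: int  # belief about how much the partner uses AI


def perception_error(believer: Report, partner: Report) -> int:
    """Absolute gap between a belief and the partner's own self-report."""
    return abs(believer.partner_belief - partner.own_use)


def pair_misalignment(a: Report, b: Report) -> float:
    """Assumed pair-level score: mean of both partners' perception errors."""
    return (perception_error(a, b) + perception_error(b, a)) / 2


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    a = Report(own_use=4, partner_belief=2)
    b = Report(own_use=1, partner_belief=4)
    print(pair_misalignment(a, b))  # a is off by 1, b is off by 0 -> 0.5
```

A score of 0 would mean both partners' beliefs match each other's self-reports exactly; larger values mean a wider gap. Other aggregations (maximum error, signed differences) are equally plausible readings of the summary.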

If this is right

Misaligned perceptions may require targeted interventions for transparency in AI use.
Teams with lower prior performance suffer more from these misalignments.
In-person sessions alone may not align perceptions effectively.
Transparency about AI use could support better collaboration in programming education.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar perception issues might affect professional software development teams using AI tools.
Explicit disclosure mechanisms could be tested to reduce misalignment effects.
The cost of misalignment could extend to other invisible tool uses in collaborative work.

Load-bearing premise

The study assumes that self-reported beliefs about partners' AI use accurately capture true perceptions without substantial bias or inaccuracy.

What would settle it

A replication that uses objective logs of actual AI usage rather than self-reports would show whether the performance association holds.

Original abstract

Collaborative assignments are a cornerstone of programming education. Effective collaboration during a programming project depends on the formation of reasonably accurate beliefs about how each partner works. Generative AI tools, now widely used by undergraduate students, have introduced a consequential and largely invisible new dimension into collaboration: each student's use of AI. When partners collaborate remotely, they interpret partners' ability and effort through their code. This raises the question of how accurately students perceive each other's AI use in collaborations, and if a misalignment in these perceptions relates to team performance. To address this question, we conducted a three-wave longitudinal study of 103 student pairs in an introductory software engineering course. We found that greater misalignment between partners' beliefs about each other's AI use early in the project was associated with lower final project scores. The effect of such misaligned perceptions is the strongest in teams with lower prior programming performance, suggesting that low performing students pay a higher cost of misaligned perceptions. The perception misalignment does not consistently decrease through face-to-face pair-programming sessions. This suggests that ways to foster transparency may be needed to support student teams in collaborative programming.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives new longitudinal data linking early misalignment in student pairs' beliefs about each other's AI use to lower project scores, strongest in weaker teams, but the self-report measure has no validation against actual usage.


The key point is that greater early misalignment between partners on how much each is using AI predicted lower final project scores in these 103 pairs, with a bigger hit for teams that started with lower programming performance. The misalignment did not reliably drop after face-to-face sessions.

The work adds concrete numbers from a real course setting to the question of how AI tools change what students think their partners are doing. Prior collaboration studies have looked at perception accuracy in teams, but this one isolates the AI-use dimension and tracks it over three waves against actual grades. That is a step forward from attitude surveys alone.

The main limitation is the misalignment score itself. It rests on self-reported beliefs with no reported check against code traces, commit logs, or any objective indicator of AI use. If lower-performing students differ in how they interpret the questions or in social-desirability bias, the interaction with prior performance could be partly artifactual. The abstract also gives no effect sizes, controls, or missing-data handling, so the strength of the association is hard to judge from the summary.

This is for researchers focused on collaborative CS education and the effects of generative tools in teams. A reader wanting empirical patterns on perception gaps would find usable data here.

Send it for peer review. The question is timely and the sample is decent, but referees will need to see the exact items, validation steps, and robustness checks before the claim can be taken as solid.

Referee Report

2 major / 0 minor

Summary. The paper reports results from a three-wave longitudinal survey of 103 student pairs in an introductory software engineering course. It claims that greater misalignment between partners' beliefs about each other's generative AI use early in a collaborative project is associated with lower final project scores, with the association strongest among teams having lower prior programming performance. The study additionally finds that this misalignment does not consistently decrease across face-to-face pair-programming sessions.

Significance. If the associations prove robust after appropriate controls and if the misalignment measure can be shown to be valid, the findings would contribute to understanding how perceptions of AI tool use affect collaborative outcomes in programming education. The work could support development of transparency interventions, particularly benefiting lower-performing students, and would underscore limitations of pair-programming alone for aligning perceptions.

major comments (2)

[Methods] The misalignment score is constructed solely from self-reported beliefs on Likert-style or categorical survey items, with no reported validation or robustness check against objective indicators such as git commit histories, AI-tool usage logs, or blinded code review. Because the central claim rests on an association between this score and project performance (and its interaction with prior performance), the absence of external validation leaves open the possibility that reporting biases or differential interpretation systematically affect the key predictor.
[Results] The abstract and reported findings supply no details on the statistical models (e.g., regression specification, covariates for team communication quality or other confounds, effect sizes, missing-data handling, or correction for multiple tests). Without these elements, the evidential support for the reported associations and the moderation by prior performance cannot be evaluated.
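
For orientation, a moderation claim of this kind is conventionally written as a regression with an interaction term. The specification below is a generic sketch, not the model reported in the paper: the symbols M_i (early misalignment of pair i), P_i (prior programming performance), and c_i (a vector of covariates) are assumed names introduced here for illustration.

```latex
\begin{align}
\text{score}_i &= \beta_0 + \beta_1 M_i + \beta_2 P_i
                  + \beta_3 \,(M_i \times P_i)
                  + \boldsymbol{\gamma}^{\top} \mathbf{c}_i + \varepsilon_i, \\
\frac{\partial\, \text{score}_i}{\partial M_i} &= \beta_1 + \beta_3 P_i .
\end{align}
```

Under this sketch, the reported pattern would correspond to a negative slope of score on M_i at low P_i that shrinks toward zero as P_i rises, that is, \beta_3 > 0. The details requested above (covariates, effect sizes, missing-data handling, multiple-testing correction) are exactly what would be needed to check whether the data support that reading.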

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below and indicate revisions to the manuscript where feasible.


Referee: [Methods] The misalignment score is constructed solely from self-reported beliefs on Likert-style or categorical survey items, with no reported validation or robustness check against objective indicators such as git commit histories, AI-tool usage logs, or blinded code review. Because the central claim rests on an association between this score and project performance (and its interaction with prior performance), the absence of external validation leaves open the possibility that reporting biases or differential interpretation systematically affect the key predictor.

Authors: We agree that external validation against objective indicators would strengthen the measure. However, the study design did not collect git commit histories or AI-tool usage logs, and no blinded code reviews were conducted, as these were not part of the course data collection. Self-report is standard for perception studies, but we will add an explicit limitations section discussing potential reporting biases and differential interpretation. We will also report additional robustness checks using alternative misalignment computations (e.g., varying thresholds for 'misalignment').
revision: partial

Referee: [Results] The abstract and reported findings supply no details on the statistical models (e.g., regression specification, covariates for team communication quality or other confounds, effect sizes, missing-data handling, or correction for multiple tests). Without these elements, the evidential support for the reported associations and the moderation by prior performance cannot be evaluated.

Authors: We accept that the current manuscript lacks sufficient statistical detail. In the revision we will expand the Methods section with full regression specifications (including all covariates such as team communication quality where measured), effect sizes, missing-data handling procedures, and any multiple-testing corrections. The Results section will report these elements explicitly, and the abstract will be updated to note the modeling approach.
revision: yes

Circularity Check

0 steps flagged

No circularity: empirical associations from survey data


This is a standard observational study that collects three-wave survey responses on perceived AI use, computes a misalignment metric as the difference between partners' reports, and reports its statistical association with final project scores (stronger in low-prior-performance teams). The central claim is a measured correlation, not a derivation, prediction, or model that reduces to its own inputs by construction. No equations, fitted parameters renamed as predictions, or self-citation chains appear in the load-bearing steps. The analysis is self-contained against the collected data.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the validity of self-reported survey measures of perception and the assumption that standard statistical associations can be interpreted without major unmeasured confounding in this educational setting.

axioms (2)

domain assumption: Self-reported beliefs collected via survey accurately reflect students' internal perceptions of partners' AI use.
The misalignment variable is constructed from these reports; if reports are biased, the measured association loses meaning.
standard math: Standard assumptions of correlation/regression analysis (linearity, independence) hold for the collected data.
The reported association implies use of such tests.



