arxiv: 2606.23237 · v1 · pith:EHCGJAJGnew · submitted 2026-06-22 · 💻 cs.HC · cs.CY

Students' Perception Accuracy of Partners' AI Use and its Relation to Collaboration Performance

Pith reviewed 2026-06-26 07:26 UTC · model grok-4.3

classification 💻 cs.HC cs.CY
keywords AI perception · collaborative programming · team performance · misalignment · student teams · software engineering education · perception accuracy

The pith

Greater misalignment in students' beliefs about partners' AI use early in a project is linked to lower final scores, especially in weaker teams.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates how accurately students perceive their partners' use of generative AI in collaborative programming assignments. It reports that mismatched beliefs about each other's AI use early in the project are associated with lower final project scores. The association is most pronounced among teams with lower prior programming performance. The study also finds that such misalignment does not reliably resolve through scheduled in-person pair programming, pointing to a potential need for additional transparency measures in educational collaborations.

Core claim

In a three-wave longitudinal study of 103 student pairs in an introductory software engineering course, greater misalignment between partners' beliefs about each other's AI use early in the project was associated with lower final project scores. The effect of such misaligned perceptions is strongest in teams with lower prior programming performance. The perception misalignment does not consistently decrease through face-to-face pair-programming sessions.

What carries the argument

A perception misalignment measure, derived from differences in self-reported beliefs about partners' AI use, serves as the predictor of team project scores.
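To make the idea of a difference-based misalignment score concrete, here is a minimal Python sketch. The paper's exact formula is not given on this page, so everything below is an assumption for illustration: the `PairSurvey` schema, the 1 to 5 response scale, and the choice to compare each student's belief with the partner's own self-report and then average the two absolute gaps.

```python
from dataclasses import dataclass


@dataclass
class PairSurvey:
    """One survey wave for a two-person team (hypothetical schema).

    All values are assumed to sit on a 1 (never) to 5 (always) scale.
    """
    self_a: int            # A's self-reported AI use
    self_b: int            # B's self-reported AI use
    belief_a_about_b: int  # what A believes about B's AI use
    belief_b_about_a: int  # what B believes about A's AI use


def perception_error(belief: int, partner_self_report: int) -> int:
    """Absolute gap between a belief and the partner's own report."""
    return abs(belief - partner_self_report)


def team_misalignment(p: PairSurvey) -> float:
    """Mean of the two partners' perception errors (assumed aggregation)."""
    err_a = perception_error(p.belief_a_about_b, p.self_b)
    err_b = perception_error(p.belief_b_about_a, p.self_a)
    return (err_a + err_b) / 2


# Toy teams, not data from the study.
aligned = PairSurvey(self_a=4, self_b=2, belief_a_about_b=2, belief_b_about_a=4)
misaligned = PairSurvey(self_a=5, self_b=1, belief_a_about_b=4, belief_b_about_a=2)

print(team_misalignment(aligned))     # 0.0
print(team_misalignment(misaligned))  # 3.0
```

Other operationalizations are equally plausible, for example the gap between the two partners' beliefs about each other, or a signed rather than absolute difference. The choice matters because each variant answers a slightly different question about who misjudges whom.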

If this is right

  • Misaligned perceptions may require targeted interventions for transparency in AI use.
  • Teams with lower prior performance suffer more from these misalignments.
  • In-person sessions alone may not align perceptions effectively.
  • Transparency about AI use could support better collaboration in programming education.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar perception issues might affect professional software development teams using AI tools.
  • Explicit disclosure mechanisms could be tested to reduce misalignment effects.
  • The cost of misalignment could extend to other invisible tool uses in collaborative work.

Load-bearing premise

The study assumes that self-reported beliefs about partners' AI use accurately capture true perceptions without substantial bias or inaccuracy.

What would settle it

A replication that uses objective logs of actual AI usage, rather than self-reports, would show whether the performance association holds.

Original abstract

Collaborative assignments are a cornerstone of programming education. Effective collaboration during a programming project depends on the formation of reasonably accurate beliefs about how each partner works. Generative AI tools, now widely used by undergraduate students, have introduced a consequential and largely invisible new dimension into collaboration: each student's use of AI. When partners collaborate remotely, they interpret partners' ability and effort through their code. This raises the question of how accurately students perceive each other's AI use in collaborations, and if a misalignment in these perceptions relates to team performance. To address this question, we conducted a three-wave longitudinal study of 103 student pairs in an introductory software engineering course. We found that greater misalignment between partners' beliefs about each other's AI use early in the project was associated with lower final project scores. The effect of such misaligned perceptions is the strongest in teams with lower prior programming performance, suggesting that low performing students pay a higher cost of misaligned perceptions. The perception misalignment does not consistently decrease through face-to-face pair-programming sessions. This suggests that ways to foster transparency may be needed to support student teams in collaborative programming.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper reports results from a three-wave longitudinal survey of 103 student pairs in an introductory software engineering course. It claims that greater misalignment between partners' beliefs about each other's generative AI use early in a collaborative project is associated with lower final project scores, with the association strongest among teams having lower prior programming performance. The study additionally finds that this misalignment does not consistently decrease across face-to-face pair-programming sessions.

Significance. If the associations prove robust after appropriate controls and if the misalignment measure can be shown to be valid, the findings would contribute to understanding how perceptions of AI tool use affect collaborative outcomes in programming education. The work could support development of transparency interventions, particularly benefiting lower-performing students, and would underscore limitations of pair-programming alone for aligning perceptions.

major comments (2)
  1. [Methods] Methods: The misalignment score is constructed solely from self-reported beliefs on Likert-style or categorical survey items with no reported validation or robustness check against objective indicators such as git commit histories, AI-tool usage logs, or blinded code review. Because the central claim rests on an association between this score and project performance (and its interaction with prior performance), the absence of external validation leaves open the possibility that reporting biases or differential interpretation systematically affect the key predictor.
  2. [Results] Results: The abstract and reported findings supply no details on the statistical models (e.g., regression specification, covariates for team communication quality or other confounds, effect sizes, missing-data handling, or correction for multiple tests). Without these elements the evidential support for the reported associations and the moderation by prior performance cannot be evaluated.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below and indicate revisions to the manuscript where feasible.

  1. Referee: [Methods] Methods: The misalignment score is constructed solely from self-reported beliefs on Likert-style or categorical survey items with no reported validation or robustness check against objective indicators such as git commit histories, AI-tool usage logs, or blinded code review. Because the central claim rests on an association between this score and project performance (and its interaction with prior performance), the absence of external validation leaves open the possibility that reporting biases or differential interpretation systematically affect the key predictor.

    Authors: We agree that external validation against objective indicators would strengthen the measure. However, the study did not collect git commit histories or AI-tool usage logs, and it did not include blinded code reviews, because these were not part of the course data collection. Self-report is standard for perception studies, but we will add an explicit limitations section discussing potential reporting biases and differential interpretation. We will also report additional robustness checks using alternative misalignment computations (e.g., varying thresholds for 'misalignment'). revision: partial

  2. Referee: [Results] Results: The abstract and reported findings supply no details on the statistical models (e.g., regression specification, covariates for team communication quality or other confounds, effect sizes, missing-data handling, or correction for multiple tests). Without these elements the evidential support for the reported associations and the moderation by prior performance cannot be evaluated.

    Authors: We accept that the current manuscript lacks sufficient statistical detail. In the revision we will expand the Methods section with full regression specifications (including all covariates such as team communication quality where measured), effect sizes, missing-data handling procedures, and any multiple-testing corrections. The Results section will report these elements explicitly, and the abstract will be updated to note the modeling approach. revision: yes
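For readers unfamiliar with what a moderation specification involves, the following Python sketch fits score = b0 + b1*misalignment + b2*prior + b3*(misalignment*prior) by ordinary least squares. This is an assumed textbook form, not the authors' reported model: the covariates, the coding of prior performance (-1 for low, +1 for high), and the helper names `solve` and `fit_moderation` are all hypothetical. The toy scores are generated without noise from chosen coefficients, so the fit only demonstrates that the code recovers them.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_moderation(mis, prior, score):
    """OLS via the normal equations; returns [b0, b1, b2, b3]."""
    X = [[1.0, m, p, m * p] for m, p in zip(mis, prior)]
    k = 4
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    Xty = [sum(row[i] * y for row, y in zip(X, score)) for i in range(k)]
    return solve(XtX, Xty)


# Toy design: misalignment 0..3 crossed with low (-1) and high (+1) prior.
mis = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
prior = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
# Noise-free scores from chosen coefficients (illustrative, not study data).
score = [70 - 4 * m + 5 * p + 3 * m * p for m, p in zip(mis, prior)]

coef = fit_moderation(mis, prior, score)
b0, b1, b2, b3 = coef
slope_low = b1 + b3 * -1.0   # misalignment slope for low-prior teams
slope_high = b1 + b3 * 1.0   # misalignment slope for high-prior teams
print([round(c, 4) for c in coef])
print(round(slope_low, 4), round(slope_high, 4))
```

The interaction coefficient b3 carries the moderation claim: a positive b3 alongside a negative b1 means misalignment costs more points when prior performance is low. A real analysis would also report standard errors, effect sizes, and missing-data handling, as the referee requests.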

Circularity Check

0 steps flagged

No circularity: empirical associations from survey data

This is a standard observational study that collects three-wave survey responses on perceived AI use, computes a misalignment metric as the difference between partners' reports, and reports its statistical association with final project scores (stronger in low-prior-performance teams). The central claim is a measured correlation, not a derivation, prediction, or model that reduces to its own inputs by construction. No equations, fitted parameters renamed as predictions, or self-citation chains appear in the load-bearing steps. The analysis is self-contained against the collected data.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the validity of self-reported survey measures of perception and the assumption that standard statistical associations can be interpreted without major unmeasured confounding in this educational setting.

axioms (2)
  • domain assumption: Self-reported beliefs collected via survey accurately reflect students' internal perceptions of partners' AI use.
    The misalignment variable is constructed from these reports; if reports are biased the measured association loses meaning.
  • standard math: Standard assumptions of correlation/regression analysis (linearity, independence) hold for the collected data.
    The reported association implies use of such tests.
