Adoption and Impact of Command-Line AI Coding Agents: A Study of Microsoft's Early 2026 Rollout of Claude Code and GitHub Copilot CLI

Alexandra Savelieva; Emerson Murphy-Hill; Jenna Butler

arxiv: 2607.01418 · v1 · pith:CJODPBICnew · submitted 2026-07-01 · 💻 cs.SE · cs.AI · cs.HC


Pith reviewed 2026-07-03 19:12 UTC · model grok-4.3


keywords: AI coding agents, command-line tools, technology adoption, productivity, pull requests, social networks, retention, Microsoft


The pith

Microsoft engineers who adopted command-line AI coding agents merged 24% more pull requests than similar non-adopters, with the gain holding over four months.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

A study of tens of thousands of engineers at Microsoft during the early 2026 rollout of Claude Code and GitHub Copilot CLI tracked who tried the tools, who kept using them, and how their output changed. First use spread mainly through social networks among peers. Retention was linked more to an engineer's prior coding activity than to demographics or role. Adopters produced roughly 24% more merged pull requests than matched non-adopters, and the difference held across the four-month study window, with merged pull requests serving as the output proxy. The pattern suggests that CLI coding agents spread unevenly and produce lasting changes rather than short-lived novelty effects.

Core claim

In a study of tens of thousands of Microsoft engineers during the early 2026 rollout of Claude Code and GitHub Copilot CLI, first use diffused primarily through social networks, retention correlated with prior coding activity, and adopters merged roughly 24% more pull requests than they otherwise would have, with the effect persisting over four months when using merged pull requests as the output proxy.

What carries the argument

Comparison of merged pull request counts between adopters and non-adopters after matching or regression controls to isolate the contribution of tool use.
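
As a rough illustration of what such a comparison reduces to once a matched comparison group exists, the sketch below computes a percent lift from two lists of merged-PR counts. It is not the paper's estimator: the function name `percent_lift`, the one-control-per-adopter pairing, and all counts are hypothetical and chosen only for illustration.

```python
from statistics import mean


def percent_lift(adopter_prs, matched_control_prs):
    """Percent difference in mean merged PRs, adopters vs. matched controls.

    Assumes (hypothetically) one matched control per adopter, observed
    over the same time window.
    """
    if len(adopter_prs) != len(matched_control_prs):
        raise ValueError("expected one matched control per adopter")
    control_mean = mean(matched_control_prs)
    if control_mean == 0:
        raise ValueError("control mean is zero; lift is undefined")
    return 100.0 * (mean(adopter_prs) - control_mean) / control_mean


# Made-up merged-PR counts, not data from the paper.
adopters = [30, 24, 36, 30]   # mean 30
controls = [25, 20, 30, 25]   # mean 25
lift = percent_lift(adopters, controls)
print(f"{lift:.1f}% more merged PRs")  # prints "20.0% more merged PRs"
```

A ratio of means like this only isolates the contribution of tool use if the matching has removed other differences between the groups, which is exactly the premise examined next.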

Load-bearing premise

Differences in merged pull request counts between adopters and non-adopters can be attributed to tool use rather than unobserved differences in engineer behavior or project characteristics.

What would settle it

A before-and-after comparison of the same engineers or a randomized rollout that shows no difference in merged pull request volume would indicate the reported lift is not caused by the agents.

Figures

Figures reproduced from arXiv: 2607.01418 by Alexandra Savelieva, Emerson Murphy-Hill, Jenna Butler.

**Figure 1.** Change in odds of initial use and retention of Copilot CLI by social exposure, versus an engineer with no coworkers who used Copilot CLI in the prior 14 days.

**Figure 2.** Change in odds of initial use and retention of Copilot CLI by prior IDE Copilot use, versus an engineer who did not use IDE Copilot during the pre-period.

**Figure 3.** Change in odds of initial use and retention of Copilot CLI by baseline pull-request activity, versus an engineer who created no pull requests during the pre-period.

**Figure 4.** Change in odds of initial use and retention of Copilot CLI by career stage, versus a mid-level individual contributor (IC4). Accompanying text: senior ICs were more likely to try it; IC5 and IC6 engineers had higher odds than a mid-level IC (about +22% for IC5), but their retention markers sat near zero. Managers looked no different from the reference; M4–M6 engineers showed no statistically distinguishable difference …

**Figure 5.** Change in odds of initial use and retention of Copilot CLI by tenure, versus an engineer who had been at Microsoft for 5–15 years.

**Figure 6.** Daily merged PRs per engineer for single-tool adopters, observed versus their synthetic …

**Figure 7.** Percent change in merged PRs per engineer-week by days of tool use that week, versus a …

**Figure 8.** Percent change in merged PRs per engineer-week for Claude Code …

**Figure 7.** Both evaluate the same point (three days of use per week), but because the curve in …

**Figure 9.** Percent PR lift at 3 days/week for individual contributors …

**Figure 10.** Percent PR lift at 3 days/week by tenure, versus an engineer who had been at Microsoft …
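
Figures 1 to 5 report a "change in odds" relative to a reference group. Assuming these come from logistic regressions (the captions do not say so explicitly), a coefficient on the log-odds scale converts to a percent change in odds as 100 · (exp(β) − 1). The sketch below shows that conversion and its inverse; the function names are hypothetical, and the 22% input is the figure quoted for IC5 in the Figure 4 text.

```python
import math


def pct_change_in_odds(beta):
    """Convert a log-odds coefficient into a percent change in odds."""
    return 100.0 * (math.exp(beta) - 1.0)


def log_odds_from_pct(pct):
    """Inverse: the log-odds coefficient implied by a percent change in odds."""
    return math.log(1.0 + pct / 100.0)


# +22% odds (the IC5 value quoted in the Figure 4 text) as a coefficient.
beta_ic5 = log_odds_from_pct(22.0)
print(round(beta_ic5, 3))                      # prints 0.199
print(round(pct_change_in_odds(beta_ic5), 1))  # prints 22.0
```

A change in odds is not a change in probability: +22% odds moves a 10% baseline probability to roughly 11.9%, not to 12.2%.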

Original abstract

Organizations rolling out agentic command line tools like Anthropic's Claude Code and GitHub's Copilot CLI need to know who will try them, who will keep using them, and whether the tools produce enough output to justify their cost. At organizational scale, token spend can run into millions of dollars annually, so misreading adoption, retention, or impact can make a rollout expensive without changing engineering velocity. Studying tens of thousands of engineers at Microsoft over its early-2026 rollout, we find that first use spread primarily through social networks, retention was associated more with engineers' coding activity than with demographics, and adopters merged roughly 24% more pull requests than they would have otherwise. We use merged pull requests as our proxy for output -- acknowledging that a merged PR is not the same as the value it delivers -- and the lift persists across our four-month window. These results suggest that CLI coding agents are neither uniformly adopted nor mere novelty effects and that organizations should treat visible peer use as central to rollout strategy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives rare large-scale telemetry on the rollout of CLI AI agents at Microsoft, but the 24% PR lift claim rests on an identification strategy the abstract does not describe.


The headline result here is a quantified 24% increase in merged PRs among adopters of Claude Code and Copilot CLI during Microsoft's early 2026 rollout, alongside findings that adoption spread through social networks and retention tracked coding activity more than demographics. That combination of scale and specific numbers on two particular tools is what stands out.

The work draws on internal data from tens of thousands of engineers and tracks outcomes over four months, a scale and duration that most academic studies cannot match. It also flags that merged PRs are only a proxy and not a direct measure of value delivered. Those choices keep the claims grounded in what the data can actually show.

The soft spot is the causal step. The abstract presents the 24% figure as the lift adopters would not have had otherwise, yet gives no information on how adopters were matched to non-adopters, what controls went into any regression, or whether pre-trends were checked. Without those details the difference could reflect who decided to try the tools rather than the tools themselves. The stress-test note correctly flags this as the load-bearing assumption.

This is observational work on a timely industry question, not a derivation or closed-form model, so there is no circularity issue. The citation pattern is not visible from the abstract alone.

The paper is aimed at researchers and practitioners who need data on how agentic CLI tools actually diffuse and whether they move output metrics inside a large firm. A reader working on AI tooling adoption or internal productivity measurement would get concrete numbers to compare against. It deserves peer review because the sample size and setting are uncommon; a referee can check whether the identification holds once the methods section is available. I would bring it to a reading group for the adoption patterns even if the impact estimate needs more scrutiny.

Referee Report

2 major / 1 minor

Summary. The paper studies the early-2026 rollout of command-line AI coding agents (Claude Code and GitHub Copilot CLI) at Microsoft using data on tens of thousands of engineers. It reports that adoption spreads primarily via social networks, retention correlates more with coding activity than demographics, and adopters merged roughly 24% more pull requests than they would have otherwise, with the effect persisting over a four-month window. Merged PR count is used as an output proxy while acknowledging its limitations.

Significance. If the 24% causal lift in merged PRs holds after proper identification, the results would inform organizational strategies for scaling agentic coding tools by highlighting peer-driven adoption and the role of baseline coding activity in retention. The large internal sample and explicit proxy caveat are strengths for an empirical software engineering study.

major comments (2)

[Abstract] Abstract: the headline causal claim that adopters merged 24% more PRs 'than they would have otherwise' is presented without any description of the sample construction, matching procedure, regression specification, or robustness checks. This identification strategy is load-bearing for the central impact result and cannot be evaluated from the provided text.
[Abstract and Results] The manuscript notes that adoption spread via social networks and retention tied to coding activity, yet supplies no detail on how these or other observables (e.g., pre-adoption trends, project characteristics) enter the matching or regression controls used to isolate the treatment effect.

minor comments (1)

[Abstract] The abstract could more explicitly quantify the sample size and time window in the opening sentence for immediate context.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for highlighting the need for greater transparency in the abstract regarding our identification strategy. We agree that the central causal claim requires sufficient detail for evaluation and will revise the abstract and results sections accordingly. Point-by-point responses follow.


Referee: [Abstract] Abstract: the headline causal claim that adopters merged 24% more PRs 'than they would have otherwise' is presented without any description of the sample construction, matching procedure, regression specification, or robustness checks. This identification strategy is load-bearing for the central impact result and cannot be evaluated from the provided text.

Authors: We agree the abstract omits these details. The full manuscript uses propensity-score matching on pre-adoption merged PRs, coding activity, tenure, team size, and project characteristics, followed by a difference-in-differences regression with engineer and time fixed effects plus robustness checks (alternative calipers, placebo tests on non-adopters). We will revise the abstract to concisely summarize the sample (tens of thousands of engineers), matching procedure, regression specification, and key robustness results.
Revision: yes

Referee: [Abstract and Results] The manuscript notes that adoption spread via social networks and retention tied to coding activity, yet supplies no detail on how these or other observables (e.g., pre-adoption trends, project characteristics) enter the matching or regression controls used to isolate the treatment effect.

Authors: Social-network diffusion and activity-based retention are analyzed descriptively via network graphs and logistic regressions on usage frequency. For the impact estimates, pre-adoption trends, project characteristics, and the listed observables are used both as matching covariates and as controls in the regression. We will add explicit language in the abstract and results clarifying their role in the identification strategy.
Revision: yes
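
The simulated rebuttal mentions a difference-in-differences design. As a minimal sketch of that idea only, the example below computes the simplest two-group, two-period version on group means. It omits the engineer and time fixed effects and the matching step the rebuttal describes, and every name and number in it is hypothetical rather than taken from the paper.

```python
from statistics import mean


def did_estimate(adopter_pre, adopter_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on group means.

    Returns (post - pre) for adopters minus (post - pre) for controls.
    The control group's change stands in for what adopters would have
    done without the tool (the parallel-trends assumption).
    """
    adopter_change = mean(adopter_post) - mean(adopter_pre)
    control_change = mean(control_post) - mean(control_pre)
    return adopter_change - control_change


# Made-up weekly merged-PR counts per engineer, not data from the paper.
adopter_pre = [4.0, 5.0, 3.0]    # mean 4.0
adopter_post = [5.5, 6.5, 4.5]   # mean 5.5
control_pre = [4.0, 5.0, 3.0]    # mean 4.0
control_post = [4.5, 5.5, 3.5]   # mean 4.5

effect = did_estimate(adopter_pre, adopter_post, control_pre, control_post)
print(effect)  # prints 1.0 (extra merged PRs per engineer-week)
```

Subtracting the control group's change removes shifts that affect everyone, such as seasonal release cycles, but it does not remove differences in trend between the kinds of engineers who choose to adopt and those who do not.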

Circularity Check

0 steps flagged

No circularity: observational empirical study with no derivation chain


The paper is a purely observational empirical analysis of adoption and impact using merged PR counts as a proxy. The 24% lift is presented as an estimated difference between adopters and non-adopters after controls, not as a quantity derived from or identical to any fitted parameter or self-citation. No equations, ansatzes, uniqueness theorems, or self-referential predictions appear in the provided text. The identification strategy (matching or regression) is an external methodological choice whose validity is debatable on causal grounds but does not constitute circularity by construction. The result is therefore self-contained as a data-driven estimate rather than a definitional identity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Empirical observational study; central claims rest on the assumption that merged PR counts are a usable (if imperfect) proxy for output and that the counterfactual comparison isolates the effect of tool adoption. No free parameters or invented entities are introduced. No machine-checked proofs or external benchmarks are referenced.

axioms (1)

Domain assumption: merged pull requests constitute a reasonable proxy for engineering output and impact.
Explicitly stated in the abstract, with the acknowledgment that a merged PR is not the same as the value it delivers.



[1] [1]

The Fast and Spurious: Developer Productivity with GenAI

Sadia Afroz, Zixuan Feng, Tyler Menezes, Katie Kimura, Bianca Trinkenreich, Igor Stein- macher, and Anita Sarma. The fast and spurious: Developer productivity with genai, 2026. URLhttps://arxiv.org/abs/2510.24265

work page internal anchor Pith review Pith/arXiv arXiv 2026

[2] [2]

AI IDEs or autonomous agents? measuring the impact of coding agents on software development

Shyam Agarwal, Hao He, and Bogdan Vasilescu. AI IDEs or autonomous agents? measuring the impact of coding agents on software development. InProceedings of the 23rd International Conference on Mining Software Repositories (MSR), Mining Challenge Track, 2026

2026

[3] [3]

Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks.Proceedings of the National Academy of Sciences, 106(51):21544–21549, 2009

Sinan Aral, Lev Muchnik, and Arun Sundararajan. Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks.Proceedings of the National Academy of Sciences, 106(51):21544–21549, 2009

2009

[4] [4]

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity, July 2025

Joel Becker, Nate Rush, Elizabeth Barnes, and David Rein. Measuring the impact of early- 2025 ai on experienced open-source developer productivity.arXiv preprint arXiv:2507.09089, 2025. 19

work page arXiv 2025

[5] [5]

Understanding information systems continuance: An expectation- confirmation model.MIS quarterly, 25(3):351–370, 2001

Anol Bhattacherjee. Understanding information systems continuance: An expectation- confirmation model.MIS quarterly, 25(3):351–370, 2001

2001

[6] Charlotte Brandebusemeyer, Tobias Schimmer, and Bert Arnrich. Developers’ experience with generative AI – first insights from an empirical mixed-methods field study. In Proceedings of the International Conference on Software Engineering (ICSE), Software Engineering in Practice Track (SEIP), 2026.

[7] Kay H Brodersen, Fabian Gallusser, Jim Koehler, Nicolas Remy, and Steven L Scott. Inferring causal impact using Bayesian structural time-series models. Annals of Applied Statistics, 9:247–274, 2015.

[8] Jenna Butler, Jina Suh, Sankeerti Haniyur, and Constance Hadley. Dear diary: A randomized controlled trial of generative AI coding tools in the workplace. In 2025 IEEE/ACM 47th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pages 319–329, 2025. doi: 10.1109/ICSE-SEIP66354.2025.00034

[9] Kevin Zheyuan Cui, Mert Demirer, Sonia Jaffe, Leon Musolff, Sida Peng, and Tobias Salz. The Productivity Effects of Generative AI: Evidence from a Field Experiment with GitHub Copilot. An MIT Exploration of Generative AI, Mar 27, 2024. https://mit-genai.pubpub.org/pub/v5iixksv

[10] Zheyuan Cui, Mert Demirer, Sonia Jaffe, Leon Musolff, Sida Peng, and Tobias Salz. The effects of generative AI on high-skilled work: Evidence from three field experiments with software developers. Management Science, 2025. doi: 10.2139/ssrn.4945566. Forthcoming.

[11] Simone Daniotti, Johannes Wachs, Xiangnan Feng, and Frank Neffke. Who is using AI to code? Global diffusion and impact of generative AI. Science, page eadz9311, 2026.

[12] Fred D Davis. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly, 13(3):319–340, 1989.

[13] Mert Demirer, Leon Musolff, and Liyuan Yang. Writing code vs. shipping code: Productivity effects across generations of AI coding tools. Working Paper 35275, National Bureau of Economic Research, May 2026. URL http://www.nber.org/papers/w35275

[14] Fortune. A Meta employee created a dashboard so coworkers can compete to be the company’s no. 1 AI token user – and Zuckerberg doesn’t even rank in the top 250. https://fortune.com/2026/04/09/meta-killed-employee-ai-token-dashboard/, April 2026. Accessed 2026-05-20.

[15] Vincent Gurgul, Robin Gubela, and Stefan Lessmann. The state of generative AI in software development: Insights from literature and a developer survey. arXiv preprint arXiv:2603.16975, 2026.

[16] Hao He, Courtney Miller, Shyam Agarwal, Christian Kästner, and Bogdan Vasilescu. Speed at the cost of quality: How Cursor AI increases short-term velocity and long-term complexity in open-source projects. In Proceedings of the 23rd International Conference on Mining Software Repositories (MSR), 2026.

[17] Alex Heilman, Alex Kyllo, and Emerson Murphy-Hill. GitHub Copilot and developer productivity: An observational dose-response analysis, 2026.

[18] David Kreitmeir and Paul A Raschky. The heterogeneous productivity effects of generative AI. arXiv preprint arXiv:2403.01964, 2024.

[19] Aayush Kumar, Yasharth Bajpai, Sumit Gulwani, Gustavo Soares, and Emerson Murphy-Hill. Why AI agents still need you: Findings from developer-agent collaborations in the wild. In 2025 40th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 432–444. IEEE Press, 2025. doi: 10.1109/ASE63991.2025.00043. URL https://doi.org/10.1109/...

[20] Anand Kumar, Vishal Khare, Deepak Sharma, Satyam Kumar, Vijay Saini, Anshul Yadav, Sachendra Jain, Ankit Rana, Pratham Verma, Vaibhav Meena, et al. Intuition to evidence: Measuring AI’s true impact on developer productivity. arXiv preprint arXiv:2509.19708, 2025.

[21] Amr Mohamed, Maram Assi, and Mariam Guizani. The impact of LLM-assistants on software developer productivity: A systematic review and mapping study. ACM Trans. Softw. Eng. Methodol., April 2026. ISSN 1049-331X. doi: 10.1145/3809494. URL https://doi.org/10.1145/3809494. Just Accepted.

[22] Hussein Mozannar, Gagan Bansal, Adam Fourney, and Eric Horvitz. Reading between the lines: Modeling user behavior and costs in AI-assisted programming. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pages 1–16, 2024.

[23] Emerson Murphy-Hill and Gail C Murphy. Peer interaction effectively, yet infrequently, enables programmers to discover new tools. In Proceedings of the ACM 2011 Conference on Computer Supported Cooperative Work, pages 405–414, 2011.

[24] Emerson Murphy-Hill, Da Young Lee, Gail C Murphy, and Joanna McGrenere. How do users discover new tools in software development and beyond? Computer Supported Cooperative Work (CSCW), 24(5):389–422, 2015.

[25] Gabrielle O’Brien, Alexis Parker, Nasir Eisty, and Jeffrey Carver. More code, less validation: Risk factors for over-reliance on AI coding tools among scientists. arXiv preprint arXiv:2512.19644, 2025.

[26] Gergely Orosz. AI Tooling for Software Engineers in 2026. https://newsletter.pragmaticengineer.com/p/ai-tooling-2026, 2026. Accessed 2026-05-20.

[27] Elise Paradis, Kate Grey, Quinn Madison, Daye Nam, Andrew Macvean, Vahid Meimand, Nan Zhang, Ben Ferrari-Church, and Satish Chandra. How much does AI impact development speed? An enterprise-based randomized controlled trial. In 2025 IEEE/ACM 47th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pages 618–629. ...

[28] Sida Peng, Eirini Kalliamvakou, Peter Cihon, and Mert Demirer. The impact of AI on developer productivity: Evidence from GitHub Copilot. arXiv preprint arXiv:2302.06590, 2023.

[29] Alexander Quispe. Coding beyond your training: Claude Code and the technological frontier of software developers. arXiv preprint arXiv:2605.25438, 2026.

[30] Darío Reyes-Reina, Jenny Marcela Sanchéz-Torres, and Iván Mauricio Rueda-Cáceres. Adoption of AI tools in software development: A systematic literature review. Science of Computer Programming, 254:103521, 2026.

[31] Romain Robbes, Théo Matricon, Thomas Degueule, Andre Hora, and Stefano Zacchiroli. Agentic much? Adoption of coding agents on GitHub. arXiv preprint arXiv:2601.18341, 2026.

[32] Everett M. Rogers. Diffusion of Innovations, 5th Edition. Simon and Schuster, 2003. ISBN 9780743258234.

[33] Md Istiak Hossain Shihab, Christopher Hundhausen, Ahsun Tariq, Summit Haque, Yunhan Qiao, and Brian Wise Mulanda. The effects of GitHub Copilot on computing students’ programming effectiveness, efficiency, and processes in brownfield coding tasks. In Proceedings of the 2025 ACM Conference on International Computing Education Research V. 1, pages 407–420, 2025.

[34] Fangchen Song, Ashish Agarwal, and Wen Wen. The impact of generative AI on collaborative open-source software development: Evidence from GitHub Copilot. arXiv preprint arXiv:2410.02091, 2024.

[35] Stack Overflow. Developers remain willing but reluctant to use AI: The 2025 developer survey results are here. https://stackoverflow.blog/2025/12/29/developers-remain-willing-but-reluctant-to-use-ai-the-2025-developer-survey-results-are-here/, December 2025. Accessed 2026-05-20.

[36] Viktoria Stray, Elias Goldmann Brandtzæg, Viggo Wivestad, Astri Barbala, and Nils Brede Moe. Developer productivity with and without GitHub Copilot: A longitudinal mixed-methods case study. In Proceedings of the 59th Hawaii International Conference on System Sciences, 2026.

[37] Rafael Tomaz, Paloma Guenes, Allysson Allex Araújo, Maria Teresa Baldassarre, and Marcos Kalinowski. Impacts of generative AI on agile teams’ productivity: A multi-case longitudinal study. arXiv preprint arXiv:2602.13766, 2026.

[38] Priyan Vaithilingam, Tianyi Zhang, and Elena L Glassman. Expectation vs. experience: Evaluating the usability of code generation tools powered by large language models. In CHI Conference on Human Factors in Computing Systems Extended Abstracts, pages 1–7, 2022.

[39] Viswanath Venkatesh and Fred D Davis. A theoretical extension of the technology acceptance model: Four longitudinal field studies. Management Science, 46(2):186–204, 2000.

[40] Viswanath Venkatesh, Michael G Morris, Gordon B Davis, and Fred D Davis. User acceptance of information technology: Toward a unified view. MIS Quarterly, 27(3):425–478, 2003.

[41] Thomas Weber, Maximilian Brandmaier, Albrecht Schmidt, and Sven Mayer. Significant productivity gains through programming with large language models. Proceedings of the ACM on Human-Computer Interaction, 8(EICS):1–29, 2024.

[42] Shundan Xiao, Jim Witschey, and Emerson Murphy-Hill. Social influences on secure development tool adoption: Why security tools spread. In Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing, pages 1095–1106, 2014.

[43] Charles Yang. Claude Code scientists: Measuring AI adoption and productivity among scientists. Available at SSRN 6803624, 2026.

[44] Doron Yeverechyahu, Raveesh Mayya, and Gal Oestreicher-Singer. The impact of large language models on open-source innovation: Evidence from GitHub Copilot. arXiv preprint arXiv:2409.08379, 2024.

[45] Albert Ziegler, Eirini Kalliamvakou, X Alice Li, Andrew Rice, Devon Rifkin, Shawn Simister, Ganesh Sittampalam, and Edward Aftandilian. Productivity assessment of neural code completion. In Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming, pages 21–29, 2022.