Augmentation with Dilution: A Large-Scale Empirical Study of Human Contributor Ecosystems After AI Coding Agent Adoption

Anne Koziolek; Bowen Jiang; Weixing Zhang

arxiv: 2606.26289 · v1 · pith:XSJSOQGVnew · submitted 2026-06-24 · 💻 cs.SE

Augmentation with Dilution: A Large-Scale Empirical Study of Human Contributor Ecosystems After AI Coding Agent Adoption

Weixing Zhang , Bowen Jiang , Anne Koziolek This is my paper

Pith reviewed 2026-06-26 01:18 UTC · model grok-4.3

classification 💻 cs.SE

keywords AI coding agentsopen source softwarehuman contributorsdifference-in-differencesGitHub repositoriescontributor densitynewcomer participationcode review depth

0 comments

The pith

AI coding agent adoption leaves the absolute number of human contributors unchanged while reducing their relative density and newcomer share in open-source projects.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates how the arrival of AI coding agents alters the human side of open-source development rather than treating humans as a fixed background. Using data from over eleven thousand GitHub repositories and a staggered difference-in-differences approach, it shows that total human contributor counts do not drop measurably after adoption, yet the proportion of activity coming from humans falls, newcomers participate less, and existing contributors spend more time on reviews. These shifts point to a pattern the authors label augmentation with dilution, in which AI handles more initial code production while humans remain but occupy a narrower slice of the ecosystem. The findings matter for understanding whether open-source communities can absorb AI tools without losing the diversity and entry points that sustain them over time.

Core claim

Adoption of AI coding agents produces no statistically significant change in the absolute count of human contributors, yet it lowers human contributor density, reduces the relative share of newcomers by 3.7 percentage points, and raises review depth by 5.3 percent. The effects appear immediately after adoption and persist, varying with project size, language, and maturity. The overall pattern is described as augmentation with dilution rather than displacement.

What carries the argument

Staggered difference-in-differences design with the Sun and Abraham estimator applied to the timing of AI coding agent adoption across repositories.

If this is right

Absolute human contributor counts remain stable after AI adoption.
Human contributor density declines as AI-generated contributions accumulate.
The relative participation share of newcomers falls immediately and stays lower.
Review depth increases as human effort shifts from code production to evaluation.
The size of these changes differs across project size, programming language, and maturity levels.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Projects may need deliberate mechanisms to preserve newcomer entry points if the observed dilution continues.
Increased review burden could raise the value of experienced human reviewers and change contribution norms.
Longer-term ecosystem health may depend on whether diluted human participation still supplies enough novel ideas and maintenance effort.
The pattern suggests AI tools redistribute rather than eliminate human roles, which could be tested by tracking contributor retention rates over longer windows.

Load-bearing premise

The timing of AI agent adoption across repositories is unrelated to any factors that would also drive changes in contributor numbers, density, or newcomer shares.

What would settle it

A dataset showing that repositories adopting AI agents already exhibited different pre-adoption trends in contributor density or newcomer share compared with non-adopters would undermine the causal interpretation.

Figures

Figures reproduced from arXiv: 2606.26289 by Anne Koziolek, Bowen Jiang, Weixing Zhang.

**Figure 1.** Figure 1: Event study estimates for logtransformed human contributor count (RQ1-A, robustness check). Periods relative to AI agent adoption −0.1 0.0 0.1 0.2 0.3 −10 −5 0 5 10 Estimate and 95% Conf. Int [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗

**Figure 3.** Figure 3: Event study estimates for newcomer ratio (RQ2-B). Months relative to AI agent adoption −1.0 −0.6 −0.2 0.2 −10 −5 0 5 10 Estimate and 95% Conf. Int [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 6.** Figure 6: Event study estimates for newcomer ratio by project size (low vs. [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗

**Figure 7.** Figure 7: Event study estimates for newcomer ratio by programming language (Top 6). Horizontal axis: months relative to AI agent adoption. Vertical axis: [PITH_FULL_IMAGE:figures/full_fig_p009_7.png] view at source ↗

**Figure 8.** Figure 8: Event study estimates for newcomer ratio by project maturity (low [PITH_FULL_IMAGE:figures/full_fig_p009_8.png] view at source ↗

read the original abstract

AI coding agents are penetrating open-source software development at an unprecedented pace, yet existing research predominantly treats human contributors as a static backdrop rather than as the subject of inquiry. This paper presents the first large-scale empirical study that takes the human contributor ecosystem as its dependent variable, examining how the number, composition, and behavior of human participants change following AI coding agent adoption in open-source projects. Using a staggered difference-in-differences design on a dataset of 11,097 GitHub repositories spanning January 2023 to May 2026, we provide causal evidence via the Sun and Abraham estimator. Our results show that AI agent adoption does not significantly change the absolute number of human contributors (ATT = 0.014, p = 0.224), but significantly reduces human contributor density (ATT = -0.019, p = 0.002), indicating that the relative share of human participation declines as AI-generated pull requests accumulate. The relative participation share of newcomers declines significantly by 3.7 percentage points (ATT = -0.037, p < 0.001), with the effect emerging immediately after adoption and remaining stable throughout the observation window. Review depth increases significantly by 5.3% (ATT = +0.0168, p < 0.001), indicating that AI agents shift burden from the code production stage to the review stage. Moderator analysis reveals that these effects vary systematically with project size, programming language, and project maturity. Together, these findings present a pattern of augmentation with dilution: AI agents are not displacing human contributors, but are systematically reshaping the participation structure of open-source ecosystems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives the first large-scale causal estimates on how AI coding agents change OSS contributor numbers, density, newcomer share, and review behavior, but the staggered adoption timing looks endogenous to the outcomes.

read the letter

The core claim is that AI adoption leaves absolute human contributor counts unchanged but reduces their density, cuts newcomer participation by 3.7 points, and raises review depth by 5.3 percent, using Sun and Abraham on 11k GitHub repos from 2023-2026. That framing of the human ecosystem as the outcome variable is new for this literature.

The work does a few things right. It applies a suitable estimator for staggered treatment, reports concrete ATT numbers with p-values, and includes moderator checks by project size, language, and maturity. The dataset scale is real and the question is timely.

The soft spot is identification. Adoption dates are treated as exogenous conditional on fixed effects, yet projects plausibly turn to AI agents when human participation is already declining or shifting. The abstract shows no pre-trend coefficients, event-study plots, or tests for anticipation, so the parallel-trends assumption stays unsecured. That directly affects how much weight the ATT estimates can carry.

This is for empirical software engineering readers who track AI effects on open source. They will get usable numbers and a clear design to build on or challenge.

Send it to peer review. The data effort and causal framing are substantial enough to justify referee time, even if the identification needs more defense.

Referee Report

2 major / 1 minor

Summary. This paper conducts the first large-scale causal study of AI coding agent adoption's impact on human contributors in open-source GitHub repositories. Using a staggered DiD design with the Sun and Abraham estimator on 11,097 repositories from January 2023 to May 2026, it finds no significant effect on the absolute number of human contributors (ATT = 0.014, p = 0.224) but a reduction in contributor density (ATT = -0.019, p = 0.002), a 3.7 percentage point drop in newcomer share (ATT = -0.037, p < 0.001), and a 5.3% increase in review depth (ATT = +0.0168, p < 0.001), concluding that AI leads to 'augmentation with dilution' of human participation, with heterogeneity by project size, language, and maturity.

Significance. If the causal identification holds, this provides novel evidence on how AI agents reshape rather than displace human participation structures in OSS ecosystems, with implications for project sustainability, governance, and the division of labor between code production and review. The large sample size and application of the Sun and Abraham estimator tailored to staggered adoption are strengths that could advance empirical software engineering research on tool adoption effects.

major comments (2)

[Empirical Strategy] Empirical Strategy section: The central causal claims rest on the Sun and Abraham estimator recovering unbiased ATT effects under parallel trends and no anticipation, yet the manuscript reports no pre-trend coefficients, event-study plots, or robustness checks against time-varying confounders or anticipation. This directly affects the credibility of the key estimates on contributor density, newcomer share, and review depth.
[Data and Sample] Data and Sample section: AI coding agent adoption timing is assumed exogenous conditional on fixed effects, but no tests, discussions, or sensitivity analyses address potential endogeneity (e.g., adoption driven by declining human participation or project maturity), which could bias the reported effects on the outcome variables.

minor comments (1)

[Abstract] The abstract introduces the interpretive phrase 'augmentation with dilution' without a concise definition; a brief operationalization in the introduction would improve clarity for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback on our manuscript. We address each major comment below, agreeing where revisions are warranted to strengthen the causal claims and identification discussion. We propose targeted additions to the Empirical Strategy and Data sections.

read point-by-point responses

Referee: [Empirical Strategy] Empirical Strategy section: The central causal claims rest on the Sun and Abraham estimator recovering unbiased ATT effects under parallel trends and no anticipation, yet the manuscript reports no pre-trend coefficients, event-study plots, or robustness checks against time-varying confounders or anticipation. This directly affects the credibility of the key estimates on contributor density, newcomer share, and review depth.

Authors: We agree that explicit validation of the identifying assumptions is necessary. In the revised manuscript we will add event-study plots and pre-treatment coefficients using the Sun and Abraham estimator to document parallel trends, along with robustness checks that shift adoption dates to test for anticipation. These will be placed in a new subsection of the Empirical Strategy section and referenced in the Results. revision: yes
Referee: [Data and Sample] Data and Sample section: AI coding agent adoption timing is assumed exogenous conditional on fixed effects, but no tests, discussions, or sensitivity analyses address potential endogeneity (e.g., adoption driven by declining human participation or project maturity), which could bias the reported effects on the outcome variables.

Authors: We acknowledge the need for greater transparency on this point. The revised version will include an expanded discussion of the exogeneity assumption conditional on fixed effects and add sensitivity analyses examining whether pre-adoption trends in contributor outcomes predict adoption timing. We will also report results from alternative specifications that control for project maturity proxies. Full resolution of all endogeneity channels may be limited by available covariates, which we will note. revision: partial

Circularity Check

0 steps flagged

No circularity: results are data-driven ATT estimates from external GitHub repositories via standard Sun-Abraham estimator

full rationale

The paper reports causal estimates (ATT on contributor count, density, newcomer share, review depth) obtained by applying the Sun and Abraham (2021) staggered DiD estimator to an external dataset of 11,097 GitHub repositories. No equations, parameters, or predictions are defined in terms of the target outcomes; the estimator is an off-the-shelf method whose validity rests on external identifying assumptions (parallel trends, conditional exogeneity of adoption timing) rather than any self-referential construction. No self-citations are load-bearing, no fitted inputs are relabeled as predictions, and no ansatz or uniqueness claim reduces the results to the inputs by definition. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The study relies on standard econometric assumptions for causal inference in a staggered adoption setting rather than introducing new free parameters, axioms beyond domain standards, or invented entities.

axioms (1)

domain assumption The parallel trends assumption holds for the staggered adoption of AI coding agents across the selected repositories.
This is required for unbiased causal identification using the difference-in-differences estimator.

pith-pipeline@v0.9.1-grok · 5835 in / 1310 out tokens · 34260 ms · 2026-06-26T01:18:36.097562+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

28 extracted references · 1 canonical work pages · 1 internal anchor

[1]

Agentic much? adoption of coding agents on github,

R. Robbes, T. Matricon, T. Degueule, A. Hora, and S. Zacchiroli, “Agentic much? adoption of coding agents on github,”arXiv preprint arXiv:2601.18341, 2026

Pith/arXiv arXiv 2026
[2]

A systematic literature review on the barriers faced by newcomers to open source software projects,

I. Steinmacher, M. A. G. Silva, M. A. Gerosa, and D. F. Redmiles, “A systematic literature review on the barriers faced by newcomers to open source software projects,”Information and Software Technology, vol. 59, pp. 67–85, 2015

2015
[3]

Why do people give up flossing? a study of contributor disengagement in open source,

C. Miller, D. G. Widder, C. K ¨astner, and B. Vasilescu, “Why do people give up flossing? a study of contributor disengagement in open source,” inIFIP International Conference on Open Source Systems. Springer, 2019, pp. 116–129

2019
[4]

Gender and tenure diversity in github teams,

B. Vasilescu, D. Posnett, B. Ray, M. G. van den Brand, A. Serebrenik, P. Devanbu, and V . Filkov, “Gender and tenure diversity in github teams,” inProceedings of the 33rd annual ACM conference on human factors in computing systems, 2015, pp. 3789–3798

2015
[5]

How ai coding agents modify code: A large-scale study of github pull requests,

D. Ogenrwot and J. Businge, “How ai coding agents modify code: A large-scale study of github pull requests,”arXiv preprint arXiv:2601.17581, 2026

Pith/arXiv arXiv 2026
[6]

Will it survive? deciphering the fate of ai- generated code in open source,

M. Rahman and E. Shihab, “Will it survive? deciphering the fate of ai- generated code in open source,”arXiv preprint arXiv:2601.16809, 2026

arXiv 2026
[7]

Debt behind the ai boom: A large-scale empirical study of ai-generated code in the wild,

Y . Liu, R. Widyasari, Y . Zhao, I. C. Irsan, J. Chen, and D. Lo, “Debt behind the ai boom: A large-scale empirical study of ai-generated code in the wild,”arXiv preprint arXiv:2603.28592, 2026

Pith/arXiv arXiv 2026
[8]

On autopilot? an empirical study of human-ai teaming and review practices in open source,

H. Gao, P. Banyongrakkul, H. Guan, M. Zahedi, and C. Treude, “On autopilot? an empirical study of human-ai teaming and review practices in open source,”arXiv preprint arXiv:2601.13754, 2026

arXiv 2026
[9]

AI IDEs or autonomous agents? Measuring the impact of coding agents on software development,

S. Agarwal, H. He, and B. Vasilescu, “AI IDEs or autonomous agents? Measuring the impact of coding agents on software development,” in Proceedings of the 23rd International Conference on Mining Software Repositories (MSR), 2026

2026
[10]

We are changing our developer productivity experiment design,

J. Becker, N. Rush, T. Cunningham, D. Rein, and K. Mahamud, “We are changing our developer productivity experiment design,” https://metr. org/blog/2026-02-24-uplift-update/, 02 2026

2026
[11]

Work practices and challenges in pull-based development: The integrator’s perspective,

G. Gousios, A. Zaidman, M.-A. Storey, and A. Van Deursen, “Work practices and challenges in pull-based development: The integrator’s perspective,” in2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, vol. 1. IEEE, 2015, pp. 358–368

2015
[12]

Difference-in-differences with multi- ple time periods,

B. Callaway and P. H. Sant’Anna, “Difference-in-differences with multi- ple time periods,”Journal of econometrics, vol. 225, no. 2, pp. 200–230, 2021

2021
[13]

Towards causal analysis of empir- ical software engineering data: The impact of programming languages on coding competitions,

C. A. Furia, R. Torkar, and R. Feldt, “Towards causal analysis of empir- ical software engineering data: The impact of programming languages on coding competitions,”ACM Transactions on Software Engineering and Methodology, vol. 33, no. 1, pp. 1–35, 2023

2023
[14]

Replication package for ‘Augmentation with Dilution: A Large-Scale Empirical Study of Human Contributor Ecosystems After AI Coding Agent Adoption’,

A. Author, “Replication package for ‘Augmentation with Dilution: A Large-Scale Empirical Study of Human Contributor Ecosystems After AI Coding Agent Adoption’,” 2026. [Online]. Available: https://osf.io/ phzk7/overview?view only=73979e746c5f4a0aa59317b5457204ff

2026
[15]

Where do ai coding agents fail? an empirical study of failed agentic pull requests in github,

R. Ehsani, S. Pathak, S. Rawal, A. A. Mujahid, M. M. Imran, and P. Chatterjee, “Where do ai coding agents fail? an empirical study of failed agentic pull requests in github,”arXiv preprint arXiv:2601.15195, 2026

arXiv 2026
[16]

On the use of agentic coding: An empirical study of pull requests on github,

M. Watanabe, H. Li, Y . Kashiwa, B. Reid, H. Iida, and A. E. Hassan, “On the use of agentic coding: An empirical study of pull requests on github,”ACM Transactions on Software Engineering and Methodology, 2025

2025
[17]

The end of code review: Coding agents supersede human inspection,

M. Monperrus, “The end of code review: Coding agents supersede human inspection,”arXiv preprint arXiv:2606.13175, 2026

Pith/arXiv arXiv 2026
[18]

Ecosystem-level determinants of sustained activity in open-source projects: A case study of the pypi ecosystem,

M. Valiev, B. Vasilescu, and J. Herbsleb, “Ecosystem-level determinants of sustained activity in open-source projects: A case study of the pypi ecosystem,” inProceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018, pp. 644–655

2018
[19]

The Rise of AI Teammates in Software Engineering (SE) 3.0: How Autonomous Coding Agents Are Reshaping Software Engineering

H. Li, H. Zhang, and A. E. Hassan, “The rise of AI teammates in software engineering (SE) 3.0: How autonomous coding agents are reshaping software engineering,” 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2507.15003

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2507.15003 2025
[20]

Speed at the cost of quality: How Cursor AI increases short-term velocity and long-term complexity in open-source projects,

H. He, C. Miller, S. Agarwal, C. K ¨astner, and B. Vasilescu, “Speed at the cost of quality: How Cursor AI increases short-term velocity and long-term complexity in open-source projects,” inProceedings of the 23rd International Conference on Mining Software Repositories (MSR), 2026

2026
[21]

The central role of the propensity score in observational studies for causal effects,

P. R. Rosenbaum and D. B. Rubin, “The central role of the propensity score in observational studies for causal effects,”Biometrika, vol. 70, no. 1, pp. 41–55, 1983

1983
[22]

An introduction to propensity score methods for reduc- ing the effects of confounding in observational studies,

P. C. Austin, “An introduction to propensity score methods for reduc- ing the effects of confounding in observational studies,”Multivariate behavioral research, vol. 46, no. 3, pp. 399–424, 2011

2011
[23]

Two-way fixed effects, the two-way mundlak regres- sion, and difference-in-differences estimators,

J. M. Wooldridge, “Two-way fixed effects, the two-way mundlak regres- sion, and difference-in-differences estimators,”Empirical Economics, vol. 69, no. 5, pp. 2545–2587, 2025

2025
[24]

Estimating dynamic treatment effects in event studies with heterogeneous treatment effects,

L. Sun and S. Abraham, “Estimating dynamic treatment effects in event studies with heterogeneous treatment effects,”Journal of econometrics, vol. 225, no. 2, pp. 175–199, 2021

2021
[25]

Early-stage prediction of review effort in ai-generated pull requests,

D. S. D. Minh, H. T. Kiet, N. L. P. Quy, P. P. Hoa, T. C. Nguyen, N. D. H. Duong, and T. B. Tran, “Early-stage prediction of review effort in ai-generated pull requests,”arXiv preprint arXiv:2601.00753, 2026

arXiv 2026
[26]

Influence of social and technical factors for evaluating contribution in github,

J. Tsay, L. Dabbish, and J. Herbsleb, “Influence of social and technical factors for evaluating contribution in github,” inProceedings of the 36th international conference on Software engineering, 2014, pp. 356–366

2014
[27]

From industry claims to empirical reality: An empirical study of code review agents in pull requests,

K. Chowdhury, D. Banik, K. Ferdous, and S. I. Shamim, “From industry claims to empirical reality: An empirical study of code review agents in pull requests,”arXiv preprint arXiv:2604.03196, 2026

Pith/arXiv arXiv 2026
[28]

An exploratory study of the pull-based software development model,

G. Gousios, M. Pinzger, and A. v. Deursen, “An exploratory study of the pull-based software development model,” inProceedings of the 36th international conference on software engineering, 2014, pp. 345–355

2014

[1] [1]

Agentic much? adoption of coding agents on github,

R. Robbes, T. Matricon, T. Degueule, A. Hora, and S. Zacchiroli, “Agentic much? adoption of coding agents on github,”arXiv preprint arXiv:2601.18341, 2026

Pith/arXiv arXiv 2026

[2] [2]

A systematic literature review on the barriers faced by newcomers to open source software projects,

I. Steinmacher, M. A. G. Silva, M. A. Gerosa, and D. F. Redmiles, “A systematic literature review on the barriers faced by newcomers to open source software projects,”Information and Software Technology, vol. 59, pp. 67–85, 2015

2015

[3] [3]

Why do people give up flossing? a study of contributor disengagement in open source,

C. Miller, D. G. Widder, C. K ¨astner, and B. Vasilescu, “Why do people give up flossing? a study of contributor disengagement in open source,” inIFIP International Conference on Open Source Systems. Springer, 2019, pp. 116–129

2019

[4] [4]

Gender and tenure diversity in github teams,

B. Vasilescu, D. Posnett, B. Ray, M. G. van den Brand, A. Serebrenik, P. Devanbu, and V . Filkov, “Gender and tenure diversity in github teams,” inProceedings of the 33rd annual ACM conference on human factors in computing systems, 2015, pp. 3789–3798

2015

[5] [5]

How ai coding agents modify code: A large-scale study of github pull requests,

D. Ogenrwot and J. Businge, “How ai coding agents modify code: A large-scale study of github pull requests,”arXiv preprint arXiv:2601.17581, 2026

Pith/arXiv arXiv 2026

[6] [6]

Will it survive? deciphering the fate of ai- generated code in open source,

M. Rahman and E. Shihab, “Will it survive? deciphering the fate of ai- generated code in open source,”arXiv preprint arXiv:2601.16809, 2026

arXiv 2026

[7] [7]

Debt behind the ai boom: A large-scale empirical study of ai-generated code in the wild,

Y . Liu, R. Widyasari, Y . Zhao, I. C. Irsan, J. Chen, and D. Lo, “Debt behind the ai boom: A large-scale empirical study of ai-generated code in the wild,”arXiv preprint arXiv:2603.28592, 2026

Pith/arXiv arXiv 2026

[8] [8]

On autopilot? an empirical study of human-ai teaming and review practices in open source,

H. Gao, P. Banyongrakkul, H. Guan, M. Zahedi, and C. Treude, “On autopilot? an empirical study of human-ai teaming and review practices in open source,”arXiv preprint arXiv:2601.13754, 2026

arXiv 2026

[9] [9]

AI IDEs or autonomous agents? Measuring the impact of coding agents on software development,

S. Agarwal, H. He, and B. Vasilescu, “AI IDEs or autonomous agents? Measuring the impact of coding agents on software development,” in Proceedings of the 23rd International Conference on Mining Software Repositories (MSR), 2026

2026

[10] [10]

We are changing our developer productivity experiment design,

J. Becker, N. Rush, T. Cunningham, D. Rein, and K. Mahamud, “We are changing our developer productivity experiment design,” https://metr. org/blog/2026-02-24-uplift-update/, 02 2026

2026

[11] [11]

Work practices and challenges in pull-based development: The integrator’s perspective,

G. Gousios, A. Zaidman, M.-A. Storey, and A. Van Deursen, “Work practices and challenges in pull-based development: The integrator’s perspective,” in2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, vol. 1. IEEE, 2015, pp. 358–368

2015

[12] [12]

Difference-in-differences with multi- ple time periods,

B. Callaway and P. H. Sant’Anna, “Difference-in-differences with multi- ple time periods,”Journal of econometrics, vol. 225, no. 2, pp. 200–230, 2021

2021

[13] [13]

Towards causal analysis of empir- ical software engineering data: The impact of programming languages on coding competitions,

C. A. Furia, R. Torkar, and R. Feldt, “Towards causal analysis of empir- ical software engineering data: The impact of programming languages on coding competitions,”ACM Transactions on Software Engineering and Methodology, vol. 33, no. 1, pp. 1–35, 2023

2023

[14] [14]

Replication package for ‘Augmentation with Dilution: A Large-Scale Empirical Study of Human Contributor Ecosystems After AI Coding Agent Adoption’,

A. Author, “Replication package for ‘Augmentation with Dilution: A Large-Scale Empirical Study of Human Contributor Ecosystems After AI Coding Agent Adoption’,” 2026. [Online]. Available: https://osf.io/ phzk7/overview?view only=73979e746c5f4a0aa59317b5457204ff

2026

[15] [15]

Where do ai coding agents fail? an empirical study of failed agentic pull requests in github,

R. Ehsani, S. Pathak, S. Rawal, A. A. Mujahid, M. M. Imran, and P. Chatterjee, “Where do ai coding agents fail? an empirical study of failed agentic pull requests in github,”arXiv preprint arXiv:2601.15195, 2026

arXiv 2026

[16] [16]

On the use of agentic coding: An empirical study of pull requests on github,

M. Watanabe, H. Li, Y . Kashiwa, B. Reid, H. Iida, and A. E. Hassan, “On the use of agentic coding: An empirical study of pull requests on github,”ACM Transactions on Software Engineering and Methodology, 2025

2025

[17] [17]

The end of code review: Coding agents supersede human inspection,

M. Monperrus, “The end of code review: Coding agents supersede human inspection,”arXiv preprint arXiv:2606.13175, 2026

Pith/arXiv arXiv 2026

[18] [18]

Ecosystem-level determinants of sustained activity in open-source projects: A case study of the pypi ecosystem,

M. Valiev, B. Vasilescu, and J. Herbsleb, “Ecosystem-level determinants of sustained activity in open-source projects: A case study of the pypi ecosystem,” inProceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018, pp. 644–655

2018

[19] [19]

The Rise of AI Teammates in Software Engineering (SE) 3.0: How Autonomous Coding Agents Are Reshaping Software Engineering

H. Li, H. Zhang, and A. E. Hassan, “The rise of AI teammates in software engineering (SE) 3.0: How autonomous coding agents are reshaping software engineering,” 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2507.15003

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2507.15003 2025

[20] [20]

Speed at the cost of quality: How Cursor AI increases short-term velocity and long-term complexity in open-source projects,

H. He, C. Miller, S. Agarwal, C. K ¨astner, and B. Vasilescu, “Speed at the cost of quality: How Cursor AI increases short-term velocity and long-term complexity in open-source projects,” inProceedings of the 23rd International Conference on Mining Software Repositories (MSR), 2026

2026

[21] [21]

The central role of the propensity score in observational studies for causal effects,

P. R. Rosenbaum and D. B. Rubin, “The central role of the propensity score in observational studies for causal effects,”Biometrika, vol. 70, no. 1, pp. 41–55, 1983

1983

[22] [22]

An introduction to propensity score methods for reduc- ing the effects of confounding in observational studies,

P. C. Austin, “An introduction to propensity score methods for reduc- ing the effects of confounding in observational studies,”Multivariate behavioral research, vol. 46, no. 3, pp. 399–424, 2011

2011

[23] [23]

Two-way fixed effects, the two-way mundlak regres- sion, and difference-in-differences estimators,

J. M. Wooldridge, “Two-way fixed effects, the two-way mundlak regres- sion, and difference-in-differences estimators,”Empirical Economics, vol. 69, no. 5, pp. 2545–2587, 2025

2025

[24] [24]

Estimating dynamic treatment effects in event studies with heterogeneous treatment effects,

L. Sun and S. Abraham, “Estimating dynamic treatment effects in event studies with heterogeneous treatment effects,”Journal of econometrics, vol. 225, no. 2, pp. 175–199, 2021

2021

[25] [25]

Early-stage prediction of review effort in ai-generated pull requests,

D. S. D. Minh, H. T. Kiet, N. L. P. Quy, P. P. Hoa, T. C. Nguyen, N. D. H. Duong, and T. B. Tran, “Early-stage prediction of review effort in ai-generated pull requests,”arXiv preprint arXiv:2601.00753, 2026

arXiv 2026

[26] [26]

Influence of social and technical factors for evaluating contribution in github,

J. Tsay, L. Dabbish, and J. Herbsleb, “Influence of social and technical factors for evaluating contribution in github,” inProceedings of the 36th international conference on Software engineering, 2014, pp. 356–366

2014

[27] [27]

From industry claims to empirical reality: An empirical study of code review agents in pull requests,

K. Chowdhury, D. Banik, K. Ferdous, and S. I. Shamim, “From industry claims to empirical reality: An empirical study of code review agents in pull requests,”arXiv preprint arXiv:2604.03196, 2026

Pith/arXiv arXiv 2026

[28] [28]

An exploratory study of the pull-based software development model,

G. Gousios, M. Pinzger, and A. v. Deursen, “An exploratory study of the pull-based software development model,” inProceedings of the 36th international conference on software engineering, 2014, pp. 345–355

2014