
arxiv: 2606.26289 · v1 · pith:XSJSOQGVnew · submitted 2026-06-24 · 💻 cs.SE

Augmentation with Dilution: A Large-Scale Empirical Study of Human Contributor Ecosystems After AI Coding Agent Adoption

Pith reviewed 2026-06-26 01:18 UTC · model grok-4.3

classification 💻 cs.SE
keywords AI coding agents · open source software · human contributors · difference-in-differences · GitHub repositories · contributor density · newcomer participation · code review depth

The pith

AI coding agent adoption leaves the absolute number of human contributors unchanged while reducing their relative density and newcomer share in open-source projects.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates how the arrival of AI coding agents alters the human side of open-source development rather than treating humans as a fixed background. Using data from 11,097 GitHub repositories and a staggered difference-in-differences approach, it shows that total human contributor counts do not change measurably after adoption, yet human contributor density falls, newcomers account for a smaller share of participation, and review depth increases. These shifts point to a pattern the authors label augmentation with dilution, in which AI handles more initial code production while humans remain but occupy a narrower slice of the ecosystem. The findings matter for understanding whether open-source communities can absorb AI tools without losing the diversity and entry points that sustain them over time.

Core claim

Adoption of AI coding agents produces no statistically significant change in the absolute count of human contributors, yet it lowers human contributor density, reduces the relative share of newcomers by 3.7 percentage points, and raises review depth by 5.3 percent. The effects appear immediately after adoption and persist, varying with project size, language, and maturity. The overall pattern is described as augmentation with dilution rather than displacement.

What carries the argument

Staggered difference-in-differences design with the Sun and Abraham estimator applied to the timing of AI coding agent adoption across repositories.
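
One common way to write this design down, following the notation of Sun and Abraham (2021), is sketched below. The reference period (\ell = -1) and the control cohort set C (never-adopting or last-adopting repositories) are assumptions made for illustration; this summary does not state which choices the paper makes.

```latex
% Cohort-specific event-study regression (sketch, assumed reference period \ell = -1)
Y_{i,t} = \alpha_i + \lambda_t
        + \sum_{e \notin C} \sum_{\ell \neq -1}
          \delta_{e,\ell}\, \mathbf{1}\{E_i = e\}\, \mathbf{1}\{t - E_i = \ell\}
        + \epsilon_{i,t}

% Interaction-weighted estimate at relative period \ell:
% cohort-specific estimates averaged by estimated cohort shares
\hat{\nu}_{\ell} = \sum_{e} \hat{\delta}_{e,\ell}\,
                   \widehat{\Pr}\{E_i = e \mid E_i \in [-\ell,\, T - \ell]\}
```

Here Y_{i,t} is an outcome (for example, newcomer share) for repository i in period t, E_i is the period in which repository i adopts an AI coding agent, \alpha_i and \lambda_t are repository and period fixed effects, and T is the last observed period. Estimating a separate \delta_{e,\ell} per adoption cohort and then averaging is what keeps early and late adopters from contaminating each other's estimates.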

If this is right

  • Absolute human contributor counts remain stable after AI adoption.
  • Human contributor density declines as AI-generated contributions accumulate.
  • The relative participation share of newcomers falls immediately and stays lower.
  • Review depth increases as human effort shifts from code production to evaluation.
  • The size of these changes differs across project size, programming language, and maturity levels.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Projects may need deliberate mechanisms to preserve newcomer entry points if the observed dilution continues.
  • Increased review burden could raise the value of experienced human reviewers and change contribution norms.
  • Longer-term ecosystem health may depend on whether diluted human participation still supplies enough novel ideas and maintenance effort.
  • The pattern suggests AI tools redistribute rather than eliminate human roles, which could be tested by tracking contributor retention rates over longer windows.

Load-bearing premise

The timing of AI agent adoption across repositories is unrelated to any factors that would also drive changes in contributor numbers, density, or newcomer shares.

What would settle it

A dataset showing that repositories adopting AI agents already exhibited different pre-adoption trends in contributor density or newcomer share compared with non-adopters would undermine the causal interpretation.
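
A minimal sketch of such a check is shown below, assuming a toy panel stored as nested dictionaries. The function name `pre_trend_gaps`, the data layout, and all numbers are hypothetical and are not taken from the paper or its replication package. It measures the adopter-minus-control gap at each period before adoption: a roughly constant gap is consistent with parallel pre-trends, while a drifting gap is the warning sign described above.

```python
from statistics import mean


def pre_trend_gaps(panel, adoption, leads):
    """Adopter-minus-control outcome gap at each lead (periods before adoption).

    panel:    {repo: {period: outcome}} (hypothetical toy layout)
    adoption: {repo: adoption period, or None for never-adopters}
    leads:    iterable of positive integers (1 = one period before adoption)
    """
    controls = [r for r, e in adoption.items() if e is None]
    gaps = {}
    for lead in leads:
        diffs = []
        for repo, e in adoption.items():
            if e is None:
                continue
            t = e - lead
            # Only use periods observed for the adopter and every control.
            if t in panel[repo] and all(t in panel[c] for c in controls):
                control_mean = mean(panel[c][t] for c in controls)
                diffs.append(panel[repo][t] - control_mean)
        if diffs:
            gaps[lead] = mean(diffs)
    return gaps


# Toy data: two never-adopters, one adopter on a parallel path, one diverging.
periods = range(6)
panel = {
    "c1": {t: 0.50 + 0.01 * t for t in periods},
    "c2": {t: 0.40 + 0.01 * t for t in periods},
    "parallel": {t: 0.60 + 0.01 * t for t in periods},
    "diverging": {t: 0.60 - 0.02 * t for t in periods},
}
controls = {"c1": None, "c2": None}

gaps_parallel = pre_trend_gaps(panel, {**controls, "parallel": 4}, leads=[1, 2, 3])
gaps_diverging = pre_trend_gaps(panel, {**controls, "diverging": 4}, leads=[1, 2, 3])

# Drift in the gap from three periods before adoption to one period before.
drift_parallel = gaps_parallel[1] - gaps_parallel[3]
drift_diverging = gaps_diverging[1] - gaps_diverging[3]
```

In the toy data the parallel adopter shows essentially zero drift, while the diverging adopter's gap shrinks before adoption ever happens. A real analysis would use the event-study lead coefficients and their confidence intervals rather than raw gaps.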

Figures

Figures reproduced from arXiv: 2606.26289 by Anne Koziolek, Bowen Jiang, Weixing Zhang.

Figure 1: Event study estimates for log-transformed human contributor count (RQ1-A, robustness check). Horizontal axis: periods relative to AI agent adoption. Vertical axis: estimate and 95% confidence interval.
Figure 3: Event study estimates for newcomer ratio (RQ2-B). Horizontal axis: months relative to AI agent adoption. Vertical axis: estimate and 95% confidence interval.
Figure 6: Event study estimates for newcomer ratio by project size (low vs. …
Figure 7: Event study estimates for newcomer ratio by programming language (Top 6). Horizontal axis: months relative to AI agent adoption.
Figure 8: Event study estimates for newcomer ratio by project maturity (low …
Original abstract

AI coding agents are penetrating open-source software development at an unprecedented pace, yet existing research predominantly treats human contributors as a static backdrop rather than as the subject of inquiry. This paper presents the first large-scale empirical study that takes the human contributor ecosystem as its dependent variable, examining how the number, composition, and behavior of human participants change following AI coding agent adoption in open-source projects. Using a staggered difference-in-differences design on a dataset of 11,097 GitHub repositories spanning January 2023 to May 2026, we provide causal evidence via the Sun and Abraham estimator. Our results show that AI agent adoption does not significantly change the absolute number of human contributors (ATT = 0.014, p = 0.224), but significantly reduces human contributor density (ATT = -0.019, p = 0.002), indicating that the relative share of human participation declines as AI-generated pull requests accumulate. The relative participation share of newcomers declines significantly by 3.7 percentage points (ATT = -0.037, p < 0.001), with the effect emerging immediately after adoption and remaining stable throughout the observation window. Review depth increases significantly by 5.3% (ATT = +0.0168, p < 0.001), indicating that AI agents shift burden from the code production stage to the review stage. Moderator analysis reveals that these effects vary systematically with project size, programming language, and project maturity. Together, these findings present a pattern of augmentation with dilution: AI agents are not displacing human contributors, but are systematically reshaping the participation structure of open-source ecosystems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. This paper conducts the first large-scale causal study of how AI coding agent adoption affects human contributors in open-source GitHub repositories. It applies a staggered DiD design with the Sun and Abraham estimator to 11,097 repositories observed from January 2023 to May 2026. It finds no significant effect on the absolute number of human contributors (ATT = 0.014, p = 0.224), a reduction in contributor density (ATT = -0.019, p = 0.002), a 3.7 percentage point drop in newcomer share (ATT = -0.037, p < 0.001), and a 5.3% increase in review depth (ATT = +0.0168, p < 0.001). The paper concludes that AI adoption leads to 'augmentation with dilution' of human participation, with heterogeneity by project size, language, and maturity.

Significance. If the causal identification holds, this provides novel evidence on how AI agents reshape rather than displace human participation structures in OSS ecosystems, with implications for project sustainability, governance, and the division of labor between code production and review. The large sample size and application of the Sun and Abraham estimator tailored to staggered adoption are strengths that could advance empirical software engineering research on tool adoption effects.
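
To illustrate why an estimator tailored to staggered adoption works cohort by cohort, the sketch below computes a simple difference-in-differences for each adoption cohort against never-adopters and then averages the results by cohort size. This is a simplified stand-in for the idea, not the regression-based Sun and Abraham estimator and not the paper's code; the function name `cohort_did`, the data layout, and all numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def cohort_did(panel, adoption, horizon):
    """Cohort-size-weighted DiD effect `horizon` periods after adoption.

    panel:    {repo: {period: outcome}} (hypothetical toy layout)
    adoption: {repo: first treated period, or None for never-adopters}
    """
    controls = [r for r, e in adoption.items() if e is None]
    cohorts = defaultdict(list)
    for repo, e in adoption.items():
        if e is not None:
            cohorts[e].append(repo)

    estimates, weights = [], []
    for e, repos in cohorts.items():
        base, target = e - 1, e + horizon
        # Skip cohorts whose comparison window is not fully observed.
        if not all(base in panel[r] and target in panel[r] for r in repos + controls):
            continue
        treated_change = mean(panel[r][target] - panel[r][base] for r in repos)
        control_change = mean(panel[r][target] - panel[r][base] for r in controls)
        estimates.append(treated_change - control_change)
        weights.append(len(repos))

    if not weights:
        raise ValueError("no cohort has a fully observed comparison window")
    # Weight each cohort's estimate by its share of treated repositories.
    return sum(w * d for w, d in zip(weights, estimates)) / sum(weights)


# Toy data: a shared upward trend plus a cohort-specific drop at adoption.
periods = range(6)


def series(level, adopt=None, effect=0.0):
    return {
        t: level + 0.1 * t + (effect if adopt is not None and t >= adopt else 0.0)
        for t in periods
    }


adoption = {"a": 2, "b": 3, "d": 3, "c1": None, "c2": None}
panel = {
    "a": series(0.5, adopt=2, effect=-0.04),
    "b": series(0.6, adopt=3, effect=-0.03),
    "d": series(0.4, adopt=3, effect=-0.03),
    "c1": series(0.7),
    "c2": series(0.3),
}

att_at_adoption = cohort_did(panel, adoption, horizon=0)
```

In the toy data the period-2 cohort has an effect of -0.04 and the two-repository period-3 cohort has -0.03, so the weighted average is (1 × -0.04 + 2 × -0.03) / 3. Already-treated repositories are never used as controls, which is the main thing the cohort-wise structure buys.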

major comments (2)
  1. [Empirical Strategy] Empirical Strategy section: The central causal claims rest on the Sun and Abraham estimator recovering unbiased ATT effects under parallel trends and no anticipation, yet the manuscript reports no pre-trend coefficients, event-study plots, or robustness checks against time-varying confounders or anticipation. This directly affects the credibility of the key estimates on contributor density, newcomer share, and review depth.
  2. [Data and Sample] Data and Sample section: AI coding agent adoption timing is assumed exogenous conditional on fixed effects, but no tests, discussions, or sensitivity analyses address potential endogeneity (e.g., adoption driven by declining human participation or project maturity), which could bias the reported effects on the outcome variables.
minor comments (1)
  1. [Abstract] The abstract introduces the interpretive phrase 'augmentation with dilution' without a concise definition; a brief operationalization in the introduction would improve clarity for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback on our manuscript. We address each major comment below, agreeing where revisions are warranted to strengthen the causal claims and identification discussion. We propose targeted additions to the Empirical Strategy and Data sections.

point-by-point responses
  1. Referee: [Empirical Strategy] Empirical Strategy section: The central causal claims rest on the Sun and Abraham estimator recovering unbiased ATT effects under parallel trends and no anticipation, yet the manuscript reports no pre-trend coefficients, event-study plots, or robustness checks against time-varying confounders or anticipation. This directly affects the credibility of the key estimates on contributor density, newcomer share, and review depth.

    Authors: We agree that explicit validation of the identifying assumptions is necessary. In the revised manuscript we will add event-study plots and pre-treatment coefficients using the Sun and Abraham estimator to document parallel trends, along with robustness checks that shift adoption dates to test for anticipation. These will be placed in a new subsection of the Empirical Strategy section and referenced in the Results. revision: yes

  2. Referee: [Data and Sample] Data and Sample section: AI coding agent adoption timing is assumed exogenous conditional on fixed effects, but no tests, discussions, or sensitivity analyses address potential endogeneity (e.g., adoption driven by declining human participation or project maturity), which could bias the reported effects on the outcome variables.

    Authors: We acknowledge the need for greater transparency on this point. The revised version will include an expanded discussion of the exogeneity assumption conditional on fixed effects and add sensitivity analyses examining whether pre-adoption trends in contributor outcomes predict adoption timing. We will also report results from alternative specifications that control for project maturity proxies. Full resolution of all endogeneity channels may be limited by available covariates, which we will note. revision: partial
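
The two checks proposed in these responses can be sketched roughly as follows, assuming a toy panel stored as nested dictionaries. The function names `placebo_effect` and `trend_vs_timing`, the data layout, and all numbers are hypothetical illustrations and do not come from the manuscript. The first shifts adoption dates earlier and estimates a placebo effect using only pre-adoption periods; the second asks whether the pre-adoption slope of an outcome correlates with when a repository adopts.

```python
from statistics import mean


def placebo_effect(panel, adoption, shift):
    """DiD for a fake adoption date `shift` periods before the real one.

    Both compared periods lie before real adoption, so a non-zero result
    suggests anticipation or diverging pre-trends.
    """
    controls = [r for r, e in adoption.items() if e is None]
    effects = []
    for repo, e in adoption.items():
        if e is None:
            continue
        fake_base, fake_post = e - shift - 1, e - 1
        if fake_base not in panel[repo]:
            continue
        treated_change = panel[repo][fake_post] - panel[repo][fake_base]
        control_change = mean(panel[c][fake_post] - panel[c][fake_base] for c in controls)
        effects.append(treated_change - control_change)
    return mean(effects)


def ols_slope(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)


def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def trend_vs_timing(panel, adoption, window):
    """Correlation between pre-adoption outcome slope and adoption period."""
    slopes, timings = [], []
    for repo, e in adoption.items():
        if e is None:
            continue
        ts = [t for t in range(e - window, e) if t in panel[repo]]
        if len(ts) < 2:
            continue
        slopes.append(ols_slope(ts, [panel[repo][t] for t in ts]))
        timings.append(e)
    return pearson(slopes, timings)


periods = range(8)


def line(level, slope, dip_at=None):
    # Linear path with an optional 0.02 drop from period `dip_at` onward.
    return {t: level + slope * t - (0.02 if dip_at is not None and t >= dip_at else 0.0)
            for t in periods}


controls = {"c1": None, "c2": None}
panel = {"c1": line(0.5, 0.01), "c2": line(0.4, 0.01),
         "clean": line(0.6, 0.01), "early": line(0.6, 0.01, dip_at=4)}
placebo_clean = placebo_effect(panel, {**controls, "clean": 5}, shift=1)
placebo_early = placebo_effect(panel, {**controls, "early": 5}, shift=1)

# Toy case where steeper pre-adoption decline goes with earlier adoption.
timing = {"r1": 3, "r2": 4, "r3": 5, "r4": 6}
declines = {"r1": line(0.6, -0.03), "r2": line(0.6, -0.02),
            "r3": line(0.6, -0.01), "r4": line(0.6, 0.0)}
slope_timing_corr = trend_vs_timing(declines, timing, window=3)
```

In the toy data the "clean" adopter yields a placebo effect of zero, while the "early" adopter, whose outcome drops one period before its recorded adoption, yields a negative one. The strong slope-timing correlation in the second toy case is the kind of pattern that would indicate adoption is driven by declining participation.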

Circularity Check

0 steps flagged

No circularity: the results are data-driven ATT estimates from external GitHub repositories, obtained with the standard Sun and Abraham estimator.

full rationale

The paper reports causal estimates (ATT on contributor count, density, newcomer share, review depth) obtained by applying the Sun and Abraham (2021) staggered DiD estimator to an external dataset of 11,097 GitHub repositories. No equations, parameters, or predictions are defined in terms of the target outcomes; the estimator is an off-the-shelf method whose validity rests on external identifying assumptions (parallel trends, conditional exogeneity of adoption timing) rather than any self-referential construction. No self-citations are load-bearing, no fitted inputs are relabeled as predictions, and no ansatz or uniqueness claim reduces the results to the inputs by definition. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The study relies on standard econometric assumptions for causal inference in a staggered adoption setting rather than introducing new free parameters, axioms beyond domain standards, or invented entities.

axioms (1)
  • domain assumption: The parallel trends assumption holds for the staggered adoption of AI coding agents across the selected repositories.
    This is required for unbiased causal identification using the difference-in-differences estimator.
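
Stated formally, in the notation of Sun and Abraham (2021), the parallel trends assumption and the no-anticipation assumption that usually accompanies it can be sketched as below. The symbols are assumptions for illustration: Y^{\infty} denotes the outcome a repository would have had if it never adopted, Y^{e} the outcome under adoption in period e, and E_i the actual adoption period.

```latex
\begin{align}
% Parallel trends: untreated outcome paths share the same expected change
\mathbb{E}\left[ Y_{i,t}^{\infty} - Y_{i,s}^{\infty} \mid E_i = e \right]
  &\ \text{is the same for all adoption cohorts } e, \text{ for all } s \neq t, \\
% No anticipation: no effect before adoption
\mathbb{E}\left[ Y_{i,e+\ell}^{e} - Y_{i,e+\ell}^{\infty} \mid E_i = e \right]
  &= 0 \quad \text{for all } \ell < 0.
\end{align}
```

Neither condition can be verified directly, because Y^{\infty} is unobserved for adopters after adoption; pre-adoption event-study coefficients only provide indirect evidence.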



Reference graph

Works this paper leans on

28 extracted references · 1 canonical work pages · 1 internal anchor

  1. [1]

    Agentic much? adoption of coding agents on github,

    R. Robbes, T. Matricon, T. Degueule, A. Hora, and S. Zacchiroli, “Agentic much? adoption of coding agents on github,” arXiv preprint arXiv:2601.18341, 2026

  2. [2]

    A systematic literature review on the barriers faced by newcomers to open source software projects,

    I. Steinmacher, M. A. G. Silva, M. A. Gerosa, and D. F. Redmiles, “A systematic literature review on the barriers faced by newcomers to open source software projects,” Information and Software Technology, vol. 59, pp. 67–85, 2015

  3. [3]

    Why do people give up flossing? a study of contributor disengagement in open source,

    C. Miller, D. G. Widder, C. Kästner, and B. Vasilescu, “Why do people give up flossing? a study of contributor disengagement in open source,” in IFIP International Conference on Open Source Systems. Springer, 2019, pp. 116–129

  4. [4]

    Gender and tenure diversity in github teams,

    B. Vasilescu, D. Posnett, B. Ray, M. G. van den Brand, A. Serebrenik, P. Devanbu, and V. Filkov, “Gender and tenure diversity in github teams,” in Proceedings of the 33rd annual ACM conference on human factors in computing systems, 2015, pp. 3789–3798

  5. [5]

    How ai coding agents modify code: A large-scale study of github pull requests,

    D. Ogenrwot and J. Businge, “How ai coding agents modify code: A large-scale study of github pull requests,” arXiv preprint arXiv:2601.17581, 2026

  6. [6]

    Will it survive? deciphering the fate of ai-generated code in open source,

    M. Rahman and E. Shihab, “Will it survive? deciphering the fate of ai-generated code in open source,” arXiv preprint arXiv:2601.16809, 2026

  7. [7]

    Debt behind the ai boom: A large-scale empirical study of ai-generated code in the wild,

    Y. Liu, R. Widyasari, Y. Zhao, I. C. Irsan, J. Chen, and D. Lo, “Debt behind the ai boom: A large-scale empirical study of ai-generated code in the wild,” arXiv preprint arXiv:2603.28592, 2026

  8. [8]

    On autopilot? an empirical study of human-ai teaming and review practices in open source,

    H. Gao, P. Banyongrakkul, H. Guan, M. Zahedi, and C. Treude, “On autopilot? an empirical study of human-ai teaming and review practices in open source,” arXiv preprint arXiv:2601.13754, 2026

  9. [9]

    AI IDEs or autonomous agents? Measuring the impact of coding agents on software development,

    S. Agarwal, H. He, and B. Vasilescu, “AI IDEs or autonomous agents? Measuring the impact of coding agents on software development,” in Proceedings of the 23rd International Conference on Mining Software Repositories (MSR), 2026

  10. [10]

    We are changing our developer productivity experiment design,

    J. Becker, N. Rush, T. Cunningham, D. Rein, and K. Mahamud, “We are changing our developer productivity experiment design,” https://metr.org/blog/2026-02-24-uplift-update/, 02 2026

  11. [11]

    Work practices and challenges in pull-based development: The integrator’s perspective,

    G. Gousios, A. Zaidman, M.-A. Storey, and A. Van Deursen, “Work practices and challenges in pull-based development: The integrator’s perspective,” in 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, vol. 1. IEEE, 2015, pp. 358–368

  12. [12]

    Difference-in-differences with multiple time periods,

    B. Callaway and P. H. Sant’Anna, “Difference-in-differences with multiple time periods,” Journal of econometrics, vol. 225, no. 2, pp. 200–230, 2021

  13. [13]

    Towards causal analysis of empirical software engineering data: The impact of programming languages on coding competitions,

    C. A. Furia, R. Torkar, and R. Feldt, “Towards causal analysis of empirical software engineering data: The impact of programming languages on coding competitions,” ACM Transactions on Software Engineering and Methodology, vol. 33, no. 1, pp. 1–35, 2023

  14. [14]

    Replication package for ‘Augmentation with Dilution: A Large-Scale Empirical Study of Human Contributor Ecosystems After AI Coding Agent Adoption’,

    A. Author, “Replication package for ‘Augmentation with Dilution: A Large-Scale Empirical Study of Human Contributor Ecosystems After AI Coding Agent Adoption’,” 2026. [Online]. Available: https://osf.io/phzk7/overview?view_only=73979e746c5f4a0aa59317b5457204ff

  15. [15]

    Where do ai coding agents fail? an empirical study of failed agentic pull requests in github,

    R. Ehsani, S. Pathak, S. Rawal, A. A. Mujahid, M. M. Imran, and P. Chatterjee, “Where do ai coding agents fail? an empirical study of failed agentic pull requests in github,” arXiv preprint arXiv:2601.15195, 2026

  16. [16]

    On the use of agentic coding: An empirical study of pull requests on github,

    M. Watanabe, H. Li, Y. Kashiwa, B. Reid, H. Iida, and A. E. Hassan, “On the use of agentic coding: An empirical study of pull requests on github,” ACM Transactions on Software Engineering and Methodology, 2025

  17. [17]

    The end of code review: Coding agents supersede human inspection,

    M. Monperrus, “The end of code review: Coding agents supersede human inspection,” arXiv preprint arXiv:2606.13175, 2026

  18. [18]

    Ecosystem-level determinants of sustained activity in open-source projects: A case study of the pypi ecosystem,

    M. Valiev, B. Vasilescu, and J. Herbsleb, “Ecosystem-level determinants of sustained activity in open-source projects: A case study of the pypi ecosystem,” in Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018, pp. 644–655

  19. [19]

    The Rise of AI Teammates in Software Engineering (SE) 3.0: How Autonomous Coding Agents Are Reshaping Software Engineering

    H. Li, H. Zhang, and A. E. Hassan, “The rise of AI teammates in software engineering (SE) 3.0: How autonomous coding agents are reshaping software engineering,” 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2507.15003

  20. [20]

    Speed at the cost of quality: How Cursor AI increases short-term velocity and long-term complexity in open-source projects,

    H. He, C. Miller, S. Agarwal, C. Kästner, and B. Vasilescu, “Speed at the cost of quality: How Cursor AI increases short-term velocity and long-term complexity in open-source projects,” in Proceedings of the 23rd International Conference on Mining Software Repositories (MSR), 2026

  21. [21]

    The central role of the propensity score in observational studies for causal effects,

    P. R. Rosenbaum and D. B. Rubin, “The central role of the propensity score in observational studies for causal effects,” Biometrika, vol. 70, no. 1, pp. 41–55, 1983

  22. [22]

    An introduction to propensity score methods for reducing the effects of confounding in observational studies,

    P. C. Austin, “An introduction to propensity score methods for reducing the effects of confounding in observational studies,” Multivariate behavioral research, vol. 46, no. 3, pp. 399–424, 2011

  23. [23]

    Two-way fixed effects, the two-way mundlak regression, and difference-in-differences estimators,

    J. M. Wooldridge, “Two-way fixed effects, the two-way mundlak regression, and difference-in-differences estimators,” Empirical Economics, vol. 69, no. 5, pp. 2545–2587, 2025

  24. [24]

    Estimating dynamic treatment effects in event studies with heterogeneous treatment effects,

    L. Sun and S. Abraham, “Estimating dynamic treatment effects in event studies with heterogeneous treatment effects,” Journal of econometrics, vol. 225, no. 2, pp. 175–199, 2021

  25. [25]

    Early-stage prediction of review effort in ai-generated pull requests,

    D. S. D. Minh, H. T. Kiet, N. L. P. Quy, P. P. Hoa, T. C. Nguyen, N. D. H. Duong, and T. B. Tran, “Early-stage prediction of review effort in ai-generated pull requests,” arXiv preprint arXiv:2601.00753, 2026

  26. [26]

    Influence of social and technical factors for evaluating contribution in github,

    J. Tsay, L. Dabbish, and J. Herbsleb, “Influence of social and technical factors for evaluating contribution in github,” in Proceedings of the 36th international conference on Software engineering, 2014, pp. 356–366

  27. [27]

    From industry claims to empirical reality: An empirical study of code review agents in pull requests,

    K. Chowdhury, D. Banik, K. Ferdous, and S. I. Shamim, “From industry claims to empirical reality: An empirical study of code review agents in pull requests,” arXiv preprint arXiv:2604.03196, 2026

  28. [28]

    An exploratory study of the pull-based software development model,

    G. Gousios, M. Pinzger, and A. v. Deursen, “An exploratory study of the pull-based software development model,” in Proceedings of the 36th international conference on software engineering, 2014, pp. 345–355