arxiv: 2604.13129 · v1 · submitted 2026-04-13 · 💻 cs.SE

How Developers Adopt, Use, and Evolve CI/CD Caching: An Empirical Study on GitHub Actions

Pith reviewed 2026-05-10 15:02 UTC · model grok-4.3

classification 💻 cs.SE
keywords CI/CD caching, GitHub Actions, empirical study, workflow evolution, maintenance patterns, continuous integration, software engineering

The pith

Cache-adopting GitHub repositories are more active than non-adopters, and their caching setups evolve through frequent human-driven fixes and later bot-driven version updates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper analyzes 952 GitHub repositories to compare those that adopt CI/CD caching in GitHub Actions against those that do not. It examines 1,556 workflow files and over 10,000 commits to characterize how caching is configured at the job and step levels and how those configurations change over time. The study finds that adopters tend to run more active and popular projects, that caching appears in many different job types through varied mechanisms, and that updates follow repetitive patterns driven by distinct needs. A reader would care because caching reduces repeated work in builds and tests, yet the data show it demands ongoing maintenance that could be eased with better automation.

Core claim

Through examination of 266 cache-adopting repositories and 686 non-adopters, the work establishes that cache adopters exhibit greater activity and popularity; that caching is applied across multiple CI/CD job types using a variety of mechanisms rather than one standard approach; that caching configurations undergo frequent, repetitive changes with faster evolution in build and test jobs; and that parameter updates are mainly human-driven to resolve issues while version updates occur later and are frequently bot-driven for dependency maintenance.

What carries the argument

The classification of 17,185 workflow configuration changes across 10,373 commits, distinguishing cache-related modifications by type (parameter vs version) and by actor (human vs bot) within GitHub Actions workflow files.

If this is right

  • Cache-adopting repositories show higher levels of activity and popularity compared with non-adopters.
  • Caching appears across many CI/CD job types through diverse mechanisms instead of a single standardized method.
  • Caching configurations change frequently in repetitive patterns, with quicker evolution in build and test jobs than in other types.
  • Parameter updates are driven mainly by humans to fix problems, whereas version updates happen later and are often performed by bots for dependency maintenance.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Tooling that automates parameter tuning based on common failure patterns could reduce the human maintenance burden documented in the study.
  • The observed variety of caching approaches points to a need for platform-level defaults or templates that might lower the barrier to effective use.
  • The distinction between human and bot drivers suggests that dependency bots already handle part of the work, leaving opportunity to extend similar automation to parameter-level fixes.

Load-bearing premise

The 952 repositories and their parsed workflows and commits form a representative sample of GitHub Actions usage without major selection or parsing bias.

What would settle it

A larger or differently sampled study that finds no difference in activity or popularity between cache-adopting and non-adopting repositories would undermine the reported observations.

Figures

Figures reproduced from arXiv: 2604.13129 by Kazi Amit Hasan, Safwat Hassan, Steven H. H. Ding, Yuan Tian.

Figure 1: An example of GitHub Actions workflow illustrating jobs, steps, and …

Figure 2: Overview of our study.
    3 Data Collection and Preparation. In this section, we describe how we collected and prepared data for our empirical study

Figure 3: Distributions of repository characteristics for cache-adopting (with) and …

Figure 4: Cache-related maintenance activities in build jobs after enabling caching. Edges report transition probability and time-to-transition in days.
    In the following paragraphs, we describe how caching configurations evolve over time through cache maintenance activities across CI/CD job types. We begin by describing the overall cache evolution patterns visible in the transition plots, and then report the domina…

Figure 5: Cache-related maintenance activities in test jobs after enabling caching. Edges report transition probability and time-to-transition in days.
    … refer to the transition probabilities and the median time between consecutive activities. The complete evolution tables for all job types are provided in Appendix A. Observation 2.1: Cache evolution across CI/CD job types follows an iterative pattern of enabling, upd…

Figure 6: Cache-related maintenance activities in integration jobs after enabling caching. Edges report transition probability and time-to-transition in days.
    … roll back previous caching decisions. This suggests that cache configuration is an iterative process involving repeated updates and adjustments, rather than a single-pass setup. Observation 2.2: The structural complexity of cache evolution graphs differs acros…

Figure 7: Cache-related maintenance activities in release jobs after enabling caching. Edges report transition probability and time-to-transition in days.
    Observation 2.3: Cache maintenance in several CI/CD job types is dominated by repeated self-loop transitions. While Observation 2.1 shows that cache evolution generally involves enabling, updating, adding, and sometimes removing cache configurations, the transitio…

Figure 8: Cache-related maintenance activities in lint jobs after enabling caching. Edges report transition probability and time-to-transition in days.
    Observation 2.4: Parameter updates occur earlier than cache version updates, but the delay differs across job types. Shortly after enabling cache (EC), developers engage in parameter tuning (EC → P up) relatively quickly, with a median of 105.93 days for Build jobs (…

Figure 9: Cache-related maintenance activities in analyze jobs after enabling caching. Edges report transition probability and time-to-transition in days.
    Observation 2.5: Cache removal is often part of an immediate replacement. In build jobs, the transition from removing a cache to immediately adding a new one (C rm → C add) occurs with 21.84% probability and a median of 0.01 days (…

Figure 10: Cache-related maintenance activities in linux jobs after enabling caching. Edges report transition probability and time-to-transition in days.
    … times of 53.14 days in build jobs and 41.97 days in test jobs. This suggests a two-stage evolution pattern in build and test jobs. Developers first repeatedly refine cache-related parameters soon after adoption, likely to stabilize the initial setup, and only later…

Figure 11: Cache-related maintenance activities in sync jobs after enabling caching. Edges report transition probability and time-to-transition in days.
    RQ2 Summary: Cache evolution is an iterative and structurally complex process. Across all job types, cache evolution consistently involves maintaining existing configurations through parameter updates and cache version updates, often accompanied by incremental addi…
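
The figure captions above say each edge reports a transition probability and a time-to-transition in days. As a rough illustration of how such edge labels could be computed, here is a minimal Python sketch. It assumes each job's history is available as a list of `(activity, day)` pairs. The function name `transition_stats`, the label `V_up` for a version update, and the sample timelines are all invented for illustration; they are not the authors' code or data. `EC`, `P_up`, `C_rm`, and `C_add` mirror abbreviations that appear in the captions.

```python
from collections import defaultdict
from statistics import median


def transition_stats(timelines):
    """Map each (src, dst) edge to (transition probability, median gap in days)."""
    gaps = defaultdict(list)     # (src, dst) -> gaps in days between the two activities
    outgoing = defaultdict(int)  # src -> total number of transitions leaving src
    for events in timelines:
        ordered = sorted(events, key=lambda event: event[1])
        # Pair every activity with the one that follows it in time.
        for (src, t0), (dst, t1) in zip(ordered, ordered[1:]):
            gaps[(src, dst)].append(t1 - t0)
            outgoing[src] += 1
    return {
        (src, dst): (len(days) / outgoing[src], median(days))
        for (src, dst), days in gaps.items()
    }


# Invented timelines: (activity, day since the start of observation).
timelines = [
    [("EC", 0.0), ("P_up", 10.0), ("P_up", 14.0), ("V_up", 200.0)],
    [("EC", 0.0), ("P_up", 30.0), ("C_rm", 90.0), ("C_add", 90.01)],
    [("EC", 5.0), ("V_up", 105.0)],
]

stats = transition_stats(timelines)
for (src, dst), (prob, days) in sorted(stats.items()):
    print(f"{src} -> {dst}: p={prob:.2f}, median={days:.2f} days")
```

Probabilities are normalized per source activity, so the outgoing edges of any one node sum to 1. A self-loop such as `P_up -> P_up` is counted like any other edge.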
Original abstract

Continuous Integration/Continuous Delivery (CI/CD) caching is widely used to reduce repeated computation and improve CI/CD efficiency, yet maintaining effective caching requires ongoing maintenance effort. In this paper, we present the first empirical study on how developers configure and evolve caching in CI/CD workflows on GitHub Actions. We analyze 952 GitHub repositories (266 cache adopters and 686 non-adopters), to compare repository characteristics, characterize caching usage at the job and step levels, uncover patterns in caching configuration evolution, and identify the drivers of cache-related changes. Our analysis spans 1,556 workflow files, 10,373 commits, and 17,185 workflow configuration changes, including an average of 9.37 cache-related changes per repository. Our main observations are: (1) cache-adopting repositories are more active and popular than non-adopters; (2) caching is used across multiple CI/CD job types through a variety of caching mechanisms rather than a single standardized approach; (3) caching configurations evolve through frequent, repetitive maintenance patterns, with rapid updates in build and test jobs and slower evolution in other job types; and (4) cache-related modifications are driven by distinct maintenance needs: parameter updates are mainly human-driven to fix issues, while version updates occur later and are often bot-driven for dependency maintenance. Our findings quantify the substantial maintenance effort involved in CI/CD caching and highlight opportunities to improve reliability and tool support.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents the first empirical study of CI/CD caching adoption, usage, and evolution in GitHub Actions. It analyzes 952 repositories (266 cache adopters vs. 686 non-adopters), 1,556 workflow files, 10,373 commits, and 17,185 configuration changes. Key claims are that (1) cache-adopting repositories are more active and popular, (2) caching appears across job types via diverse mechanisms rather than a single standard, (3) configurations evolve through frequent repetitive maintenance (rapid in build/test jobs, slower elsewhere), and (4) parameter updates are mostly human-driven for fixes while version updates are later and often bot-driven for dependencies. The work quantifies maintenance effort and suggests tool improvements.

Significance. If the dataset and classifications are representative, the study offers concrete, large-scale evidence on the practical costs of CI/CD caching in open-source projects. It is the first such analysis focused on GitHub Actions, provides falsifiable patterns (e.g., job-type differences in evolution speed), and directly supports recommendations for better caching tooling. The public-data basis in principle allows replication.

major comments (2)
  1. [§3] §3 (Data Collection and Filtering): The selection of the 952 repositories and the detection of the 266 cache adopters are described only at a high level. No explicit search queries, date ranges, popularity thresholds, or exclusion criteria are provided, nor is there validation that the parsing correctly identifies cache steps (e.g., actions/cache or equivalent). Because every subsequent comparison and statistic conditions on this sample, the risk of selection bias directly undermines observation (1) and the generalizability of (2)–(4).
  2. [§4.3, §5] §4.3 and §5 (Evolution Analysis): The classification of 17,185 configuration changes into human- vs. bot-driven and the attribution of “parameter updates” vs. “version updates” lacks a reproducible rule set or inter-rater validation. Without these details it is impossible to assess whether the reported timing differences and driver patterns in observation (4) are robust or artifacts of the commit-message heuristics used.
minor comments (2)
  1. [Table 2] Table 2 (or equivalent) reports average changes per repository but does not include standard deviations or confidence intervals, making it harder to judge the variability behind the “9.37 cache-related changes” figure.
  2. [Abstract] The abstract states “the first empirical study” without a brief related-work sentence; a single sentence citing the closest prior CI/CD or GitHub Actions studies would strengthen the novelty claim.
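
Major comment 1 asks how cache steps such as actions/cache are identified in workflow files. The sketch below shows what a simple line-based detection rule might look like in Python. It is not the paper's parser: the two regular expressions, the labels `cache-action` and `cache-input`, and the sample workflow are assumptions made for illustration. It recognizes `uses: actions/cache@…` (including the `restore` and `save` sub-actions) and a `cache:` input with a value, as accepted by setup actions such as `actions/setup-node`.

```python
import re

# Hypothetical detection rules, not the rule set used in the paper.
USES_CACHE = re.compile(
    r"^\s*-?\s*uses:\s*actions/cache(?:/(?:restore|save))?@(\S+)"
)
CACHE_INPUT = re.compile(r"^\s*cache:\s*(\S+)")


def find_cache_lines(workflow_text):
    """Return (line number, kind, value) for each line that looks cache-related."""
    hits = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        uses = USES_CACHE.match(line)
        if uses:
            hits.append((lineno, "cache-action", uses.group(1)))
            continue
        cache_input = CACHE_INPUT.match(line)
        if cache_input:
            hits.append((lineno, "cache-input", cache_input.group(1)))
    return hits


# Invented workflow used only to exercise the rules.
SAMPLE = """\
name: ci
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          key: npm-${{ hashFiles('package-lock.json') }}
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
"""

print(find_cache_lines(SAMPLE))
```

A line-based rule like this is easy to audit but can misfire, for example on a `cache:` key that belongs to an unrelated action. That is the kind of error the manual validation requested by the referee would surface.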

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and outline the revisions we will incorporate to improve clarity, reproducibility, and robustness.

Point-by-point responses
  1. Referee: [§3] §3 (Data Collection and Filtering): The selection of the 952 repositories and the detection of the 266 cache adopters are described only at a high level. No explicit search queries, date ranges, popularity thresholds, or exclusion criteria are provided, nor is there validation that the parsing correctly identifies cache steps (e.g., actions/cache or equivalent). Because every subsequent comparison and statistic conditions on this sample, the risk of selection bias directly undermines observation (1) and the generalizability of (2)–(4).

    Authors: We agree that the original description of the repository selection and cache-step detection process was at a high level and that greater transparency is required to evaluate selection bias and replicability. In the revised manuscript we will expand §3 with the precise GitHub search queries, the exact date range used for repository discovery, all popularity and activity thresholds applied, the full list of exclusion criteria, and a detailed account of the parsing rules (including regular expressions and heuristics) used to identify cache steps such as actions/cache. We will also add a short validation subsection describing how we manually inspected a random sample of detected workflows to confirm correct identification of cache usage. These additions will directly support the generalizability claims in observations (1)–(4). revision: yes

  2. Referee: [§4.3, §5] §4.3 and §5 (Evolution Analysis): The classification of 17,185 configuration changes into human- vs. bot-driven and the attribution of “parameter updates” vs. “version updates” lacks a reproducible rule set or inter-rater validation. Without these details it is impossible to assess whether the reported timing differences and driver patterns in observation (4) are robust or artifacts of the commit-message heuristics used.

    Authors: We acknowledge that the classification rules for human- versus bot-driven changes and for parameter versus version updates were not presented with sufficient detail or validation evidence. In the revised version we will add an explicit, reproducible rule set in §4.3: bot detection will be defined by a combination of author login patterns (e.g., known bot accounts), commit-message keywords (e.g., “dependabot”, “renovate”, “auto-update”), and commit frequency heuristics; parameter updates will be distinguished from version updates by whether the changed fields affect cache-key parameters versus action version specifiers. We will include concrete examples of each category and report the results of a manual validation performed on a stratified sample of 200 changes (with inter-rater agreement statistics). This will allow readers to judge the robustness of the timing and driver patterns reported in observation (4). revision: yes
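
The rule set promised in the second response could be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation. The bot keywords `dependabot`, `renovate`, and `auto-update` come from the response above. The login patterns, the parameter names `key`, `path`, and `restore-keys`, the function names, and the example inputs are assumptions. The commit-frequency heuristic mentioned in the response is left out.

```python
import re

# Assumed patterns for illustration; the paper's exact rules are not given here.
BOT_LOGIN = re.compile(r"\[bot\]$|^dependabot|^renovate", re.IGNORECASE)
BOT_KEYWORDS = ("dependabot", "renovate", "auto-update")
VERSION_LINE = re.compile(r"uses:\s*actions/cache(?:/\w+)?@\S+")
PARAM_LINE = re.compile(r"^\s*(?:key|path|restore-keys)\s*:")


def classify_actor(author_login, message):
    """Label a commit as bot-driven or human-driven from its login and message."""
    text = message.lower()
    if BOT_LOGIN.search(author_login) or any(k in text for k in BOT_KEYWORDS):
        return "bot"
    return "human"


def classify_change(old_line, new_line):
    """Label a changed workflow line as a version update or a parameter update."""
    if (
        VERSION_LINE.search(old_line)
        and VERSION_LINE.search(new_line)
        and old_line.strip() != new_line.strip()
    ):
        return "version update"
    if PARAM_LINE.match(old_line) or PARAM_LINE.match(new_line):
        return "parameter update"
    return "other"


print(classify_actor("dependabot[bot]", "Bump actions/cache from 3 to 4"))
print(classify_change("  - uses: actions/cache@v3", "  - uses: actions/cache@v4"))
print(classify_actor("alice", "fix cache key for lockfile"))
print(classify_change("    key: npm-v1", "    key: npm-v2"))
```

Keyword matching on commit messages can mislabel a human commit that merely mentions a bot by name. The stratified manual validation described in the response is what would measure that error rate.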

Circularity Check

0 steps flagged

No circularity: empirical observations derived from independent public data analysis

full rationale

The paper conducts an empirical study by collecting and analyzing public GitHub repository data, workflow files, and commits to derive four observations about CI/CD caching adoption and evolution. No self-definitional relations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the derivation chain; the central claims rest on direct comparison of adopter vs. non-adopter statistics and change patterns extracted from the dataset. Sampling and parsing choices may affect generalizability or introduce bias (a validity concern), but they do not reduce any result to its inputs by construction, as required for circularity. The analysis is self-contained against external benchmarks and does not invoke uniqueness theorems or ansatzes from prior author work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The empirical claims depend on the validity of the data extraction from GitHub and on the categorization of changes as human- or bot-driven.

axioms (1)
  • domain assumption: The chosen repositories and time periods reflect general developer practices in using GitHub Actions.
    Basis for generalizing the findings beyond the sample.

pith-pipeline@v0.9.0 · 5570 in / 1235 out tokens · 83477 ms · 2026-05-10T15:02:34.668235+00:00 · methodology
