pith. sign in

arxiv: 2604.28146 · v1 · submitted 2026-04-30 · 💻 cs.SE

Unsafe and Unused? A History of Utility Code in Mature Open Source Projects

Pith reviewed 2026-05-07 04:59 UTC · model grok-4.3

classification 💻 cs.SE
keywords utility codenaming conventionssoftware vulnerabilitiesopen source evolutionlongitudinal miningGit history analysiscode reusesecurity correlation
0
0 comments X

The pith

Files with 'util' in their name are up to 2.75 times more likely to be tied to vulnerabilities than other files in mature open source projects.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tracks source files containing 'util' in their path names across the full Git histories of seven long-lived open source projects. It takes snapshots every 30 days, follows renames, and measures how these files change in complexity, who works on them, and how often they appear in vulnerability reports over 147 total project-years. The central finding is that util files do not fade away as projects mature but instead show a markedly higher association with security issues. This pattern holds even though the naming convention is meant to mark code for broad reuse and to avoid duplication. The results indicate that a simple filename choice can serve as a durable signal of where security attention may be needed.

Core claim

By mining commit histories at 30-day intervals and applying rename tracking, the study establishes that util files persist throughout the lifetime of mature projects and can be as much as 2.75 times more likely to be involved in a vulnerability than non-util files. The same longitudinal data also show correlations between util naming, code complexity, and the number of developers who touch the files.

What carries the argument

The 30-day snapshot mining of Git repositories with rename tracking that follows each util file across its entire lifetime in the codebase while linking it to vulnerability reports, complexity metrics, and collaboration counts.

If this is right

  • Util files remain in the codebase rather than being removed or refactored away as projects age.
  • These files tend to receive contributions from fewer developers than other parts of the system.
  • Security issues concentrate in util files at higher rates, suggesting targeted review could reduce overall project risk.
  • Projects that adopt the util naming pattern heavily, such as Tomcat with nearly 18 percent of its files, inherit a persistent security exposure.
  • Initial design decisions about reusable code can produce measurable differences in vulnerability exposure decades later.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • A filename containing 'util' could function as an automatic flag for extra security review in large codebases.
  • Tools that scan for common naming patterns might help new contributors avoid creating files that later become maintenance hotspots.
  • The persistence of util files raises the question of whether other informal naming conventions carry similar long-term risk signals.
  • If the pattern holds in additional projects, organizations could adjust code-organization guidelines to favor more specific names over generic utility directories.

Load-bearing premise

Vulnerability reports can be reliably mapped to the exact source files that introduced the issue, and the string 'util' in a filename accurately marks code that was written with the intent of being reused by many developers rather than serving as a generic catch-all.

What would settle it

Re-mapping the same vulnerability reports to files using stricter authorship or commit-blame rules and finding that the 2.75-times elevation for util files disappears.

Figures

Figures reproduced from arXiv: 2604.28146 by Andy Meneely, Angela Ngo, Brandon Keller, Kaitlin Yandik.

Figure 1
Figure 1. Figure 1: Prevalence of util Files across all seven projects, excluding test util files. HTTPD to over 19% of files being util in Apache Tomcat. A graph of util file prevalence is shown in view at source ↗
Figure 4
Figure 4. Figure 4: Collaboration Patterns in the Django Project view at source ↗
Figure 5
Figure 5. Figure 5: Evolution of Odds Ratios for each project. Test Files view at source ↗
read the original abstract

Filenames are a concise means of conveying information about source code to fellow developers. One such convention is util. Commonly understood to stand for "utility", filenames with the letters util are often an indication that the file contains code that may be broadly useful or reusable. Some projects use this convention heavily, for example, the Apache Tomcat server contains 925 files with util in the path name, which is 17.9% of all source code files in the tree. While the intent of the name may be to prevent duplicate code and reduce workload, what actually happens to util code over time? Do projects move away from util code as they mature? Are util files being used by fellow colleagues, or maintained and used by their author? The goal of our work is to help developers avoid creating unsafe and unused util files when developing their projects. We conducted a longitudinal mining study of the Git repositories of seven open source projects that have a long development history (Linux kernel, Django, FFmpeg, httpd, Struts, systemd, Tomcat). We analyzed how util usage, complexity, developer collaboration, and security are potentially correlated within these projects. Our longitudinal analysis was measured at 30-day intervals throughout the entire history of each project, resulting in 1773 snapshots over 147 project-years of development. We conducted rename tracking at every 30-day snapshot to examine util files over their entire lifetime in a codebase. For example, we found that a util file can be as much as 2.75 times more likely to be involved in a vulnerability than non-util files. While every project can adopt their own naming conventions, the ubiquity and longevity of util files shows a broader developer intent that is useful for understanding the socio-technical nature of software development.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper reports a longitudinal mining study of 'util' files (those with 'util' in the filename or path) across seven mature open source projects (Linux kernel, Django, FFmpeg, httpd, Struts, systemd, Tomcat). It analyzes 1773 snapshots at 30-day intervals over 147 project-years, incorporating rename tracking to follow file lifetimes. The study examines correlations with file usage, complexity, developer collaboration, and security, with the headline result that util files can be as much as 2.75 times more likely to be involved in a vulnerability than non-util files. The work aims to inform developers on avoiding unsafe or unused utility code.

Significance. If the vulnerability association is robust to mapping accuracy and confounding factors, the study offers a large-scale empirical view of how a common naming convention evolves in long-lived projects and its potential security implications. The longitudinal design with rename tracking across 147 project-years is a clear strength, providing socio-technical insights into code reuse practices that could guide naming guidelines in open-source development. The scale of the dataset strengthens the descriptive findings on persistence and collaboration patterns.

major comments (3)
  1. [§4] §4 (Data Collection and Vulnerability Mapping): The paper provides no description of the vulnerability data sources (e.g., specific CVE databases, bug trackers, or fixing commits), the operationalization of 'involved in a vulnerability' (e.g., whether a file must contain the root cause versus appearing in any patch), or any validation such as a manual review set or inter-rater agreement. This directly undermines the 2.75x likelihood claim, as inaccurate attribution of incidental changes could produce the observed ratio without reflecting properties of util code.
  2. [§5.2] §5.2 (Security Results): No statistical controls or matching are reported for known confounders such as file size, age, centrality in the call graph, or number of contributors, all of which correlate with vulnerability likelihood in prior work. Without regression, propensity-score matching, or stratification, the 2.75x ratio cannot be interpreted as an effect of the 'util' naming convention itself.
  3. [§3] §3 (Definition of Util Files): Reliance on string matching for 'util' in paths is presented without validation that this captures developer intent for broadly reusable code rather than catch-all directories. This assumption affects all downstream correlations (usage, complexity, collaboration) and should be tested against a sample of manual classifications.
minor comments (2)
  1. [Abstract] The abstract states the 2.75x result without referencing the specific table or figure that reports it; add an explicit cross-reference.
  2. [§5.1] Longitudinal plots (e.g., of util file counts over time) would benefit from confidence bands or per-project breakdowns to clarify variability across the seven projects.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We are grateful to the referee for their careful reading and valuable suggestions for improving the paper's methodological transparency and statistical rigor. We have prepared revisions to address all major comments and respond point-by-point below.

read point-by-point responses
  1. Referee: [§4] §4 (Data Collection and Vulnerability Mapping): The paper provides no description of the vulnerability data sources (e.g., specific CVE databases, bug trackers, or fixing commits), the operationalization of 'involved in a vulnerability' (e.g., whether a file must contain the root cause versus appearing in any patch), or any validation such as a manual review set or inter-rater agreement. This directly undermines the 2.75x likelihood claim, as inaccurate attribution of incidental changes could produce the observed ratio without reflecting properties of util code.

    Authors: We agree with the referee that the description of vulnerability data collection was insufficient in the original manuscript. In the revised version, we will expand Section 4 to specify the vulnerability data sources (including CVE databases and bug trackers used), the precise operationalization of a file being 'involved in a vulnerability' (files modified in commits addressing reported vulnerabilities), and details of any validation steps such as manual sampling and inter-rater reliability. These additions will strengthen the credibility of the reported 2.75 times likelihood ratio. revision: yes

  2. Referee: [§5.2] §5.2 (Security Results): No statistical controls or matching are reported for known confounders such as file size, age, centrality in the call graph, or number of contributors, all of which correlate with vulnerability likelihood in prior work. Without regression, propensity-score matching, or stratification, the 2.75x ratio cannot be interpreted as an effect of the 'util' naming convention itself.

    Authors: The referee is correct that no controls for potential confounders were applied in the reported analysis. Our presentation of the 2.75x ratio is intended as an unadjusted descriptive statistic highlighting an association, not a claim of causal impact from the 'util' naming practice. To better address this, we will incorporate in the revision a regression analysis in Section 5.2 that controls for file size, age, contributor count, and call graph centrality. We will report both the raw and adjusted associations to clarify the role of the util convention independent of these factors. revision: yes

  3. Referee: [§3] §3 (Definition of Util Files): Reliance on string matching for 'util' in paths is presented without validation that this captures developer intent for broadly reusable code rather than catch-all directories. This assumption affects all downstream correlations (usage, complexity, collaboration) and should be tested against a sample of manual classifications.

    Authors: We acknowledge that the string-matching approach for identifying util files was not validated against manual judgments of developer intent in the submitted manuscript. In the revised Section 3, we will add a validation experiment involving manual classification of a sample of files by multiple authors, reporting agreement metrics and the proportion of string-matched files that align with the intent of reusable utility code. This will help quantify any noise introduced by the automated definition and its impact on the subsequent analyses of usage, complexity, and collaboration. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical counts from external data

full rationale

The paper conducts a longitudinal empirical mining study over Git histories of seven projects, computing vulnerability involvement ratios by directly counting attributions from CVE entries, bug trackers, and fixing commits to files containing 'util' in the path name versus all other files. The 2.75x figure is a measured ratio across 1773 snapshots with rename tracking; it does not arise from any internal definition, fitted parameter renamed as a prediction, self-citation chain, or ansatz. All load-bearing steps (file classification, snapshot extraction, vulnerability mapping) operate on observable external repository artifacts rather than reducing to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only the abstract is available, so the ledger reflects assumptions implied by the described methodology rather than explicit statements in the full text.

axioms (1)
  • domain assumption Vulnerability reports can be reliably attributed to individual source files in the studied repositories
    The 2.75x likelihood claim requires accurate mapping from vulnerability databases to specific files at specific historical snapshots.

pith-pipeline@v0.9.0 · 5625 in / 1336 out tokens · 52150 ms · 2026-05-07T04:59:43.743689+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

  1. [1]

    CVE: Common Vulnerabilities and Exposures

  2. [2]

    CWE - Common Weakness Enumeration

  3. [3]

    universal-ctags/ctags, Mar. 2025. original-date: 2010-03-25T10:43:13Z

  4. [4]

    In2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)(Madrid, ES, May 2021), IEEE, pp

    Alsuhaibani, R., Newman, C., Decker, M., Collard, M., and Maletic, J.On the Naming of Methods: A Survey of Professional Developers. In2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)(Madrid, ES, May 2021), IEEE, pp. 587–599

  5. [5]

    G.The Language of Programming: On the Vocabu- lary of Names

    Amit, N., and Feitelson, D. G.The Language of Programming: On the Vocabu- lary of Names. In2022 29th Asia-Pacific Software Engineering Conference (APSEC) (Japan, Dec. 2022), IEEE, pp. 21–30

  6. [6]

    2017), 3057–3087

    Antinyan, V., Staron, M., and Sandberg, A.Evaluating code complexity triggers, use of complexity measures and the influence of code complexity on maintenance time.Empirical Software Engineering 22, 6 (Dec. 2017), 3057–3087

  7. [7]

    InInternational Conference on High Performance Computing in Asia-Pacific Region(New York, NY, USA, 2022), HPCAsia2022, Association for Computing Machinery, p

    Arakawa, T., Yashiro, H., and Nakajima, K.Development of a coupler h3- open-util/mp. InInternational Conference on High Performance Computing in Asia-Pacific Region(New York, NY, USA, 2022), HPCAsia2022, Association for Computing Machinery, p. 72–83

  8. [8]

    G.Effects of variable names on comprehension: An empirical study

    Avidan, E., and Feitelson, D. G.Effects of variable names on comprehension: An empirical study. In2017 IEEE/ACM 25th International Conference on Program Comprehension (ICPC)(2017), pp. 55–65. [9]Booth, H.National Vulnerability Database, 2015. [10]Boyter, B.boyter/scc, Mar. 2025. original-date: 2018-03-01T06:44:25Z

  9. [9]

    In2011 27th IEEE International Conference on Software Maintenance (ICSM)(Williamsburg, VA, USA, Sept

    Butler, S., Wermelinger, M., Yu, Y., and Sharp, H.Mining java class naming conventions. In2011 27th IEEE International Conference on Software Maintenance (ICSM)(Williamsburg, VA, USA, Sept. 2011), IEEE, pp. 93–102

  10. [10]

    A., McIntosh, S., Shang, W., Kulesza, U., Coelho, R., and Hassan, A

    Da Costa, D. A., McIntosh, S., Shang, W., Kulesza, U., Coelho, R., and Hassan, A. E.A Framework for Evaluating the Results of the SZZ Approach for Identifying Bug-Introducing Changes.IEEE Transactions on Software Engineering 43, 7 (July 2017), 641–657

  11. [11]

    D.Refactoring sequential java code for concurrency via concurrent libraries

    Dig, D., Marrero, J., and Ernst, M. D.Refactoring sequential java code for concurrency via concurrent libraries. In2009 IEEE 31st International Conference on Software Engineering(2009), pp. 397–407

  12. [12]

    In 2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC) (Melbourne, Australia, May 2023), IEEE, pp

    Heinonen, A., and Fagerholm, F.Understanding initial API comprehension. In 2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC) (Melbourne, Australia, May 2023), IEEE, pp. 43–53. [15]Keller, B.bnk5096/Util-Files: EASE2026, Apr. 2026

  13. [13]

    Mashhadi, E., Ahmadvand, H., and Hemmati, H.Method-level bug severity prediction using source code metrics and llms. 635–646

  14. [14]

    A., McQueen, T

    McQueen, M. A., McQueen, T. A., Boyer, W. F., and Chaffin, M. R.Empir- ical Estimates and Observations of 0Day Vulnerabilities. In2009 42nd Hawaii International Conference on System Sciences(Waikoloa, Hawaii, USA, 2009), IEEE, pp. 1–12. [18]Meneely, A., Jha, A., Borger, R., Sxm7571, Thyng, M., Wong, E., 17mgeffert, Mirley, K., Valletta, N., Austin, Vittoz...

  15. [15]

    V., Khomh, F., Adams, B., Engström, E., and Petersen, K.On rapid releases and software testing

    Mäntylä, M. V., Khomh, F., Adams, B., Engström, E., and Petersen, K.On rapid releases and software testing. In2013 IEEE International Conference on Software Maintenance(2013), pp. 20–29

  16. [16]

    C., Da Costa, D

    Neto, E. C., Da Costa, D. A., and Kulesza, U.The impact of refactoring changes on the SZZ algorithm: An empirical study. In2018 IEEE 25th International Con- ference on Software Analysis, Evolution and Reengineering (SANER)(Campobasso, Mar. 2018), IEEE, pp. 380–390

  17. [17]

    A.Evaluating complexity, code churn, and developer activity metrics as indicators of software vulnerabili- ties.IEEE Transactions on Software Engineering 37, 6 (2011), 772–787

    Shin, Y., Meneely, A., Williams, L., and Osborne, J. A.Evaluating complexity, code churn, and developer activity metrics as indicators of software vulnerabili- ties.IEEE Transactions on Software Engineering 37, 6 (2011), 772–787

  18. [18]

    InProceedings of the Second ACM-IEEE international symposium on Empirical software engineering and measurement (Kaiserslautern Germany, Oct

    Shin, Y., and Williams, L.An empirical model to predict security vulnera- bilities using code complexity metrics. InProceedings of the Second ACM-IEEE international symposium on Empirical software engineering and measurement (Kaiserslautern Germany, Oct. 2008), ACM, pp. 315–317

  19. [19]

    InProceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security(New York, NY, USA, 2021), CCS ’21, Association for Computing Machinery, p

    Sieck, F., Berndt, S., Wichelmann, J., and Eisenbarth, T.Util::lookup: Ex- ploiting key decoding in cryptographic libraries. InProceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security(New York, NY, USA, 2021), CCS ’21, Association for Computing Machinery, p. 2456–2473

  20. [20]

    Code vulnerability detection via signal-aware learning

    Suneja, S., Zhuang, Y., Zheng, Y., Laredo, J., Morari, A., and Khurana, U. Code vulnerability detection via signal-aware learning. 506–523

  21. [21]

    D., and Aldrich, J.Searching the state space: A qual- itative study of api protocol usability

    Sunshine, J., Herbsleb, J. D., and Aldrich, J.Searching the state space: A qual- itative study of api protocol usability. In2015 IEEE 23rd International Conference on Program Comprehension(2015), pp. 82–93

  22. [22]

    InProceedings of the 2008 workshop on Defects in large software systems(Seattle Washington, July 2008), ACM, pp

    Williams, C., and Spacco, J.SZZ revisited: verifying when changes induce fixes. InProceedings of the 2008 workshop on Defects in large software systems(Seattle Washington, July 2008), ACM, pp. 32–36

  23. [23]

    Received 20 February 2007; revised 12 March 2009; accepted 5 June 2009

    Śliwerski, J., Zimmermann, T., and Zeller, A.When do changes induce fixes? ACM SIGSOFT Software Engineering Notes 30, 4 (July 2005), 1–5. Received 20 February 2007; revised 12 March 2009; accepted 5 June 2009