pith. sign in

arxiv: 2404.06672 · v6 · submitted 2024-04-10 · 💻 cs.SE · cs.CY

Biomedical Open Source Software: Crucial Packages and Hidden Heroes

Pith reviewed 2026-05-24 02:23 UTC · model grok-4.3

classification 💻 cs.SE cs.CY
keywords software dependenciescentrality metricsbiomedical researchopen source softwarePyPICRANBioconductordependency networks
0
0 comments X

The pith

Centrality metrics on software dependency networks identify the foundational packages biomedical research depends on most.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors extract software mentions from biomedical papers and trace their upstream dependencies across three ecosystems. They define centrality measures on the resulting dependency graphs to rank packages by how many others rely on them, directly or indirectly. This approach surfaces packages that sit deep in the stack and are rarely named in papers themselves. The work demonstrates that citation or mention data alone misses much of the actual infrastructure. If the ranking holds, stakeholders can direct maintenance and funding toward the packages whose failure would affect the largest share of research.

Core claim

Using the CZ Software Mentions Dataset, the paper builds directed dependency graphs for packages drawn from PyPI, CRAN, and Bioconductor that appear in biomedical literature, then computes centrality scores on those graphs; the packages that receive the highest scores are presented as the critical, often invisible, components of the biomedical software ecosystem.

What carries the argument

Centrality metrics computed on the directed graph whose nodes are software packages and whose edges point from a package to its upstream dependencies.

If this is right

  • High-centrality packages can be flagged for priority maintenance and funding because their removal would affect the largest number of research workflows.
  • The same network construction can be repeated on other scientific domains to locate their own hidden foundational packages.
  • Metrics that combine direct mentions with indirect dependency reach can replace simple citation counts when evaluating software impact.
  • Ecosystem maintainers gain a quantitative way to decide which packages deserve dedicated support staff or long-term archiving.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If centrality rankings remain stable across successive yearly snapshots of the dataset, they could serve as an early-warning system for packages drifting into critical status.
  • The method could be extended by weighting edges according to how often a dependency is actually invoked in code, rather than treating every declared dependency equally.
  • Cross-ecosystem comparison might reveal whether one language community (Python versus R) concentrates risk in fewer foundational packages than the other.

Load-bearing premise

The CZ Software Mentions Dataset supplies a representative sample of the packages and dependency links actually used in biomedical papers.

What would settle it

A fresh, independent extraction of software mentions from a new corpus of biomedical papers that produces a materially different top-ranked set of packages by the same centrality measures.

Figures

Figures reproduced from arXiv: 2404.06672 by Andrew Nesbitt, Boris Veytsman, Daniel Mietchen, Eva Maxfield Brown, James Howison, Jo\~ao Felipe Pimentel, Laurent H\'ebert-Dufresne, Stephan Druskat.

Figure 1
Figure 1. Figure 1: Classification of software packages inspired by Stokes’ classification system in [21]. “Nebraska” packages are software projects which have few mentions in research articles, but are highly central in a dependency network. “Pasteur” packages are both highly visible with lots of mentions and are highly central in a dependency network. networks [19]. At present, though, the situation is quite different: some… view at source ↗
Figure 2
Figure 2. Figure 2: (a) Network visualization of software packages from three ecosystems (from CRAN in green, PyPI in blue, and Bioconductor in pink) connected through their dependencies within their ecosystem and interconnected through papers that mention them. We label the top 3 most central packages in each ecosystem: ggplot2 [33], SAM [34], and PRISMA [35] for CRAN, velvet [36], tophat and pymol [37] for PyPI and DeSeq2 [… view at source ↗
Figure 3
Figure 3. Figure 3: Distribution of packages by Katz centrality and counts of their mentions in papers. Katz centrality is calculated for an unweighted graph, for a weighted graph with all nodes, or just for the largest connected cluster (LCC) for each ecosystem. In the calculations, we assumed β = 1. November 7, 2025 9/20 [PITH_FULL_IMAGE:figures/full_fig_p009_3.png] view at source ↗
read the original abstract

Despite the importance of scientific software for research, it is often not formally recognized and rewarded. This is especially true for foundational libraries, which are hidden below packages visible to the users (and thus doubly hidden, since even the packages directly used in research are frequently not visible in the paper). Research stakeholders like funders, infrastructure providers, and other organizations need to understand the complex network of computer programs that contemporary research relies upon. In this work, we use the CZ Software Mentions Dataset to map the upstream dependencies of software used in biomedical papers and find the packages critical to scientific software ecosystems. We propose centrality metrics for the network of software dependencies, analyze three ecosystems (PyPi, CRAN, Bioconductor), and determine the packages with the highest centrality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that the CZ Software Mentions Dataset can be used to map upstream dependencies of software mentioned in biomedical papers, that centrality metrics can be defined on the resulting dependency networks, and that analysis of the PyPI, CRAN, and Bioconductor ecosystems reveals the packages with highest centrality that are critical yet hidden in scientific software stacks.

Significance. If the dataset is shown to be representative and the centrality definitions are made explicit and reproducible, the work could help funders and infrastructure providers identify foundational packages that merit greater recognition. The multi-ecosystem scope is a constructive feature. The grounding in an external mention dataset is noted as a positive, data-driven approach.

major comments (2)
  1. [Data and Methods] Data section: no coverage statistics, comparison against an independent corpus (e.g., PubMed Central full-text), or bias analysis is supplied to establish that the CZ Software Mentions Dataset supplies a representative sample of packages and dependency relations actually invoked in biomedical papers. This assumption is load-bearing for the upstream-dependency mapping and all subsequent centrality rankings.
  2. [Methods] Centrality definition: the manuscript does not supply explicit formulas or pseudocode for the proposed centrality metrics on the dependency graphs, nor does it report how the graphs are constructed from mentions (e.g., edge-weighting, handling of transitive dependencies). Without these, the claim that the highest-centrality packages are the “critical” ones cannot be evaluated.
minor comments (1)
  1. [Abstract] Abstract: the three ecosystems are named but the scale of the extracted networks (number of nodes/edges) is not stated.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which identify key areas where additional transparency and validation will strengthen the manuscript. We agree that both the representativeness of the dataset and the explicit definition of centrality metrics require elaboration. We will revise the manuscript to incorporate these elements as detailed below.

read point-by-point responses
  1. Referee: [Data and Methods] Data section: no coverage statistics, comparison against an independent corpus (e.g., PubMed Central full-text), or bias analysis is supplied to establish that the CZ Software Mentions Dataset supplies a representative sample of packages and dependency relations actually invoked in biomedical papers. This assumption is load-bearing for the upstream-dependency mapping and all subsequent centrality rankings.

    Authors: We agree that the manuscript should demonstrate the representativeness of the CZ Software Mentions Dataset. The current version relies on the dataset without providing coverage statistics or bias analysis. In the revision we will add a dedicated subsection to the Data section that reports: the total number of papers and unique packages extracted; basic coverage metrics such as the fraction of biomedical papers containing software mentions; a comparison against a random sample of PubMed Central full-text articles (reporting overlap in mentioned packages); and a brief discussion of potential biases (e.g., field or ecosystem skew). These additions will directly support the validity of the downstream dependency mapping and centrality results. revision: yes

  2. Referee: [Methods] Centrality definition: the manuscript does not supply explicit formulas or pseudocode for the proposed centrality metrics on the dependency graphs, nor does it report how the graphs are constructed from mentions (e.g., edge-weighting, handling of transitive dependencies). Without these, the claim that the highest-centrality packages are the “critical” ones cannot be evaluated.

    Authors: We acknowledge that the manuscript describes the centrality metrics at a conceptual level but omits explicit formulas, pseudocode, and graph-construction details. In the revised Methods section we will insert: (i) the precise mathematical definitions of the centrality measures applied to the directed dependency graphs (including any adaptations of standard metrics such as degree or betweenness); (ii) pseudocode outlining the graph-construction procedure from the mention data; (iii) the chosen edge-weighting scheme (mention frequency); and (iv) the decision to use direct dependencies only, with a short justification for not computing transitive closures. These additions will make the “critical package” identification fully reproducible and evaluable. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical mapping from external dataset using standard network metrics

full rationale

The paper constructs dependency networks from the CZ Software Mentions Dataset (an external resource) and applies standard centrality metrics to rank packages in PyPI, CRAN, and Bioconductor. No equations, fitted parameters, self-definitional steps, or load-bearing self-citations appear in the provided abstract or described approach. Results are presented as direct extractions and rankings from the input data rather than derivations that reduce to the paper's own definitions or prior outputs by construction. The analysis remains self-contained against the external dataset benchmark.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that the named dataset faithfully captures real dependency usage.

pith-pipeline@v0.9.0 · 5690 in / 1101 out tokens · 19072 ms · 2026-05-24T02:23:45.417190+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages · 2 internal anchors

  1. [1]

    Scientific Software Production: Incentives and Collaboration

    Howison J, Herbsleb JD. Scientific Software Production: Incentives and Collaboration. In: Proceedings of the ACM 2011 Conference on Computer Supported Cooperative Work. CSCW ’11. New York, NY, USA: Association for Computing Machinery; 2011. p. 513–522. Available from: https://doi.org/10.1145/1958824.1958904

  2. [2]

    Understanding the scientific software ecosystem and its impact: Current and future measures

    Howison J, Deelman E, McLennan MJ, Ferreira da Silva R, Herbsleb JD. Understanding the scientific software ecosystem and its impact: Current and future measures. Research Evaluation. 2015;24(4):454–470. doi:10.1093/reseval/rvv014

  3. [3]

    The unsung heroes of scientific software

    Singh Chawla D. The unsung heroes of scientific software. Nature. 2016;529(7584):115–116. doi:10.1038/529115a

  4. [4]

    Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature

    Howison J, Bullard J. Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature. JASIST. 2016;67(9):2137–2155. doi:10.1002/asi.23538. November 7, 2025 16/20

  5. [5]

    We need to talk about the lack of investment in digital research infrastructure

    Knowles R, Mateen BA, Yehudi Y. We need to talk about the lack of investment in digital research infrastructure. Nature Computational Science. 2021;1(3):169–171. doi:10.1038/s43588-021-00048-5

  6. [6]

    Don’t Mention It: An Approach to Assess Challenges to Using Software Mentions for Citation and Discoverability Research

    Druskat S, Hong NPC, Buzzard S, Konovalov O, Kornek P. Don’t Mention It: An Approach to Assess Challenges to Using Software Mentions for Citation and Discoverability Research. arXiv. 2024;2024(arXiv:2402.14602). doi:10.48550/arXiv.2402.14602

  7. [7]

    SoMeSci—A 5 Star Open Data Gold Standard Knowledge Graph of Software Mentions in Scientific Articles

    Schindler D, Bensmann F, Dietze S, Kr¨ uger F. SoMeSci—A 5 Star Open Data Gold Standard Knowledge Graph of Software Mentions in Scientific Articles. In: Proceedings of the 30th ACM International Conference on Information & Knowledge Management. New York, NY, USA: Association for Computing Machinery; 2021. p. 4574–4583. Available from: https://doi.org/10.1...

  8. [8]

    SoftCite dataset: A dataset of software mentions in biomedical and economic research publications

    Du C, Cohoon J, Lopez P, Howison J. SoftCite dataset: A dataset of software mentions in biomedical and economic research publications. JASIST. 2021;72(7):870–884. doi:10.1002/asi.24454

  9. [9]

    CZ Software Mentions: A large dataset of software mentions in the biomedical literature; 2022

    Istrate AM, Veytsman B, Li D, Taraborelli D, Torkar M, Williams I. CZ Software Mentions: A large dataset of software mentions in the biomedical literature; 2022. Available from: https://datadryad.org/stash/dataset/doi:10.5061/dryad.6wwpzgn2c

  10. [10]

    A large dataset of software mentions in the biomedical literature

    Istrate AM, Li D, Taraborelli D, Torkar M, Veytsman B, Williams I. A large dataset of software mentions in the biomedical literature. arXiv. 2022;doi:10.48550/ARXIV.2209.00693

  11. [11]

    Guiding Development Work Across a Software Ecosystem by Visualizing Usage Data

    Bogart C, Howison J, Herbsleb J. Guiding Development Work Across a Software Ecosystem by Visualizing Usage Data. arXiv e-prints. 2020; p. arXiv:2012.05987. doi:10.48550/arXiv.2012.05987

  12. [12]

    The Nebraska problem in open source software development

    Hatta M. The Nebraska problem in open source software development. Annals of Business Administrative Science. 2022;21(5):91–102. doi:10.7880/abas.0220914a

  13. [13]

    What we know about the xz utils backdoor that almost infected the world; 2024

    Goodin D. What we know about the xz utils backdoor that almost infected the world; 2024. Ars Technica. Available from: https://arstechnica.com/security/2024/04/ what-we-know-about-the-xz-utils-backdoor-that-almost-infected-the-world/

  14. [14]

    Computational reproducibility of Jupyter notebooks from biomedical publications

    Samuel S, Mietchen D. Computational reproducibility of Jupyter notebooks from biomedical publications. GigaScience. 2024;13. doi:10.1093/GIGASCIENCE/GIAD113

  15. [15]

    Dependency; 2020

    Munroe RP. Dependency; 2020. Available from:https://xkcd.com/2347/

  16. [16]

    Transitive Credit as a Means to Address Social and Technological Concerns Stemming from Citation and Attribution of Digital Products

    Katz DS. Transitive Credit as a Means to Address Social and Technological Concerns Stemming from Citation and Attribution of Digital Products. Journal of Open Research Software. 2014;doi:10.5334/jors.be

  17. [17]

    Implementing Transitive Credit with JSON-LD

    Katz DS, Smith AM. Implementing Transitive Credit with JSON-LD. arXiv. 2014;doi:10.48550/arXiv.1407.5117

  18. [18]

    Citation File Format; 2021

    Druskat S, Spaaks JH, Chue Hong N, Haines R, Baker J, Bliven S, et al.. Citation File Format; 2021. Available from: https://doi.org/10.5281/zenodo.5171937. November 7, 2025 17/20

  19. [19]

    Software and Dependencies in Research Citation Graphs

    Druskat S. Software and Dependencies in Research Citation Graphs. Computing in Science & Engineering. 2020;22(2):8–21. doi:10.1109/MCSE.2019.2952840

  20. [20]

    When and How to Make Breaking Changes: Policies and Practices in 18 Open Source Software Ecosystems

    Bogart C, K¨ astner C, Herbsleb J, Thung F. When and How to Make Breaking Changes: Policies and Practices in 18 Open Source Software Ecosystems. ACM Trans Softw Eng Methodol. 2021;30(4). doi:10.1145/3447245

  21. [21]

    Pasteur’s Quadrant: Basic Science and Technological Innovation

    Stokes DE. Pasteur’s Quadrant: Basic Science and Technological Innovation. Washington, D. C.: Brookings Institute Press; 1997

  22. [22]

    Exploring the dependencies of the CZI mentions dataset; 2023

    Brown EM, Nesbitt A, H´ ebert-Dufresne L, Veytsman B, Pimentel JaF, Druskat S, et al.. Exploring the dependencies of the CZI mentions dataset; 2023. Available from:https://github.com/borisveytsman/SoftwareImpactHackathon2023_ Tracing_dependencies

  23. [23]

    Package and Dependency Metadata for CZI Hackathon: Mapping the Impact of Research Software in Science; 2023

    Nesbitt A. Package and Dependency Metadata for CZI Hackathon: Mapping the Impact of Research Software in Science; 2023. Zenodo

  24. [24]

    OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts

    Priem J, Piwowar H, Orr R. OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts. arXiv e-prints. 2022; p. arXiv:2205.01833. doi:10.48550/arXiv.2205.01833

  25. [25]

    A Dependency Graph for 460,000 Papers and Their Software Mentions from the CZI Software Mentions Dataset; 2023

    Brown EM. A Dependency Graph for 460,000 Papers and Their Software Mentions from the CZI Software Mentions Dataset; 2023. Available from: https://doi.org/10.5281/zenodo.10048132

  26. [26]

    GEXF File Format; 2009

    GEXF Working Group. GEXF File Format; 2009. Available from: https://gexf.net/

  27. [27]

    Three Perspectives on Centrality

    Borgatti SP, Everett MG. Three Perspectives on Centrality. In: Light R, Moody J, editors. The Oxford Handbook of Social Networks. Oxford University Press

  28. [28]

    Some unique properties of eigenvector centrality

    Bonacich P. Some unique properties of eigenvector centrality. Social Networks. 2007;29(4):555–564. doi:10.1016/J.SOCNET.2007.04.002

  29. [29]

    The anatomy of a large-scale hypertextual Web search engine

    Brin S, Page L. The anatomy of a large-scale hypertextual Web search engine. Computer Networks. 1998;30(1-7):107–117. doi:10.1016/S0169-7552(98)00110-X

  30. [30]

    A new status index derived from sociometric analysis

    Katz L. A new status index derived from sociometric analysis. Psychometrika. 1953;18(1):39–43. doi:10.1007/BF02289026

  31. [31]

    Diffusion of Innovations, 5th Edition

    Rogers EM. Diffusion of Innovations, 5th Edition. Free Press; 2003

  32. [32]

    A Survey of Models and Algorithms for Social Influence Analysis

    Sun J, Tang J. A Survey of Models and Algorithms for Social Influence Analysis. In: Aggarwal CC, editor. Social Network Data Analytics. Boston, MA: Springer US; 2011. p. 177–214

  33. [33]

    Wickham H. ggplot2. Wiley interdisciplinary reviews: computational statistics. 2011;3(2):180–185

  34. [34]

    Sparse additive models

    Ravikumar P, Lafferty J, Liu H, Wasserman L. Sparse additive models. Journal of the Royal Statistical Society Series B: Statistical Methodology. 2009;71(5):1009–1030

  35. [35]

    Learning stateful models for network honeypots

    Krueger T, Gascon H, Kr¨ amer N, Rieck K. Learning stateful models for network honeypots. In: Proceedings of the 5th ACM workshop on Security and artificial intelligence; 2012. p. 37–48. November 7, 2025 18/20

  36. [36]

    Velvet; 2015

    Wood S. Velvet; 2015. Available from:https://pypi.org/project/velvet

  37. [37]

    Pymol: An open-source molecular graphics tool

    DeLano WL, et al. Pymol: An open-source molecular graphics tool. CCP4 Newsl protein crystallogr. 2002;40(1):82–92

  38. [38]

    Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

    Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome biology. 2014;15(12):550

  39. [39]

    edgeR: a Bioconductor package for differential expression analysis of digital gene expression data

    Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. bioinformatics. 2010;26(1):139–140

  40. [40]

    Limma: linear models for microarray data

    Smyth GK. Limma: linear models for microarray data. In: Bioinformatics and computational biology solutions using R and Bioconductor. Springer; 2005. p. 397–420

  41. [41]

    limma powers differential expression analyses for RNA-sequencing and microarray studies

    Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic acids research. 2015;43(7):e47–e47

  42. [42]

    Concentration and dependency ratios [English translation of the original 1909 paper]

    Gini C. Concentration and dependency ratios [English translation of the original 1909 paper]. Rivista di Politica Economica. 1997;87:769–789

  43. [43]

    vctrs: Vector Helpers; 2023

    Wickham H, Henry L, Vaughan D. vctrs: Vector Helpers; 2023. Available from: https://CRAN.R-project.org/package=vctrs

  44. [44]

    withr: Run Code ‘With’ Temporarily Modified Global State; 2024

    Hester J, Henry L, M¨ uller K, Ushey K, Wickham H, Chang W. withr: Run Code ‘With’ Temporarily Modified Global State; 2024. Available from: https://CRAN.R-project.org/package=withr

  45. [45]

    isoband: Generate Isolines and Isobands from Regularly Spaced Elevation Grids; 2022

    Wickham H, Wilke CO, Pedersen TL. isoband: Generate Isolines and Isobands from Regularly Spaced Elevation Grids; 2022. Available from: https://CRAN.R-project.org/package=isoband

  46. [46]

    newick; 2021

    Schultz D, Ebbert M, De Coster W. newick; 2021. Available from: https://pypi.org/project/pauvre/

  47. [47]

    Newick; 2025

    Forkel R. Newick; 2025. Available from:https://pypi.org/project/newick/

  48. [48]

    setuptools; 2025

    Python Packaging Authority. setuptools; 2025. Available from: https://pypi.org/project/setuptools/

  49. [49]

    Welcome to the tidyverse

    Wickham H, Averick M, Bryan J, Chang W, McGowan LD, Fran¸ cois R, et al. Welcome to the tidyverse. Journal of Open Source Software. 2019;4(43):1686. doi:10.21105/joss.01686

  50. [50]

    Velvet [Software]

    Zerbino DR, Foret S, Gurney JM, Slater G, Birney E, Marshall J, et al. Velvet [Software]. Software Heritage. 2014

  51. [51]

    Velvet: Algorithms for de Novo Short Read Assembly Using de Bruijn Graphs

    Zerbino DR, Birney E. Velvet: Algorithms for de Novo Short Read Assembly Using de Bruijn Graphs. Genome Research. 2008;18(5):821–829. doi:10.1101/gr.074492.107

  52. [52]

    TopHat; 2012

    The TopHat developers. TopHat; 2012. Available from: https://pypi.org/project/TopHat

  53. [53]

    TopHat: Discovering Splice Junctions with RNA-Seq

    Trapnell C, Pachter L, Salzberg SL. TopHat: Discovering Splice Junctions with RNA-Seq. Bioinformatics. 2009;25(9):1105–1111. doi:10.1093/bioinformatics/btp120. November 7, 2025 19/20

  54. [54]

    TopHat2: Accurate Alignment of Transcriptomes in the Presence of Insertions, Deletions and Gene Fusions

    Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: Accurate Alignment of Transcriptomes in the Presence of Insertions, Deletions and Gene Fusions. Genome Biology. 2013;14(4):R36. doi:10.1186/gb-2013-14-4-r36

  55. [55]

    GraphPad prism, data analysis, and scientific graphing

    Swift ML. GraphPad prism, data analysis, and scientific graphing. Journal of chemical information and computer sciences. 1997;37(2):411–412

  56. [56]

    Gephi: An Open Source Software for Exploring and Manipulating Networks

    Bastian M, Heymann S, Jacomy M. Gephi: An Open Source Software for Exploring and Manipulating Networks. In: International AAAI Conference on Weblogs and Social Media. AAAI; 2009. p. 361–362. Available from: http://www.aaai.org/ocs/index.php/ICWSM/09/paper/view/154

  57. [57]

    An updated set of basic linear algebra subprograms (BLAS)

    Blackford LS, Petitet A, Pozo R, Remington K, Whaley RC, Demmel J, et al. An updated set of basic linear algebra subprograms (BLAS). ACM Transactions on Mathematical Software. 2002;28(2):135–151

  58. [58]

    LAPACK users’ guide

    Anderson E, Bai Z, Bischof C, Blackford LS, Demmel J, Dongarra J, et al. LAPACK users’ guide. SIAM; 1999

  59. [59]

    The penumbra of open source: projects outside of centralized platforms are longer maintained, more academic and more collaborative

    Trujillo MZ, H´ ebert-Dufresne L, Bagrow J. The penumbra of open source: projects outside of centralized platforms are longer maintained, more academic and more collaborative. EPJ Data Science. 2022;11(1):31

  60. [60]

    Support scientific software infrastructure by requiring SBOMs for federally funded research; 2024

    Howison J, Ram K. Support scientific software infrastructure by requiring SBOMs for federally funded research; 2024. Available from: https://fas.org/publication/sboms-hardware/. November 7, 2025 20/20