Original Sin of npm: A Study on Vulnerability Propagation in JavaScript Dependency Networks
Pith reviewed 2026-05-10 05:10 UTC · model grok-4.3
The pith
21.6 percent of npm packages have at least one known vulnerability somewhere in their dependency networks, and High is the most common severity rating among them (42 percent).
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By tracing dependency networks across 1,077,946 JavaScript packages, the study establishes that 232,836 packages, or 21.60 percent, have at least one known vulnerability in their networks, with 42 percent of those rated high severity. The average interval from publication of the first vulnerable version to a fix is 4 years and 11 months, while vulnerability reports appear about 19 days after fixes become available. A small number of vulnerabilities are highly concentrated, with the top 7 covering 25 percent and the top 23 covering 50 percent of cases.
What carries the argument
Dependency networks that connect packages through direct and transitive links, allowing measurement of how vulnerabilities from a few sources reach many others.
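A minimal sketch of that mechanism, assuming a toy graph. The package names, the `deps` mapping, and the function name are hypothetical and are not the paper's data or code. A package is marked as affected when a vulnerable package is reachable through its direct or transitive dependencies; as an assumption of this sketch, a vulnerable package also counts as affected itself.

```python
from collections import deque


def affected_packages(deps, vulnerable):
    """Return packages whose dependency network contains a vulnerable package.

    deps maps a package name to the packages it depends on directly.
    vulnerable is the set of package names with a known vulnerability.
    """
    # Invert the edges so we can walk from a vulnerable package to its dependents.
    dependents = {}
    for pkg, direct in deps.items():
        for dep in direct:
            dependents.setdefault(dep, set()).add(pkg)

    affected = set(vulnerable)
    queue = deque(vulnerable)
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in affected:  # also guards against dependency cycles
                affected.add(parent)
                queue.append(parent)
    return affected


# Hypothetical five-package ecosystem with one vulnerable leaf package.
toy_deps = {
    "app": ["web"],
    "web": ["parser"],
    "parser": [],
    "cli": ["logger"],
    "logger": [],
}
toy_affected = affected_packages(toy_deps, {"parser"})
print(sorted(toy_affected))  # ['app', 'parser', 'web']
```

In this toy graph a single vulnerable package reaches three of five packages, which is the kind of amplification the study measures at ecosystem scale.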
If this is right
- Remediation efforts focused on the 23 most frequent vulnerabilities could address half of all observed cases.
- Shortening the average window of 4 years and 11 months between vulnerable release and fix would reduce overall exposure time across the network.
- Developers can lower risk by tracking and updating packages that appear frequently in many dependency chains.
- Package managers could add alerts that flag transitive dependencies tied to the concentrated top vulnerabilities.
Where Pith is reading between the lines
- The same concentration pattern could be checked in other ecosystems such as Python or Java to see whether a few root packages drive most risk there too.
- The 19-day gap between fixes and reports suggests testing whether faster coordinated disclosure would shrink exposure windows.
- Surveying actual deployed applications and comparing their real vulnerability counts to the network-wide 21.6 percent figure would test how well the model matches production use.
Load-bearing premise
The assembled vulnerability data and dependency connections correctly capture all relevant known issues and reflect actual usage patterns in practice.
What would settle it
Repeating the full analysis with an alternative vulnerability database or dependency mapping tool and obtaining markedly different shares of affected packages would show the reported percentages do not hold.
Original abstract
Understanding vulnerability propagation is essential for assessing how vulnerabilities spread across components of a software package. This supports more accurate impact analysis and enhances threat detection and mitigation. In this paper, we investigate how a small number of vulnerable JavaScript packages contribute to the creation of a disproportionately large number of vulnerable packages. This paper presents insights from 1,515 reported vulnerabilities gathered from a custom-built vulnerability database containing 1,077,946 JavaScript packages sourced from 'npm-follower' and their associated dependency networks. Dependency networks were constructed using the deps.dev API, with vulnerabilities identified by parsing package names and version numbers through the Google Open Source Vulnerability API. Our findings reveal that 61.30% (660,748) of packages are reliant on one or more dependency packages, and 21.60% (232,836) of total packages have at least one known vulnerability throughout their dependency networks, of which most (42%) are of High severity. We also found that it takes, on average, approximately 4 years and 11 months to fix a vulnerable package from when the first vulnerable version is published on npm, although publication times of vulnerabilities occur approximately 19 days after a fix is available. Finally, we observe a high concentration of frequently present vulnerabilities throughout dependency networks, with the top-7 most frequent vulnerabilities accounting for 25% of vulnerability cases and the top-23 most frequent accounting for 50%. Based on these findings, we propose recommendations for developers and package managers to mitigate the threat and occurrence of vulnerabilities within the npm dependency network and the broader software repository community.
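The two headline percentages can be checked directly from the counts quoted in the abstract. The short sketch below only redoes that arithmetic; the constant and function names are illustrative.

```python
# Counts as quoted in the abstract.
TOTAL_PACKAGES = 1_077_946
WITH_DEPENDENCIES = 660_748
WITH_KNOWN_VULNERABILITY = 232_836


def share(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


dependency_share = share(WITH_DEPENDENCIES, TOTAL_PACKAGES)
vulnerable_share = share(WITH_KNOWN_VULNERABILITY, TOTAL_PACKAGES)
print(dependency_share, vulnerable_share)  # 61.3 21.6
```

Both values agree with the reported 61.30% and 21.60%.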
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper analyzes vulnerability propagation in the npm JavaScript ecosystem using a custom database of 1,077,946 packages from npm-follower, dependency graphs from deps.dev, and vulnerability data from the Google OSV API on 1,515 reported vulnerabilities. It claims that 61.30% of packages rely on one or more dependencies, 21.60% (232,836) have at least one known vulnerability in their networks (42% high severity), the average time to fix a vulnerable package is 4 years and 11 months after the first vulnerable version is published, vulnerabilities are concentrated (top-7 account for 25%, top-23 for 50%), and offers mitigation recommendations.
Significance. If the data pipeline is accurate, the work provides large-scale empirical evidence on transitive vulnerability exposure and remediation delays in npm, which could inform package manager policies and developer practices. The scale of the dataset and focus on concentration effects are strengths.
major comments (3)
- [Methods (data collection and vulnerability identification)] Methods section on data collection and vulnerability identification: the pipeline for matching package names/versions to OSV entries (including transitive dependencies) is described at a high level but provides no validation of matching accuracy, version-range resolution, false-positive rates, or sample-based precision checks. This directly underpins the central 21.60% statistic and severity distributions.
- [Results (time-to-fix analysis)] Results on time-to-fix: the reported average of 4 years and 11 months (and the 19-day offset for vulnerability publication) depends on precise dating of first vulnerable version vs. fixing version, yet no details are given on date extraction, handling of version ordering, or assumptions about when a fix becomes available. This affects the reliability of the remediation timeline claim.
- [Dependency network construction] Dependency network construction: while deps.dev is used to build graphs, there is no assessment of graph completeness, potential missing transitive edges, or how version ranges are resolved when counting affected packages. Any systematic incompleteness would alter the 232,836 count and concentration findings.
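The version-range resolution raised in these comments can be made concrete. This is a minimal sketch, assuming OSV-style `introduced`/`fixed` event lists given in ascending order and plain `MAJOR.MINOR.PATCH` versions. Prerelease tags and other event types such as `last_affected` are deliberately ignored, the function names are hypothetical, and it is not the authors' pipeline.

```python
def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints; '0' becomes (0, 0, 0)."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def is_affected(version, events):
    """Check a version against a list of OSV-style range events.

    events looks like [{"introduced": "0"}, {"fixed": "1.2.3"}].
    A version is affected if it is at or after an 'introduced' bound
    and before the next 'fixed' bound.
    """
    target = parse_version(version)
    affected = False
    # Assumption: events are listed in ascending version order.
    for event in events:
        if "introduced" in event:
            if target >= parse_version(event["introduced"]):
                affected = True
        elif "fixed" in event:
            if target >= parse_version(event["fixed"]):
                affected = False
    return affected


# Hypothetical advisory: vulnerable from the first release until 1.2.3.
example_events = [{"introduced": "0"}, {"fixed": "1.2.3"}]
print(is_affected("1.2.2", example_events))  # True
print(is_affected("1.2.3", example_events))  # False
```

A sample-based precision check of the kind the referee asks for would compare the output of such a matcher against manual inspection of the advisories.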
minor comments (2)
- [Abstract] Abstract and results: ensure consistent reporting of total unique vulnerabilities vs. the 1,515 reported ones when discussing frequency distributions.
- [Results (vulnerability concentration)] Results on concentration: a table listing the top-7 and top-23 vulnerabilities with their frequencies would improve transparency and allow readers to assess the claim.
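A sketch of how such a concentration table could be derived, assuming a flat list with one vulnerability identifier per observed case. The identifiers and counts below are invented for illustration and do not come from the paper.

```python
from collections import Counter


def top_k_for_share(occurrences, target_share):
    """Smallest number of most-frequent vulnerabilities covering target_share of cases."""
    counts = Counter(occurrences)
    total = sum(counts.values())
    covered = 0
    for k, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered / total >= target_share:
            return k
    return len(counts)


def concentration_table(occurrences):
    """Rows of (vulnerability id, frequency, cumulative share), most frequent first."""
    counts = Counter(occurrences)
    total = sum(counts.values())
    rows, covered = [], 0
    for vuln_id, count in counts.most_common():
        covered += count
        rows.append((vuln_id, count, covered / total))
    return rows


# Hypothetical data: 10 vulnerability cases spread over 4 identifiers.
toy_cases = ["V-A"] * 5 + ["V-B"] * 3 + ["V-C"] + ["V-D"]
print(top_k_for_share(toy_cases, 0.5))  # 1
for row in concentration_table(toy_cases):
    print(row)
```

Applied to the paper's data, the same calculation would be expected to return 7 for a 25 percent target and 23 for a 50 percent target.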
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. The comments identify important areas where additional methodological transparency and validation would strengthen the manuscript. We address each major comment below and commit to revisions that improve rigor without altering the core findings.
- Referee: Methods section on data collection and vulnerability identification: the pipeline for matching package names/versions to OSV entries (including transitive dependencies) is described at a high level but provides no validation of matching accuracy, version-range resolution, false-positive rates, or sample-based precision checks. This directly underpins the central 21.60% statistic and severity distributions.
  Authors: We agree that the current description is high-level and that explicit validation would increase confidence in the 21.60% figure. In the revised manuscript we will expand the Methods section with: (i) the exact procedure for querying the OSV API using package name and resolved version, (ii) how version ranges are interpreted according to OSV's affected range syntax, and (iii) results of a manual precision audit on a random sample of 200 packages (reporting match accuracy and estimated false-positive rate). These additions will directly support the reported statistics.
  Revision: yes
- Referee: Results on time-to-fix: the reported average of 4 years and 11 months (and the 19-day offset for vulnerability publication) depends on precise dating of first vulnerable version vs. fixing version, yet no details are given on date extraction, handling of version ordering, or assumptions about when a fix becomes available. This affects the reliability of the remediation timeline claim.
  Authors: We acknowledge the need for greater detail on the temporal calculations. The revised Results section will include a dedicated subsection describing: (1) extraction of publication dates from npm registry metadata, (2) identification of the earliest vulnerable version using OSV ranges and semantic versioning, (3) determination of the fixing version as the first version outside the vulnerable range, and (4) the precise computation of the 19-day offset between fix availability and vulnerability disclosure. This will make the 4-year-11-month average fully reproducible.
  Revision: yes
- Referee: Dependency network construction: while deps.dev is used to build graphs, there is no assessment of graph completeness, potential missing transitive edges, or how version ranges are resolved when counting affected packages. Any systematic incompleteness would alter the 232,836 count and concentration findings.
  Authors: We recognize that an explicit assessment of deps.dev graph quality is warranted. In the revision we will add to the Methods section: (i) a description of how deps.dev resolves version ranges to concrete versions, (ii) a sample-based completeness check (cross-validation of dependency trees for 100 packages against direct npm queries), and (iii) a limitations paragraph noting that while deps.dev provides broad coverage, isolated missing transitive edges remain possible. These changes will contextualize the 232,836 count and concentration results.
  Revision: yes
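The temporal calculation described in the second response can be sketched as follows, assuming publication dates are already available per version. The dates, version strings, and function names are invented for illustration; they are chosen only to mirror the magnitude of the reported averages and are not taken from the paper's data.

```python
from datetime import date


def time_to_fix_days(publish_dates, first_vulnerable, fixed_version):
    """Days between publication of the first vulnerable version and the fix."""
    return (publish_dates[fixed_version] - publish_dates[first_vulnerable]).days


def disclosure_offset_days(fix_published, advisory_published):
    """Positive when the advisory appears after the fix was released."""
    return (advisory_published - fix_published).days


# Hypothetical package history (not real registry metadata).
toy_publish_dates = {
    "1.0.0": date(2018, 1, 1),   # first vulnerable version
    "1.4.2": date(2022, 12, 1),  # first version outside the vulnerable range
}
toy_advisory_date = date(2022, 12, 20)

fix_days = time_to_fix_days(toy_publish_dates, "1.0.0", "1.4.2")
offset_days = disclosure_offset_days(toy_publish_dates["1.4.2"], toy_advisory_date)
print(fix_days, offset_days)  # 1795 19
```

Here 1,795 days corresponds to 4 years and 11 months. The choices the referee asks about, namely which version counts as "first vulnerable" and which as "fixing", determine the two dictionary keys passed in.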
Circularity Check
No circularity: direct empirical counts from external APIs
The paper's core results (21.60% vulnerable packages, 4y11m average fix time, severity distributions, top-vulnerability concentrations) are computed as straightforward counts, percentages, and means over a dataset assembled from npm-follower, deps.dev graphs, and OSV API lookups. No equations, fitted parameters, predictions, or first-principles derivations appear; the reported figures do not reduce to any internal definition or self-citation chain. The methodology section describes data ingestion and matching steps, but these are external data-processing operations rather than self-referential constructions. This is a standard empirical measurement study whose claims remain falsifiable against the same public APIs.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: npm-follower and the deps.dev API return complete and accurate dependency graphs for the sampled packages.
- domain assumption: the Google Open Source Vulnerability API correctly maps package names and versions to reported vulnerabilities without significant false positives or omissions.
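The second assumption concerns lookups by package name and version. As a hedged illustration, the sketch below builds, but does not send, a query of the shape documented for the public OSV API (a POST to `/v1/query` with a `package` object and a `version`). The package name and version are placeholders, the helper name is hypothetical, and the endpoint and field names should be verified against the current OSV documentation before use.

```python
import json
import urllib.request

# Endpoint as documented by OSV at the time of writing; treat as an assumption.
OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_request(name, version, ecosystem="npm"):
    """Build (without sending) a POST request asking OSV about one package version."""
    payload = {
        "package": {"name": name, "ecosystem": ecosystem},
        "version": version,
    }
    return urllib.request.Request(
        OSV_QUERY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder package; no network call is made here.
osv_request = build_osv_request("example-package", "1.2.3")
print(osv_request.get_method(), osv_request.full_url)
print(osv_request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would perform the lookup. Any false positives or omissions in the response would carry through to the affected-package counts, which is why the ledger lists this as an assumption.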