
arXiv: 2511.20313 · v2 · submitted 2025-11-25 · 💻 cs.CR

A Reality Check on SBOM-based Vulnerability Management: An Empirical Study and A Path Forward

Pith reviewed 2026-05-17 05:23 UTC · model grok-4.3

classification 💻 cs.CR
keywords SBOM, vulnerability scanning, false positives, software supply chain, reachability analysis, function call analysis, empirical study, unreachable code

The pith

SBOM-based vulnerability scanners report a 92% false positive rate, mainly from flagging unreachable code; function call analysis removes about 62% of those false alarms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper studies the practical use of Software Bills of Materials for finding vulnerabilities in software supply chains. It first shows that lock files with strong package managers produce accurate and consistent SBOMs. Even with that solid base, the authors find that standard vulnerability scanners still report a 92% false positive rate across 2414 repositories, mostly because they flag problems in code that the program never actually executes. Adding a step of function call analysis removes 61.9% of those false alarms and yields cleaner, more useful reports that reduce developer alert fatigue.

Core claim

Using a high-fidelity SBOM generated from lock files and strong package managers as the foundation, downstream vulnerability scanners still produce a 92.0% false positive rate in the studied case. The dominant source of these errors is the reporting of vulnerabilities inside unreachable code. Function call analysis applied to the same SBOM prunes 61.9% of the false alarms, supporting a practical two-stage workflow that first creates an accurate SBOM and then enriches it with reachability information to produce low-noise vulnerability reports.
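To make the two percentages concrete, the following minimal sketch works out what share of the remaining alerts would still be false positives after pruning. It assumes, hypothetically, that function call analysis removes only false positives and never drops a true positive; the paper summary does not state this, and the function name `residual_false_positive_share` is invented for illustration.

```python
def residual_false_positive_share(fp_rate: float, pruned_fraction: float) -> float:
    """Share of the remaining alerts that are still false positives.

    Assumption (not from the paper): pruning removes only false positives,
    so the number of true positives is unchanged.
    """
    true_alerts = 1.0 - fp_rate                          # fraction of alerts that are real
    remaining_false = fp_rate * (1.0 - pruned_fraction)  # false positives left after pruning
    return remaining_false / (remaining_false + true_alerts)


# Figures reported in the paper: 92.0% false positives, 61.9% of them pruned.
share = residual_false_positive_share(0.920, 0.619)
print(f"{share:.1%}")  # 81.4%
```

Under that assumption, roughly 81% of the alerts that survive pruning would still be false positives, so the 61.9% figure describes a reduction in alert volume rather than the precision of the final report.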

What carries the argument

Function call analysis applied after SBOM generation to detect and exclude vulnerabilities located in unreachable code paths.
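The idea can be sketched as a reachability filter over a call graph. This is a minimal illustration, not the authors' implementation: the call graph, the alert records, and the names `reachable_functions` and `prune_unreachable` are all hypothetical, and a real tool would first have to build the call graph from source or bytecode.

```python
from collections import deque


def reachable_functions(call_graph, entry_points):
    """Breadth-first search over caller -> callees edges from the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        caller = queue.popleft()
        for callee in call_graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def prune_unreachable(alerts, call_graph, entry_points):
    """Keep only alerts whose vulnerable function is reachable from the application."""
    reachable = reachable_functions(call_graph, entry_points)
    return [alert for alert in alerts if alert["function"] in reachable]


# Hypothetical data: libC is listed in the SBOM, but the application never calls it.
call_graph = {
    "app.main": ["libA.parse", "libB.render"],
    "libA.parse": ["libA.tokenize"],
    "libB.render": [],
    "libC.decode": ["libC.inflate"],
}
alerts = [
    {"id": "VULN-1", "function": "libA.tokenize"},
    {"id": "VULN-2", "function": "libC.inflate"},
]

kept = prune_unreachable(alerts, call_graph, ["app.main"])
print([alert["id"] for alert in kept])  # ['VULN-1']
```

A scanner working from the SBOM alone would report both alerts, because both libraries are present; the filter drops VULN-2 because no call path from `app.main` reaches `libC.inflate`.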

If this is right

  • Lock files combined with strong package managers provide a reliable foundation for SBOM generation in security workflows.
  • Function call analysis can be added as a second stage to turn high-noise SBOM-based scans into actionable reports.
  • Developers experience less alert fatigue when unreachable-code vulnerabilities are filtered out before reporting.
  • The two-stage approach improves the overall utility of SBOMs for securing the software supply chain.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Similar reachability checks could be applied to other static-analysis outputs beyond SBOM-driven vulnerability lists.
  • Adoption in continuous-integration pipelines would require lightweight integration of function-call tools with existing SBOM generators.
  • The same pattern of high false positives from unreachable code may appear in other dependency-vulnerability databases that lack reachability data.

Load-bearing premise

The 2414 open-source repositories and the chosen vulnerability scanners are representative of real-world software supply chain practice so that the 92% false-positive rate and 61.9% reduction apply beyond this specific set.

What would settle it

Repeating the same analysis on a fresh collection of repositories or with different vulnerability scanners and obtaining a false-positive rate well below 92% or no reduction from function call analysis would falsify the central findings.

Figures

Figures reproduced from arXiv: 2511.20313 by Charalambos Konstantinou, Li Zhou, Marc Dacier.

Figure 1: An overview of SBOM verification methodology.
Figure 2: An overview of vulnerability verification methodology.
Original abstract

The Software Bill of Materials (SBOM) is a critical tool for securing the software supply chain (SSC), but its practical utility is undermined by inaccuracies in both its generation and its application in vulnerability scanning. This paper presents a large-scale empirical study on 2,414 open-source repositories to address these issues from a practical standpoint. First, we demonstrate that using lock files with strong package managers enables the generation of accurate and consistent SBOMs, establishing a reliable foundation for security analysis. Using this high-fidelity foundation, however, we expose a more fundamental flaw in practice: downstream vulnerability scanners produce a staggering 92.0% false positive rate in our case study. We pinpoint the primary cause as the flagging of vulnerabilities within unreachable code. We then demonstrate that function call analysis can effectively prune 61.9% of these false alarms. Our work validates a practical, two-stage approach for SSC security: first, generate an accurate SBOM using lock files and strong package managers, and second, enrich it with function call analysis to produce actionable, low-noise vulnerability reports that alleviate developers' alert fatigue.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper reports results from an empirical study of 2,414 open-source repositories. It first shows that lock files combined with strong package managers produce accurate and consistent SBOMs. Using this foundation, the authors measure that downstream vulnerability scanners yield a 92.0% false-positive rate, which they attribute primarily to alerts on vulnerabilities located in unreachable code; they further report that function-call analysis can prune 61.9% of those false alarms. The work concludes by recommending a two-stage practical workflow for software supply-chain security.

Significance. If the central measurements hold, the study supplies large-scale empirical evidence on a practically important limitation of SBOM-based vulnerability management and demonstrates a concrete mitigation that could reduce developer alert fatigue. The scale of the repository corpus and the explicit separation between SBOM generation and scanning steps are clear strengths that support the reported percentages.

major comments (2)
  1. [Empirical study and reachability analysis sections] The 92.0% false-positive rate and the subsequent 61.9% pruning figure both rest on the classification of code as unreachable. The manuscript does not provide sufficient detail on the static reachability procedure (e.g., call-graph construction algorithm, handling of reflection, dynamic proxies, or runtime-loaded modules). Because static analysis routinely misses such constructs in real-world open-source code, a non-trivial fraction of the reported “unreachable” vulnerabilities may actually be reachable at runtime, directly affecting both the false-positive rate and the measured benefit of the pruning step.
  2. [Vulnerability scanning methodology] Scanner configuration details (version, rule sets, and whether default or tuned settings were used) are not fully reported. Different configurations can materially change the set of reported vulnerabilities and therefore the observed false-positive rate; without this information the 92.0% figure cannot be independently verified or generalized.
minor comments (2)
  1. [Abstract] The abstract uses the adjective “staggering” for the 92.0% figure; a more neutral phrasing would be appropriate for a technical manuscript.
  2. [SBOM generation results] Consider adding a short table that lists the exact package managers, lock-file formats, and SBOM generators employed, together with the fraction of repositories for which each succeeded.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our empirical study. We address each major comment below and will revise the manuscript accordingly to improve clarity and reproducibility.

point-by-point responses
  1. Referee: [Empirical study and reachability analysis sections] The 92.0% false-positive rate and the subsequent 61.9% pruning figure both rest on the classification of code as unreachable. The manuscript does not provide sufficient detail on the static reachability procedure (e.g., call-graph construction algorithm, handling of reflection, dynamic proxies, or runtime-loaded modules). Because static analysis routinely misses such constructs in real-world open-source code, a non-trivial fraction of the reported “unreachable” vulnerabilities may actually be reachable at runtime, directly affecting both the false-positive rate and the measured benefit of the pruning step.

    Authors: We agree that the current description of the static reachability procedure lacks sufficient detail. In the revised manuscript we will expand the Reachability Analysis section to specify the call-graph construction algorithm (a conservative inter-procedural analysis built on top of the package’s dependency graph), our handling of reflection (reflective invocations are conservatively treated as reachable), dynamic proxies, and runtime-loaded modules (all classes referenced via Class.forName or similar are included in the reachable set). We will also add an explicit threats-to-validity paragraph acknowledging that static analysis can miss certain runtime behaviors and that, despite our conservative policy, the analysis may slightly underestimate the true reachable set. This does not change the reported 92.0% or 61.9% figures but improves transparency and allows readers to assess the impact of these limitations. revision: yes

  2. Referee: [Vulnerability scanning methodology] Scanner configuration details (version, rule sets, and whether default or tuned settings were used) are not fully reported. Different configurations can materially change the set of reported vulnerabilities and therefore the observed false-positive rate; without this information the 92.0% figure cannot be independently verified or generalized.

    Authors: We accept this criticism. The revised version will include a dedicated subsection under Vulnerability Scanning that reports the exact scanner name and version, the vulnerability database or rule set employed, and confirmation that all scans were performed with the scanner’s default configuration and no custom tuning or filtering. These additions will enable independent reproduction and generalization of the 92.0% false-positive rate. revision: yes
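The conservative treatment of reflection mentioned in the first response can be illustrated with a small sketch. It is purely hypothetical: the paper's actual procedure is not described in the material above, and the policy shown here (once a reachable function contains an unresolved reflective call, every known function is treated as reachable) is one simple way to be conservative, not necessarily the authors' way. The names `conservative_reachable` and `reflective_callers` are invented for illustration.

```python
from collections import deque


def conservative_reachable(call_graph, entry_points, reflective_callers):
    """Reachable set that gives up precision when reflection is encountered.

    reflective_callers: functions that contain a reflective call whose target
    cannot be resolved statically (a hypothetical input for this sketch).
    """
    # Every function mentioned anywhere in the graph, as caller or callee.
    all_functions = set(call_graph)
    for callees in call_graph.values():
        all_functions.update(callees)

    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        caller = queue.popleft()
        if caller in reflective_callers:
            # Target unknown: assume any known function may be invoked.
            return all_functions | seen
        for callee in call_graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical graph: plugins.load uses reflection; libC is never called statically.
call_graph = {
    "app.main": ["plugins.load"],
    "plugins.load": [],
    "libC.decode": ["libC.inflate"],
}

strict = conservative_reachable(call_graph, ["app.main"], reflective_callers=set())
loose = conservative_reachable(call_graph, ["app.main"], reflective_callers={"plugins.load"})
print("libC.inflate" in strict, "libC.inflate" in loose)  # False True
```

The trade-off is the one the referee raises: ignoring reflection prunes more alerts but risks discarding a vulnerability that is reachable at runtime, while the conservative policy keeps such alerts at the cost of pruning fewer false positives.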

Circularity Check

0 steps flagged

Purely empirical measurements with no derivations or self-referential steps

The paper reports direct measurements from static analysis of 2,414 repositories: SBOM generation accuracy via lock files, a 92.0% false-positive rate in vulnerability scanners, and a 61.9% reduction via function-call pruning. No equations, fitted parameters, predictions derived from prior fits, or load-bearing self-citations appear in the derivation chain. All claims are observational outputs from the repository corpus rather than reductions of inputs by construction. The study is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Claims rest on empirical observations from the studied repositories rather than mathematical axioms or new entities; the main domain assumption is that the selected projects and scanners reflect typical practice.

axioms (1)
  • domain assumption: The 2,414 open-source repositories and selected vulnerability scanners are representative of broader software supply chain practice.
    Used to generalize the 92% false-positive finding and the effectiveness of function call pruning.



