What You See Is Not What You Execute: Memory-Based Runtime SBOM Generation for Supply Chain Security
Pith reviewed 2026-06-29 05:03 UTC · model grok-4.3
The pith
MEM-SBOM generates runtime SBOMs for Python applications by parsing volatile memory without prior instrumentation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MEM-SBOM is the first memory forensics framework that generates SBOMs directly from the runtime state of Python applications. It recovers the modules from the interpreter's internal structures, resolves package versions, and analyzes bytecode to build dependency graphs and identify vulnerable functions. Implemented as Volatility 3 plugins, it achieves 100 percent extraction accuracy on 51 real-world applications, identifies cases where vulnerable routines are called, and recovers all runtime packages missed by existing SBOM tools.
What carries the argument
MEM-SBOM, a suite of Volatility 3 plugins that read Python interpreter internal structures from memory dumps to extract loaded modules, resolve versions, and construct executed dependency graphs.
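As a rough live-process analogue of what the plugins recover from a dump, the sketch below lists loaded modules and resolves versions using only the standard library. It is not the paper's Volatility code: `loaded_top_level_modules`, `resolve_version`, and `runtime_inventory` are hypothetical helper names, and the lookup assumes the import name matches the distribution name, which does not hold for every package.

```python
import sys
from importlib import metadata


def loaded_top_level_modules():
    """Return the sorted top-level names of every module currently in sys.modules."""
    return sorted({name.split(".")[0] for name in list(sys.modules)})


def resolve_version(name):
    """Best-effort version lookup: installed metadata first, then __version__."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        # Assumption: fall back to the module's own __version__ attribute, if any.
        module = sys.modules.get(name)
        return getattr(module, "__version__", None)


def runtime_inventory():
    """Map each loaded top-level module to a version string, or None if unknown."""
    return {name: resolve_version(name) for name in loaded_top_level_modules()}
```

Calling `runtime_inventory()` inside a running application yields the kind of module-to-version mapping that a runtime SBOM records. A memory-based tool has to rebuild the same mapping from raw interpreter structures instead of asking the interpreter.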
If this is right
- Runtime SBOMs can show precisely which functions from a dependency are invoked, enabling targeted vulnerability checks.
- Supply chain analysis becomes possible in incident response without requiring any pre-installed monitoring.
- Dependency graphs built from runtime state are more complete than those produced by metadata- or filesystem-based tools.
- One tested application was shown to invoke vulnerable routines inside its tornado dependency.
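The function-level claim in the points above can be illustrated with the standard `dis` module. The sketch below scans bytecode for dotted names rooted at a given package. It runs on code compiled in a live interpreter rather than on code objects parsed from a dump. It reports names that are referenced, not names proven to have executed. The helper name `dotted_references` and the example attribute are illustrative assumptions, not the paper's method or the vulnerable routine it reports.

```python
import dis
from types import CodeType


def dotted_references(code, root):
    """Collect dotted names starting at `root` that the bytecode loads.

    Only direct chains such as root.a.b are found; aliases are not followed.
    """
    found = set()
    chain = None  # parts of the dotted name currently being assembled

    def flush():
        # Record the chain only if at least one attribute followed the root.
        if chain and len(chain) > 1:
            found.add(".".join(chain))

    for ins in dis.get_instructions(code):
        if ins.opname in ("LOAD_GLOBAL", "LOAD_NAME") and ins.argval == root:
            flush()
            chain = [root]
        elif chain is not None and ins.opname in ("LOAD_ATTR", "LOAD_METHOD"):
            # LOAD_METHOD appears on older CPython versions, LOAD_ATTR on newer ones.
            chain.append(ins.argval)
        else:
            flush()
            chain = None
    flush()

    # Function and class bodies live in nested code objects inside co_consts.
    for const in code.co_consts:
        if isinstance(const, CodeType):
            found |= dotted_references(const, root)
    return found
```

For example, compiling a source string that contains `tornado.web.Application([])` inside a function and calling `dotted_references(code, "tornado")` returns a set containing `"tornado.web.Application"`. The package does not need to be installed, because the code is compiled but never run.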
Where Pith is reading between the lines
- The same memory-parsing technique might extend to other interpreted languages whose runtime keeps similar internal records.
- Forensic teams could combine MEM-SBOM output with existing memory-analysis tools to trace execution history after an incident.
- Repeated testing across more Python versions would be needed to confirm the parsing method remains reliable as interpreters evolve.
Load-bearing premise
The Python interpreter's internal data structures stay consistent enough in memory dumps that they can be parsed to recover every actually executed module and its version.
What would settle it
A memory dump of a running Python application containing a loaded package that MEM-SBOM fails to report, where independent evidence confirms the package was executed.
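Such a counterexample would show up as a non-empty difference between an independently captured inventory and the recovered one. A minimal sketch of that comparison follows, assuming both inventories are dictionaries mapping package names to version strings. The function name `diff_inventories` and the result keys are hypothetical.

```python
def diff_inventories(ground_truth, recovered):
    """Compare an independent inventory with one recovered from memory."""
    truth_names, found_names = set(ground_truth), set(recovered)
    shared = truth_names & found_names
    return {
        # Packages known to be loaded that the recovered inventory lacks.
        "missed": sorted(truth_names - found_names),
        # Packages reported without independent confirmation.
        "spurious": sorted(found_names - truth_names),
        # Packages present in both but with different version strings.
        "version_mismatch": sorted(
            name for name in shared if ground_truth[name] != recovered[name]
        ),
    }
```

A single entry under `missed` for a package that demonstrably executed would be the settling case described above. Three empty lists correspond to perfect extraction for that dump.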
Original abstract
Modern software development relies heavily on third-party components from public repositories, expanding the software supply chain attack surface. In response to these growing risks, federal initiatives have advanced the Software Bill of Materials (SBOM) as a standardized mechanism for improving transparency by describing software components, dependencies, and their relationships. However, SBOMs built from metadata or filesystem artifacts fail to capture the components loaded and executed at runtime, especially in dynamic ecosystems such as Python. Moreover, generating runtime SBOMs through instrumentation requires monitoring to be deployed in advance and the system to remain observable throughout execution. Such conditions are difficult to satisfy in production environments and incident-response scenarios. Volatile memory, in contrast, provides a reliable source for recovering the actual runtime state of a running application without requiring prior instrumentation. Therefore, this paper presents MEM-SBOM, the first memory forensics framework that generates SBOMs directly from the runtime state of Python applications. It recovers the modules from the interpreter's internal structures, resolves package versions, and analyzes bytecode to build dependency graphs and identify vulnerable functions. We implemented MEM-SBOM as a suite of Volatility 3 plugins and evaluated it against 51 real-world Python applications. It achieves 100% extraction accuracy, identifies Streamlit as the only application that calls the vulnerable routines of the tornado dependency, and recovers all runtime packages missed by existing SBOM tools, providing more accurate dependency graphs and better vulnerability assessment. These capabilities make MEM-SBOM a practical foundation for software supply chain security and incident response by providing a forensically sound runtime view of what is executed on a system.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents MEM-SBOM, the first memory forensics framework that generates runtime SBOMs for Python applications directly from volatile memory using a suite of Volatility 3 plugins. It recovers loaded modules from the CPython interpreter's internal structures, resolves package versions, analyzes bytecode to construct dependency graphs, and identifies vulnerable functions. The evaluation on 51 real-world applications reports 100% extraction accuracy, recovery of all runtime packages missed by existing SBOM tools, and a specific finding that Streamlit is the only tested application calling vulnerable routines in the tornado dependency.
Significance. If the empirical results hold, the contribution is significant for software supply chain security and incident response. It provides a forensically sound method to obtain an accurate runtime view of executed components in dynamic Python environments without requiring prior instrumentation, addressing limitations of metadata- or filesystem-based SBOMs and enabling better vulnerability assessment through recovered dependency graphs.
major comments (2)
- [Evaluation] Evaluation section: The central claim of 100% extraction accuracy on 51 applications is load-bearing for the paper's contribution, yet the provided description (including the abstract) supplies no details on methodology for establishing ground truth, selection criteria for the applications, handling of edge cases such as dynamically generated modules or varying Python interpreter versions, or error conditions in memory parsing. This omission prevents assessment of whether the reported accuracy supports the claims of reliable runtime SBOM generation.
- [Implementation] Implementation description: The claims that the framework 'recovers the modules from the interpreter's internal structures' and 'analyzes bytecode to build dependency graphs' require concrete technical details on the Volatility plugin logic (e.g., specific data structures walked or bytecode traversal algorithm) to substantiate that the approach works across the tested applications without hidden assumptions about memory layout stability.
minor comments (2)
- [Abstract] The abstract would be strengthened by a one-sentence summary of the evaluation methodology and any limitations considered.
- Consider adding a table listing the 51 applications with key characteristics (e.g., Python version, number of dependencies) to contextualize the 100% accuracy result.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and positive assessment of the work's significance. We address each major comment below and will revise the manuscript to incorporate additional details as outlined.
- Referee: [Evaluation] Evaluation section: The central claim of 100% extraction accuracy on 51 applications is load-bearing for the paper's contribution, yet the provided description (including the abstract) supplies no details on methodology for establishing ground truth, selection criteria for the applications, handling of edge cases such as dynamically generated modules or varying Python interpreter versions, or error conditions in memory parsing. This omission prevents assessment of whether the reported accuracy supports the claims of reliable runtime SBOM generation.
  Authors: We agree that the Evaluation section requires expanded methodological details to substantiate the 100% accuracy claim. In the revised manuscript, we will add a new subsection that explicitly describes: ground truth establishment through direct comparison against runtime sys.modules inspection and importlib.metadata package data captured at dump time; selection criteria for the 51 applications (a mix of popular PyPI packages and real-world tools such as Streamlit, chosen for diversity in size, dependency complexity, and usage patterns); handling of edge cases including dynamically generated modules via exec() and importlib (tested in our evaluation with no accuracy loss); support across Python interpreter versions 3.8–3.11; and error conditions in memory parsing (e.g., partial structure recovery due to memory fragmentation), with none encountered in the reported experiments. These additions will enable readers to fully evaluate the reliability of the results.
  Revision: yes
- Referee: [Implementation] Implementation description: The claims that the framework 'recovers the modules from the interpreter's internal structures' and 'analyzes bytecode to build dependency graphs' require concrete technical details on the Volatility plugin logic (e.g., specific data structures walked or bytecode traversal algorithm) to substantiate that the approach works across the tested applications without hidden assumptions about memory layout stability.
  Authors: We acknowledge that the Implementation section provides only high-level descriptions and will expand it with concrete technical details in the revision. The module recovery plugin walks the PyInterpreterState structure (accessed via the _PyRuntime symbol) to locate the modules PyDictObject, then iterates PyModuleObject entries to extract __name__, __file__, and version strings from __version__ attributes or PKG-INFO metadata in memory. The bytecode analysis plugin traverses PyCodeObject instances by following the co_code field, emulating Python's opcode parsing to identify IMPORT_NAME, LOAD_CONST, and CALL_FUNCTION opcodes for dependency graph construction; this logic is version-aware via offsets derived from the Python version string recovered from the interpreter. These specifics will clarify the data structures and algorithms used and confirm stability assumptions across the tested CPython builds.
  Revision: yes
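The import-scanning step described in the rebuttal can be approximated in a live interpreter with the standard `dis` module. The sketch below collects `IMPORT_NAME` targets from a compiled source string and turns them into graph edges. It is an illustration under stated assumptions, not the plugin logic: the real tool would decode code objects read from a dump with version-specific opcode tables, and `imported_modules` and `import_edges` are hypothetical helper names.

```python
import dis
from types import CodeType


def imported_modules(code):
    """Return every module name targeted by an IMPORT_NAME instruction."""
    names = set()
    for ins in dis.get_instructions(code):
        # Relative imports such as "from . import x" carry an empty name; skip them.
        if ins.opname == "IMPORT_NAME" and ins.argval:
            names.add(ins.argval)
    # Imports inside functions and classes sit in nested code objects.
    for const in code.co_consts:
        if isinstance(const, CodeType):
            names |= imported_modules(const)
    return names


def import_edges(module_name, source):
    """Build sorted (importer, imported) edges for one module's source text."""
    code = compile(source, module_name, "exec")
    return sorted((module_name, target) for target in imported_modules(code))
```

For instance, `import_edges("app", "import os.path\n")` returns `[("app", "os.path")]`. Repeating this for each recovered module and merging the edge lists gives a dependency graph. Imports performed through `importlib` or `exec()` would not appear in this scan.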
Circularity Check
No circularity; empirical engineering result with independent evaluation
The paper describes an implementation of Volatility 3 plugins that walk CPython interpreter structures in memory dumps to extract loaded modules, resolve versions, build dependency graphs from bytecode, and identify vulnerable functions. The central claims rest on this engineering pipeline and its empirical performance (100% extraction accuracy on 51 real-world applications). No equations, fitted parameters renamed as predictions, self-citation load-bearing premises, uniqueness theorems, or ansatzes are invoked. The evaluation is presented as direct testing against real applications rather than any reduction to prior fitted quantities or self-referential definitions. This is a standard non-circular engineering contribution.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Volatile memory contains sufficient and accurate information about the Python interpreter's loaded modules and bytecode state at the time of the dump.
Forward citations
Cited by 1 Pith paper
- Skills Are Not Islands: Measuring Dependency and Risk in Agent Skill Supply Chains
  The paper defines Agent Skill Supply Chains (ASSCs) and introduces SkillDepAnalyzer to extract and analyze dependency graphs from over 1.43 million LLM agent skills, revealing structural patterns and security signals.
Source paper: Hala Ali, Andrew Case, Irfan Ahmed. Preprint submitted to Elsevier. Its appendix includes Table 12 (evaluation dataset of 51 Python applications).