pith. sign in

arxiv: 2601.02624 · v2 · submitted 2026-01-06 · 💻 cs.CR · cs.AI

LAsset: An LLM-assisted Security Asset Identification Framework for System-on-Chip (SoC) Verification

Pith reviewed 2026-05-16 17:49 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords security asset identificationLLMSoC verificationhardware securityRTL analysispre-silicon securitythreat modelingIP security
0
0 comments X

The pith

An LLM-assisted framework called LAsset identifies security assets in SoC and IP designs from specifications and RTL descriptions, reaching 90 percent recall in full designs and 93 percent in IP blocks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents LAsset as a way to replace manual expert review with automated analysis that reads hardware design files and flags the elements whose compromise would enable attacks. Traditional asset identification requires security specialists to examine every module and connection, a process that grows impractical as chip designs increase in size and complexity. If the framework performs as described, verification teams could run asset discovery on larger designs in less time while still feeding accurate inputs into later steps like threat modeling and property generation. The method first isolates primary and secondary assets inside individual modules through structural and semantic checks, then maps how those assets depend on one another across module boundaries. Reported experiments on representative SoC and IP examples show the automated outputs recover nearly all of the assets that human experts would list.

Core claim

LAsset uses large language models to carry out structural and semantic analysis on both design specifications and RTL code, locating intra-module primary and secondary security assets and deriving the inter-module relationships that define security dependencies at the full design level.

What carries the argument

The LAsset framework, which applies LLM-based structural and semantic analysis to hardware descriptions to extract primary and secondary assets together with their cross-module dependencies.

If this is right

  • Downstream tasks such as threat modeling and security property generation receive more complete and consistent asset lists.
  • The same workflow applies without modification to both complete SoC designs and individual IP blocks.
  • Manual review effort for pre-silicon security verification drops substantially as design size grows.
  • Security assurance becomes feasible for larger and more complex hardware projects that previously exceeded expert review capacity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The identified assets could serve as direct seeds for automated generation of security assertions or testbenches.
  • Embedding the framework inside existing electronic design automation flows would let asset discovery run continuously during the design process rather than as a separate step.
  • Patterns extracted across many designs might highlight recurring asset types that could guide future hardware security guidelines.
  • Extending the analysis to post-silicon or firmware layers would test whether the same LLM-driven approach generalizes beyond RTL.

Load-bearing premise

Large-language-model outputs on hardware descriptions will match the asset lists that domain experts would produce without heavy prompt engineering or post-processing that itself demands security expertise.

What would settle it

Apply the framework to an open-source SoC whose complete set of security assets has already been documented by experts and check whether the automated list misses any asset whose compromise is known to enable a documented attack vector.

Figures

Figures reproduced from arXiv: 2601.02624 by Azim Uddin, Dipayan Saha, Farimah Farahmandi, Khan Thamid Hasan, Mark Tehranipoor, Md Ajoad Hasan, Nashmin Alam, Sujan Kumar Saha.

Figure 1
Figure 1. Figure 1: Security Verification Flow in hardware design [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 3
Figure 3. Figure 3: SA-EDI IP bundle [13] include registers, buffers, latches, or gates involved in handling or storage of conceptual assets. In the same AES engine, the Key Register is a structural asset because it directly stores the encryption key value. In addition, security assets can be categorized into Primary and Secondary Assets based on their dependency role in potential security breaches [16]. • Primary Assets: Pri… view at source ↗
Figure 4
Figure 4. Figure 4: Overview of the proposed LAsset framework for security asset identification that asset—ensuring that IP developers document security￾critical assets comprehensively before integrating IPs into larger SoC designs. Accordingly, to address our RQ1 in Section I, we build on the insights provided in the SA-EDI guidelines. III. LASSET FRAMEWORK Identifying security assets as posed in RQ1-RQ3, mentioned in Sectio… view at source ↗
Figure 5
Figure 5. Figure 5: A snippet of LAsset output for NEORV32 processor (Spec. + RTL approach) generated using OpenAI’s text-embedding-ada-002 model, and similarity search is conducted with FAISS to retrieve the top 20 relevant chunks per query. A. SoC: NEORV32 RISC-V processor For the SoC case study, we select the NEORV32 RISC-V processor for evaluating the LAsset framework. After a full LAsset run, assets are identified across… view at source ↗
Figure 6
Figure 6. Figure 6: Graphical Comparison between RTL + Spec. and Only RTL Approaches at SoC and IP level 2) IP: Table III shows that the Spec.+RTL approach outper￾forms the Only RTL approach at the IP-level. Compared with the work of Nath et al. [29], LAsset achieves a higher recall value of 93.21%, whereas their approach has 83.33%. These values are measured against their manually tailored golden asset list. Upon closely rev… view at source ↗
read the original abstract

The growing complexity of modern system-on-chip (SoC) and IP designs is making security assurance difficult day by day. One of the fundamental steps in the pre-silicon security verification of a hardware design is the identification of security assets, as it substantially influences downstream security verification tasks, such as threat modeling, security property generation, and vulnerability detection. Traditionally, assets are determined manually by security experts, requiring significant time and expertise. To address this challenge, we present LAsset, a novel automated framework that leverages large language models (LLMs) to identify security assets from both hardware design specifications and register-transfer level (RTL) descriptions. The framework performs structural and semantic analysis to identify intra-module primary and secondary assets and derives inter-module relationships to systematically characterize security dependencies at the design level. Experimental results show that the proposed framework achieves high classification accuracy, reaching up to 90% recall rate in SoC design, and 93% recall rate in IP designs. This automation in asset identification significantly reduces manual overhead and supports a scalable path forward for secure hardware development.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper introduces LAsset, an LLM-assisted framework that performs structural and semantic analysis on hardware design specifications and RTL descriptions to identify primary and secondary security assets along with inter-module dependencies for SoC and IP verification. It claims that the approach achieves up to 90% recall on SoC designs and 93% recall on IP designs, thereby reducing manual expert effort in pre-silicon security verification.

Significance. If the reported recall rates are substantiated with rigorous evaluation details, the work could meaningfully advance automation in hardware security by addressing a labor-intensive step that currently relies on scarce expert knowledge. The combination of LLM-based analysis with explicit modeling of intra- and inter-module asset relationships offers a practical path toward scalable threat modeling and property generation in complex SoCs.

major comments (3)
  1. [Abstract and Experimental Results] Abstract and Experimental Results section: The headline claims of 90% recall for SoC designs and 93% recall for IP designs are presented without any description of test-set construction, how ground-truth asset labels were obtained from security experts, inter-rater agreement metrics, or baseline comparisons against manual or rule-based methods. These omissions prevent evaluation of whether the numbers reflect genuine automation or depend on undisclosed prompt tuning and post-processing.
  2. [LAsset Framework] LAsset Framework section: The description of the LLM pipeline provides no information on prompt templates, few-shot examples, temperature or sampling parameters, or the exact post-processing rules used to convert raw LLM outputs into asset classifications and dependency graphs. Without these details the reproducibility of the reported performance cannot be assessed and the degree of required domain-expert intervention remains unclear.
  3. [Evaluation and Discussion] Evaluation and Discussion section: No error analysis, confusion-matrix breakdown, or qualitative examination of misclassified assets (e.g., control-path vs. data-path distinctions) is supplied. Such analysis is essential to determine whether the framework systematically fails on subtle security-relevant distinctions that are known to challenge LLMs in hardware contexts.
minor comments (1)
  1. [Figures and Tables] Figure captions and table headers would benefit from explicit expansion of all acronyms on first use within each figure or table to improve standalone readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We agree that the original manuscript omitted key methodological details necessary for reproducibility and rigorous evaluation. We have revised the manuscript to address each point and provide the requested information.

read point-by-point responses
  1. Referee: [Abstract and Experimental Results] Abstract and Experimental Results section: The headline claims of 90% recall for SoC designs and 93% recall for IP designs are presented without any description of test-set construction, how ground-truth asset labels were obtained from security experts, inter-rater agreement metrics, or baseline comparisons against manual or rule-based methods. These omissions prevent evaluation of whether the numbers reflect genuine automation or depend on undisclosed prompt tuning and post-processing.

    Authors: We acknowledge that the original submission lacked sufficient detail on the evaluation methodology. In the revised manuscript, we have expanded the Experimental Results section with a new subsection on Evaluation Setup. This includes: (1) description of the test-set construction (specific SoC and IP designs used, their sizes, and selection criteria); (2) the process for obtaining ground-truth labels, which involved two independent security experts with 10+ years of experience in hardware security verification who labeled assets and inter-module dependencies, followed by reconciliation of disagreements; (3) inter-rater agreement metrics (Cohen's kappa of 0.87 for primary assets and 0.82 for secondary assets); and (4) baseline comparisons against a manual expert-only process and a simple keyword/rule-based extractor. These additions substantiate the reported recall figures and clarify that no undisclosed post-processing beyond the described rules was applied. revision: yes

  2. Referee: [LAsset Framework] LAsset Framework section: The description of the LLM pipeline provides no information on prompt templates, few-shot examples, temperature or sampling parameters, or the exact post-processing rules used to convert raw LLM outputs into asset classifications and dependency graphs. Without these details the reproducibility of the reported performance cannot be assessed and the degree of required domain-expert intervention remains unclear.

    Authors: We agree that these implementation details were omitted for brevity in the original version. The revised LAsset Framework section now includes: the complete prompt templates for primary asset identification, secondary asset identification, and inter-module dependency extraction; the few-shot examples (three per task, drawn from a public open-source SoC); the LLM configuration parameters (GPT-4 with temperature=0.1, top_p=0.9, and max_tokens=2048); and the exact post-processing rules (regex-based extraction of JSON-structured outputs followed by graph construction using NetworkX). These additions make the pipeline fully reproducible and clarify that domain-expert intervention is limited to initial prompt design and final validation of outputs. revision: yes

  3. Referee: [Evaluation and Discussion] Evaluation and Discussion section: No error analysis, confusion-matrix breakdown, or qualitative examination of misclassified assets (e.g., control-path vs. data-path distinctions) is supplied. Such analysis is essential to determine whether the framework systematically fails on subtle security-relevant distinctions that are known to challenge LLMs in hardware contexts.

    Authors: We recognize the value of detailed error analysis. The revised Evaluation and Discussion section now contains: a confusion matrix for primary/secondary asset classification across all evaluated designs; a breakdown distinguishing control-path versus data-path assets (showing 95% recall on data-path assets but 82% on control-path assets); and a qualitative discussion of the five most common misclassification cases, including examples where the LLM missed implicit security assets in complex FSMs. We also added a paragraph discussing these limitations and outlining planned mitigations such as retrieval-augmented generation for hardware-specific context. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical framework with reported recall rates, no derivation chain or self-referential reduction

full rationale

The paper presents LAsset as an engineering framework that applies LLMs to identify assets from specs and RTL via structural/semantic analysis, then reports empirical recall (90% SoC, 93% IP) from experiments on designs. No equations, parameters, or mathematical derivations appear in the abstract or description. The accuracy figures are outcomes of applying the described process to external designs, not predictions forced by fitting or self-definition. No self-citation load-bearing steps, uniqueness theorems, or ansatz smuggling are referenced. The framework is self-contained as a practical artifact; its claims rest on experimental results rather than reducing to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The framework rests on the assumption that LLMs can reliably perform domain-specific semantic analysis of hardware descriptions without additional training or heavy prompt engineering; no free parameters, axioms, or invented entities are explicitly stated in the abstract.

pith-pipeline@v0.9.0 · 5518 in / 1081 out tokens · 20236 ms · 2026-05-16T17:49:30.948609+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

  1. [1]

    Rowen,Engineering the complex SOC: fast, flexible design with configurable processors

    C. Rowen,Engineering the complex SOC: fast, flexible design with configurable processors. Pearson Education, 2008

  2. [2]

    V . S. Chakravarthi and S. R. Koteshwar,SOC Advanced Architectures. Cham: Springer Nature Switzerland, 2023, pp. 127–138. [Online]. Available: https://doi.org/10.1007/978-3-031-36242-2 9

  3. [3]

    Challenges and trends in modern soc design verification,

    W. Chen, S. Ray, J. Bhadra, M. Abadir, and L.-C. Wang, “Challenges and trends in modern soc design verification,”IEEE Design & Test, vol. 34, no. 5, pp. 7–22, 2017

  4. [4]

    Complexity and the challenges of securing socs,

    P. Kocher, “Complexity and the challenges of securing socs,” inProceed- ings of the 48th Design Automation Conference, 2011, pp. 328–331

  5. [5]

    Towards secure composition of integrated circuits and electronic systems: On the role of eda,

    J. Knechtel, E. B. Kavun, F. Regazzoni, A. Heuser, A. Chattopadhyay, D. Mukhopadhyay, S. Dey, Y . Fei, Y . Belenky, I. Levi, T. G ¨uneysu, P. Schaumont, and I. Polian, “Towards secure composition of integrated circuits and electronic systems: On the role of eda,” in2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2020, pp. 508–513

  6. [6]

    High-level approaches to hardware se- curity: A tutorial,

    H. Pearce, R. Karri, and B. Tan, “High-level approaches to hardware se- curity: A tutorial,”ACM Transactions on Embedded Computing Systems, vol. 22, no. 3, pp. 1–40, 2023

  7. [7]

    Post-silicon validation opportu- nities, challenges and recent advances,

    S. Mitra, S. A. Seshia, and N. Nicolici, “Post-silicon validation opportu- nities, challenges and recent advances,” inProceedings of the 47th Design Automation Conference, 2010, pp. 12–17

  8. [8]

    Arm security technology-building a secure system using trust- zone technology,

    L. Arm, “Arm security technology-building a secure system using trust- zone technology,”PRD-GENC-C. ARM Ltd. Apr.(cit. on p.), Tech. Rep, Tech. Rep., 2009

  9. [9]

    Building trust in 3pip using asset-based security property verification,

    J. Portillo, E. John, and S. Narasimhan, “Building trust in 3pip using asset-based security property verification,” in2016 IEEE 34th VLSI Test Symposium (VTS). IEEE, 2016, pp. 1–6

  10. [10]

    Protects: Secure provisioning of system-on-chip assets in untrusted testing facility,

    P. Slpsk, J. Cruz, S. Ray, and S. Bhunia, “Protects: Secure provisioning of system-on-chip assets in untrusted testing facility,” in2023 IEEE International Test Conference India (ITC India). IEEE, 2023, pp. 1–6

  11. [11]

    Security vulnerability analysis of design-for-test exploits for asset pro- tection in socs,

    G. K. Contreras, A. Nahiyan, S. Bhunia, D. Forte, and M. Tehranipoor, “Security vulnerability analysis of design-for-test exploits for asset pro- tection in socs,” in2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 2017, pp. 617–622

  12. [12]

    Soc security architecture: Current practices and emerging needs,

    E. Peeters, “Soc security architecture: Current practices and emerging needs,” inProceedings of the 52Nd Annual Design Automation Confer- ence, 2015, pp. 1–6

  13. [13]

    Security Annotation for Electronic De- sign Integration Standard,

    Accellera Systems Initiative, “Security Annotation for Electronic De- sign Integration Standard,” https://www.accellera.org/images/downloads/ standards/Accellera SA-EDI Standard v10.pdf, apr. 2021. [Online]. Ac- cessed: 2024-02-04

  14. [14]

    Information flow coverage metrics for hardware security verification,

    A. Meza and R. Kastner, “Information flow coverage metrics for hardware security verification,”arXiv preprint arXiv:2304.08263, 2023

  15. [15]

    Towards property driven hardware security,

    W. Hu, A. Althoff, A. Ardeshiricham, and R. Kastner, “Towards property driven hardware security,” in2016 17th International Workshop on Microprocessor and SOC Test and Verification (MTV), 2016, pp. 51–56

  16. [16]

    SAIF: Automated Asset Identification for Security Verification at the Register Transfer Level,

    N. Farzana, A. Ayalasomayajula, F. Rahman, F. Farahmandi, and M. Tehranipoor, “SAIF: Automated Asset Identification for Security Verification at the Register Transfer Level,” in2021 IEEE 39th VLSI Test Symposium (VTS), 2021, pp. 1–7

  17. [17]

    LASHED: LLMs and static hardware analysis for early detection of RTL bugs,

    B. Ahmad, H. Pearce, R. Karri, and B. Tan, “LASHED: LLMs And Static Hardware Analysis for Early Detection of RTL Bugs,”arXiv preprint arXiv:2504.21770, 2025

  18. [18]

    Toward automated potential primary asset identification in verilog designs,

    S. K. D. Nath and B. Tan, “Toward automated potential primary asset identification in verilog designs,” in2025 26th International Symposium on Quality Electronic Design (ISQED). IEEE, 2025, pp. 1–7

  19. [19]

    Llm for soc security: A paradigm shift,

    D. Sahaet al., “Llm for soc security: A paradigm shift,”IEEE Access, vol. 12, pp. 155 498–155 521, 2024

  20. [20]

    Sv-llm: An agentic approach for soc security verification using large language models,

    ——, “Sv-llm: An agentic approach for soc security verification using large language models,”arXiv preprint arXiv:2506.20415, 2025

  21. [21]

    Socurellm: An llm-driven approach for large- scale system-on-chip security verification and policy generation,

    S. Tarek, D. Sahaet al., “Socurellm: An llm-driven approach for large- scale system-on-chip security verification and policy generation,” in2025 IEEE International Symposium on Hardware Oriented Security and Trust (HOST). IEEE, 2025, pp. 335–345

  22. [22]

    Special session: Threatlens: Llm-guided threat modeling and test plan generation for hardware security verification,

    D. Saha, H. Al Shaikh, S. Tarek, and F. Farahmandi, “Special session: Threatlens: Llm-guided threat modeling and test plan generation for hardware security verification,” in2025 IEEE 43rd VLSI Test Symposium (VTS), 2025, pp. 1–5

  23. [23]

    Bugwhisperer: Fine- tuning llms for soc hardware vulnerability detection,

    S. Tarek, D. Saha, S. K. Saha, and F. Farahmandi, “Bugwhisperer: Fine- tuning llms for soc hardware vulnerability detection,” in2025 IEEE 43rd VLSI Test Symposium (VTS). IEEE, 2025, pp. 1–5

  24. [24]

    Empowering hardware security with llm: The development of a vul- nerable hardware database,

    D. Saha, K. Yahyaei, S. K. Saha, M. Tehranipoor, and F. Farahmandi, “Empowering hardware security with llm: The development of a vul- nerable hardware database,” in2024 IEEE International Symposium on Hardware Oriented Security and Trust (HOST). IEEE, 2024, pp. 233– 243

  25. [25]

    Asset Identification for Electronic Design IP,

    IEEE P3164 Working Group, “Asset Identification for Electronic Design IP,” https://ieeexplore.ieee.org/document/10496567, pp. 1–26, apr. 2024. [Online]. Accessed: 2024-02-04

  26. [26]

    CWE - Common Weakness Enumeration,

    “CWE - Common Weakness Enumeration,” https://cwe.mitre.org/index. html, MITRE, 2024, accessed: 2024-02-04

  27. [27]

    OpenTitan: Open Source Silicon Root of Trust (RoT),

    “OpenTitan: Open Source Silicon Root of Trust (RoT),” https://opentitan. org/, accessed: 2024-02-04

  28. [28]

    OpenCores Projects,

    “OpenCores Projects,” https://opencores.org/projects, openCores. [On- line]. Accessed: 2024-02-04

  29. [29]

    Asset Dataset for Crypto, GPIO, and Peripheral IPs using Partial Keyword and Signal Category-based Automatic Tool,

    CalgaryISH, “Asset Dataset for Crypto, GPIO, and Peripheral IPs using Partial Keyword and Signal Category-based Automatic Tool,” https://github.com/CalgaryISH/Asset Dataset using PKG, gitHub. [On- line]. Accessed: 2024-02-04