Hardware-Level Governance of AI Compute: A Feasibility Taxonomy for Regulatory Compliance and Treaty Verification
Pith reviewed 2026-05-10 19:41 UTC · model grok-4.3
The pith
A taxonomy of twenty hardware mechanisms reveals that those required for verifying AI treaties are the least technically mature.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that hardware-level governance mechanisms can be systematically classified into twenty types, each with a technical description, a feasibility rating, and vulnerability notes, and that this classification maps unevenly onto governance scenarios. Mature mechanisms suffice for domestic regulation and industry self-regulation, but multilateral treaty verification depends on the least mature ones, including on-chip metering, cryptographic proof-of-training, and hardware-embedded enforcement. The window for implementing them is narrowing, because hardware-level governance is implementable only while semiconductor manufacturing remains concentrated.
What carries the argument
A taxonomy of twenty hardware-level governance mechanisms organized by monitoring, verification, and enforcement functions, each with a technical description, four-point feasibility rating, and adversarial vulnerability assessment.
If this is right
- Domestic regulation can proceed with currently deployable monitoring and basic enforcement mechanisms.
- Bilateral agreements require intermediate-feasibility verification tools that still need engineering work.
- Multilateral treaty verification cannot be supported until on-chip metering, cryptographic proof-of-training, and hardware-embedded enforcement reach higher maturity.
- Industry self-regulation can use existing software and firmware controls without new hardware mandates.
- Threats from algorithmic efficiency gains and distributed training reduce the reliability of any compute-based governance approach.
Where Pith is reading between the lines
- Without accelerated R&D on the low-feasibility mechanisms, the concentration of chip production may end before verification systems are ready.
- Hardware governance could be combined with supply-chain tracking to address sovereignty concerns raised by distributed training.
- The adversary-tiered analysis suggests that tamper-evident designs rather than perfect tamper-proofing may be sufficient for treaty contexts.
- Prototyping the highest-priority mechanisms would allow empirical refinement of the four-point feasibility scale.
Load-bearing premise
The twenty mechanisms identified are comprehensive, and their feasibility can be reliably rated on a four-point scale using only high-level technical descriptions, without prototypes or detailed adversarial testing.
What would settle it
Successful construction and independent testing of a tamper-evident on-chip compute metering system against nation-state adversaries that produces verifiable logs without false negatives would support the feasibility ratings; failure to do so or discovery of a major unlisted mechanism that changes the overall landscape would falsify the taxonomy's completeness.
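To make "tamper-evident" and "verifiable logs" concrete, the following is a minimal software sketch of a hash-chained metering log, assuming SHA-256 and JSON-encoded records. It is not the paper's design and says nothing about on-chip implementation; the names `GENESIS`, `entry_digest`, `append`, and `verify`, and the record fields `interval` and `flop_estimate`, are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the chain


def entry_digest(prev_digest: str, record: dict) -> str:
    """Hash a record together with the digest of the entry before it."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_digest.encode("utf-8") + payload).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a metering record, chaining it to the previous entry."""
    prev = log[-1]["digest"] if log else GENESIS
    log.append({"record": record, "digest": entry_digest(prev, record)})


def verify(log: list) -> bool:
    """Recompute the chain from the start and compare stored digests.

    Editing or reordering entries, or removing any entry but the last,
    makes at least one stored digest disagree with the recomputed one.
    """
    prev = GENESIS
    for entry in log:
        if entry["digest"] != entry_digest(prev, entry["record"]):
            return False
        prev = entry["digest"]
    return True


log = []
append(log, {"interval": 0, "flop_estimate": 1.0e15})
append(log, {"interval": 1, "flop_estimate": 2.5e15})
print(verify(log))  # True

log[0]["record"]["flop_estimate"] = 1.0e12  # simulate under-reporting
print(verify(log))  # False
```

A chain like this only shows that tampering occurred, which is the tamper-evident standard the abstract argues for. It does not detect truncation of the most recent entries unless a verifier separately holds the latest digest, and it does not stop an adversary who can rewrite the whole log and recompute every digest. Closing that gap is where signing keys held in hardware would come in.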
Original abstract
The governance of frontier AI increasingly relies on controlling access to computational resources, yet the hardware-level mechanisms invoked by policy proposals remain largely unexamined from an engineering perspective. This paper bridges the gap between AI governance and computer engineering by proposing a taxonomy of 20 hardware-level governance mechanisms, organised by function (monitoring, verification, enforcement) and assessed for technical feasibility on a four-point scale from currently deployable to speculative. For each mechanism, we provide a technical description, a feasibility rating, and an identification of adversarial vulnerabilities. We map the taxonomy onto four governance scenarios: domestic regulation, bilateral agreements, multilateral treaty verification, and industry self-regulation. Our analysis reveals a structural mismatch: the mechanisms most needed for treaty verification, including on-chip compute metering, cryptographic proof-of-training, and hardware-embedded enforcement, are also the least mature. We assess principal threats to compute-based governance, including algorithmic efficiency gains, distributed training methods, and sovereignty concerns. We identify a temporal constraint: the window during which semiconductor manufacturing concentration makes hardware-level governance implementable is narrowing, while R&D timelines for critical mechanisms span years. We present an adversary-tiered threat analysis distinguishing commercial, non-state, and nation-state actors, arguing the appropriate security standard is tamper-evident assurance analogous to IAEA verification rather than absolute tamper-proofing. The taxonomy, feasibility classification, and mechanism-to-scenario mapping provide a technical foundation for policymakers and identify the R&D investments required before hardware-level governance can support verifiable international agreements.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a taxonomy of 20 hardware-level mechanisms for AI compute governance, organized by monitoring, verification, and enforcement functions. Each mechanism receives a technical description, a four-point feasibility rating (currently deployable to speculative), and an assessment of adversarial vulnerabilities. The taxonomy is mapped to four scenarios (domestic regulation, bilateral agreements, multilateral treaty verification, industry self-regulation). The central claim is a structural mismatch: mechanisms most needed for treaty verification (on-chip compute metering, cryptographic proof-of-training, hardware-embedded enforcement) are the least mature. The work also analyzes threats including algorithmic efficiency gains and distributed training, identifies a narrowing temporal window due to semiconductor concentration, and advocates tamper-evident assurance standards calibrated to adversary tiers.
Significance. If the taxonomy and mismatch hold, the paper supplies a concrete engineering foundation for AI governance policy, explicitly identifying R&D gaps that must be closed before hardware mechanisms can support verifiable multilateral treaties. The structured mapping of mechanisms to scenarios and the adversary-tiered threat model (distinguishing commercial, non-state, and nation-state actors) are particular strengths that could guide targeted standards development and investment prioritization.
major comments (2)
- [§4 (Feasibility Assessment)] The four-point feasibility ratings assigned to on-chip compute metering, cryptographic proof-of-training, and hardware-embedded enforcement rest on narrative technical descriptions and enumerated vulnerabilities without quantitative benchmarks, cycle-accurate estimates, reference implementations, or direct comparison to existing TEE primitives (e.g., Intel TDX or ARM CCA extensions). Because these ratings directly support the structural-mismatch conclusion in §5, the absence of such anchors makes the 'least mature' classification difficult to evaluate or reproduce.
- [§5 (Scenario Mapping and Structural Mismatch)] The claim that treaty-verification needs concentrate on the lowest-maturity mechanisms assumes an implicit weighting of 'need' versus 'maturity' across the 20 mechanisms, yet no explicit criteria, scoring rubric, or sensitivity analysis for this weighting is provided. Reclassification of even a few mechanisms could alter the mismatch result, which is load-bearing for the paper's policy implications.
minor comments (3)
- [§4] The four-point feasibility scale is described in prose but would benefit from an explicit table listing the exact definitions of each level and the decision rules used for assignment.
- [§3] The list of 20 mechanisms would be easier to navigate if presented in a single summary table with columns for function category, feasibility rating, and primary scenario applicability.
- [§6] A small number of citations to recent hardware-security literature on TEE attestation and remote metering appear to be missing from the threat-analysis section.
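The summary table requested in the minor comments could be backed by a small record type. The sketch below is one possible shape, not the paper's. The source names only the endpoints of the four-point scale, so the two middle level names are placeholders; the three mechanism names come from the abstract, while their function, rating, and scenario values are illustrative assumptions. `Feasibility`, `Mechanism`, `ROWS`, and `render` are hypothetical names.

```python
from dataclasses import dataclass
from enum import IntEnum


class Feasibility(IntEnum):
    # Only the endpoints are named in the source; LEVEL_2 and LEVEL_3
    # are hypothetical placeholders for the two middle levels.
    CURRENTLY_DEPLOYABLE = 1
    LEVEL_2 = 2
    LEVEL_3 = 3
    SPECULATIVE = 4


@dataclass(frozen=True)
class Mechanism:
    name: str
    function: str            # "monitoring", "verification", or "enforcement"
    feasibility: Feasibility
    scenarios: tuple         # governance scenarios the mechanism serves


# Placeholder rows: the names appear in the abstract, but the function,
# rating, and scenario values are illustrative, not the paper's assignments.
ROWS = [
    Mechanism("on-chip compute metering", "monitoring",
              Feasibility.LEVEL_3, ("multilateral treaty verification",)),
    Mechanism("cryptographic proof-of-training", "verification",
              Feasibility.SPECULATIVE, ("multilateral treaty verification",)),
    Mechanism("hardware-embedded enforcement", "enforcement",
              Feasibility.LEVEL_3, ("multilateral treaty verification",)),
]


def render(rows):
    """Return a fixed-width text table sorted by function, then rating."""
    header = f"{'Mechanism':<36}{'Function':<14}{'Rating':<22}Scenarios"
    lines = [header, "-" * len(header)]
    for m in sorted(rows, key=lambda m: (m.function, m.feasibility)):
        lines.append(f"{m.name:<36}{m.function:<14}"
                     f"{m.feasibility.name:<22}{', '.join(m.scenarios)}")
    return "\n".join(lines)


print(render(ROWS))
```

Using an ordered enum for the rating keeps comparisons such as "less mature than" explicit, which is what the decision rules requested for §4 would need to pin down.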
Simulated Author's Rebuttal
We thank the referee for these constructive comments, which identify areas where additional technical grounding and transparency can strengthen the manuscript. We address each major comment below, indicating the revisions we intend to make.
- Referee: [§4 (Feasibility Assessment)] The four-point feasibility ratings assigned to on-chip compute metering, cryptographic proof-of-training, and hardware-embedded enforcement rest on narrative technical descriptions and enumerated vulnerabilities without quantitative benchmarks, cycle-accurate estimates, reference implementations, or direct comparison to existing TEE primitives (e.g., Intel TDX or ARM CCA extensions). Because these ratings directly support the structural-mismatch conclusion in §5, the absence of such anchors makes the 'least mature' classification difficult to evaluate or reproduce.
Authors: We agree that the feasibility ratings in §4 are qualitative and rest on narrative synthesis of existing hardware-security literature rather than new empirical measurements or implementations. This reflects the paper's scope as a taxonomy identifying engineering gaps rather than a systems paper presenting novel hardware. In revision we will expand §4 with explicit comparisons to deployed TEE primitives (Intel TDX and ARM CCA), citing published overhead and attestation benchmarks from those systems. We will also add a short subsection explaining the rating criteria for each mechanism and noting why cycle-accurate estimates are unavailable for the proposed mechanisms (they do not yet exist in silicon). These additions will improve evaluability and reproducibility while leaving the four-point scale and the 'least mature' designation unchanged.
Revision: partial
- Referee: [§5 (Scenario Mapping and Structural Mismatch)] The claim that treaty-verification needs concentrate on the lowest-maturity mechanisms assumes an implicit weighting of 'need' versus 'maturity' across the 20 mechanisms, yet no explicit criteria, scoring rubric, or sensitivity analysis for this weighting is provided. Reclassification of even a few mechanisms could alter the mismatch result, which is load-bearing for the paper's policy implications.
Authors: The mismatch conclusion follows from the functional requirements of treaty verification (cryptographic remote attestation and tamper-evident enforcement) mapping onto mechanisms that current hardware does not provide at scale. We acknowledge that the weighting was presented implicitly. In revision we will insert an explicit criteria table in §5 that defines 'need' for each scenario according to required assurance properties (e.g., remote verifiability for multilateral settings). We will also add a brief sensitivity discussion examining the effect of reclassifying two or three borderline mechanisms; this analysis shows the core mismatch persists. These changes will make the weighting transparent without altering the paper's central policy implication.
Revision: yes
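One way the sensitivity discussion described above could be made reproducible is a brute-force check over reclassifications. The sketch below is an assumption about how such a check might look, not the authors' analysis. The ratings, mechanism labels `m1` to `m6`, the scenario requirements, and the definition of the gap (mean rating of treaty mechanisms minus mean rating of domestic ones, where a higher rating means less mature) are all hypothetical.

```python
from itertools import combinations, product
from statistics import mean

# Hypothetical ratings on a 1 (currently deployable) to 4 (speculative)
# scale, and hypothetical scenario requirements; none are from the paper.
RATINGS = {"m1": 1, "m2": 1, "m3": 2, "m4": 3, "m5": 4, "m6": 4}
REQUIRED = {
    "domestic": ["m1", "m2", "m3"],
    "treaty": ["m4", "m5", "m6"],
}


def gap(ratings):
    """Mean rating of treaty mechanisms minus mean rating of domestic ones."""
    return (mean(ratings[m] for m in REQUIRED["treaty"])
            - mean(ratings[m] for m in REQUIRED["domestic"]))


def min_gap_after_reclassifying(ratings, k):
    """Smallest gap reachable by shifting up to k mechanisms by one level."""
    worst = gap(ratings)
    for n in range(1, k + 1):
        for chosen in combinations(ratings, n):
            for shifts in product((-1, 1), repeat=n):
                trial = dict(ratings)
                for name, shift in zip(chosen, shifts):
                    # Clamp so a shifted rating stays on the 1..4 scale.
                    trial[name] = min(4, max(1, trial[name] + shift))
                worst = min(worst, gap(trial))
    return worst


print(gap(RATINGS))
print(min_gap_after_reclassifying(RATINGS, 3))
```

If the smallest gap stays positive for k of two or three, the mismatch survives that many borderline reclassifications under this particular definition. A different weighting of 'need' would require a different `gap` function, which is the point of the referee's request for explicit criteria.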
Circularity Check
No circularity: the taxonomy and feasibility assessment form a self-contained classification.
The paper constructs a taxonomy of 20 mechanisms, supplies high-level technical descriptions for each, assigns four-point feasibility ratings, and maps them to four governance scenarios. No equations, fitted parameters, derivations, or predictions appear anywhere in the text. The structural-mismatch conclusion is a direct summary of the classification results rather than a reduction to any prior input or self-citation. No load-bearing step relies on self-definition, renaming of known results, or imported uniqueness theorems; the work remains an independent mapping exercise.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Hardware mechanisms can be feasibly assessed for technical readiness on a four-point scale using high-level descriptions without requiring prototypes or full adversarial analysis.