Blueprint, Bootstrap, and Bridge: A Security Look at NVIDIA GPU Confidential Computing
Pith reviewed 2026-05-19 06:04 UTC · model grok-4.3
The pith
NVIDIA GPU confidential computing keeps data transfers protected across the CPU-GPU bridge under its threat model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By mapping the system's blueprint, bootstrap process, and bridge mechanisms, the authors establish that, under the GPU-CC threat model, data transfers along different paths remain protected when crossing between the trusted CPU and GPU domains.
What carries the argument
The GPU-CC bridge that coordinates hardware engines and software components to maintain isolation and protection for data moving between CPU and GPU domains.
If this is right
- Existing AI workloads can continue to use GPU-CC without code changes while retaining the claimed protections.
- Data movement between CPU and GPU domains stays isolated even across multiple transfer paths.
- The bootstrap sequence successfully coordinates hardware and software to activate those protections before workloads run.
- Security researchers gain a usable model for further inspection of the closed GPU-CC implementation.
Where Pith is reading between the lines
- Similar reconstruction techniques could be applied to other proprietary confidential-computing stacks to check cross-domain data protection.
- If the protections hold, GPU-CC could serve as a template for hardware vendors seeking to add secure acceleration to AI pipelines without requiring application rewrites.
- The findings open the door to automated tools that monitor the bridge for deviations from the expected protected behavior in production clusters.
Load-bearing premise
The reconstruction of a coherent system view from proprietary components is accurate enough to support the security experiments and conclusions.
What would settle it
An experiment that demonstrates a data leak or exposure along any transfer path when running under the documented GPU-CC threat model and configuration.
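One way to picture such an experiment is a canary scan: the workload writes a known plaintext marker, the tester captures whatever bytes are observable along a transfer path, and the capture is searched for the marker. The sketch below is illustrative only and is not the authors' method. The canary value, the helper names, and the stand-in capture buffers are all hypothetical; how a real capture would be obtained depends on the observation points the threat model grants the adversary, which this page does not specify.

```python
import math
import os
from collections import Counter


def find_canary(captured: bytes, canary: bytes) -> list[int]:
    """Return every offset at which the plaintext canary appears in the capture."""
    offsets = []
    start = captured.find(canary)
    while start != -1:
        offsets.append(start)
        # Resume one byte later so overlapping matches are also reported.
        start = captured.find(canary, start + 1)
    return offsets


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte; well-encrypted data should sit close to 8.0."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical marker written by the workload before the transfer.
CANARY = b"GPUCC-CANARY-0123456789abcdef"

# Stand-ins for captured buffers (assumption: random bytes model ciphertext).
protected_capture = os.urandom(4096)
leaky_capture = os.urandom(1024) + CANARY + os.urandom(1024)

if __name__ == "__main__":
    for name, capture in [("protected", protected_capture), ("leaky", leaky_capture)]:
        hits = find_canary(capture, CANARY)
        print(name, "canary offsets:", hits,
              "entropy: %.2f bits/byte" % shannon_entropy(capture))
```

A non-empty offset list would be direct evidence of plaintext exposure on that path. The entropy figure is only a coarse secondary signal, since high entropy does not by itself prove that data is encrypted.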
Original abstract
NVIDIA GPU Confidential Computing (GPU-CC) aims to provide secure execution for AI workloads. For end users, enabling GPU-CC is seamless and requires no modifications to existing applications. However, this ease of adoption relies on a proprietary and highly complex system that is difficult to inspect, creating challenges for researchers seeking to understand its architecture and security landscape. In this work, we provide a security look at GPU-CC by reconstructing a coherent view of the system. We first examine the system's blueprint, focusing on the specialized architectural engines that support its security mechanisms. We then analyze the bootstrap process, which coordinates hardware and software components to establish these protections. Finally, we conduct targeted experiments to assess whether, under the GPU-CC threat model, data transfers along different paths remain protected across the bridge between trusted CPU and GPU domains. We responsibly disclosed all security findings presented in this paper to the NVIDIA Product Security Incident Response Team (PSIRT).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper reconstructs a coherent architectural view of NVIDIA GPU Confidential Computing (GPU-CC) by first detailing the blueprint of specialized hardware engines supporting its security mechanisms, then analyzing the bootstrap process that coordinates hardware and software to establish trusted domains, and finally reporting targeted experiments assessing whether data transfers along different paths remain protected across the CPU-GPU bridge under the stated GPU-CC threat model. All security findings were responsibly disclosed to NVIDIA PSIRT.
Significance. If the reconstruction accurately captures the closed-source system and the experiments comprehensively cover relevant paths, the work would provide a valuable public analysis of protections for AI workloads on widely deployed NVIDIA hardware. The responsible disclosure and focus on a practical threat model are strengths; however, the absence of independent corroboration (such as a machine-checked model or vendor diagram) limits the strength of the security conclusions.
major comments (1)
- [Bridge / targeted experiments] The central claim that data transfers remain protected across the bridge depends on the completeness of the proprietary reconstruction described in the blueprint and bootstrap sections. Without additional validation (e.g., cross-checks against public documentation, a formal model, or explicit discussion of how omitted mechanisms were ruled out), it is difficult to confirm that all relevant data paths were examined in the targeted experiments.
minor comments (2)
- [Abstract] The abstract describes the methodology but does not summarize the concrete outcomes of the targeted experiments (e.g., which paths were tested and what the results showed).
- Notation for trust domains and data paths could be introduced earlier with a diagram to improve readability when discussing the bridge experiments.
Simulated Author's Rebuttal
We thank the referee for their constructive review and for recognizing the strengths of our responsible disclosure and practical threat model focus. We address the major comment below, revising the manuscript to improve transparency around our reconstruction methodology while remaining honest about inherent limitations of analyzing a closed-source system.
- Referee: [Bridge / targeted experiments] The central claim that data transfers remain protected across the bridge depends on the completeness of the proprietary reconstruction described in the blueprint and bootstrap sections. Without additional validation (e.g., cross-checks against public documentation, a formal model, or explicit discussion of how omitted mechanisms were ruled out), it is difficult to confirm that all relevant data paths were examined in the targeted experiments.
  Authors: We agree that the strength of our claims on protected data transfers rests on the accuracy and completeness of the blueprint and bootstrap reconstruction. Our analysis draws from public NVIDIA documentation on GPU-CC, hardware interface probing, and empirical experiments across multiple transfer paths under the stated threat model. In the revised manuscript, we have added an explicit subsection detailing our cross-checks against available public specifications, the criteria used to identify and rule out alternative mechanisms, and consistency verification across GPU configurations. These additions aim to make the coverage of relevant paths more transparent. A machine-checked formal model or internal vendor diagrams cannot be provided, as the system is proprietary and such artifacts are not accessible through public channels or the disclosure process.
  Revision: partial
Circularity Check
No significant circularity in the derivation chain
The paper reconstructs a coherent view of the proprietary GPU-CC system by examining its blueprint and bootstrap process, then performs targeted experiments to assess data transfer protections under the stated threat model. This chain relies on empirical observation and analysis rather than any self-referential definitions, fitted parameters renamed as predictions, or load-bearing self-citations. No equations or derivations are presented that reduce the central claim to its own inputs by construction; the assessment is externally falsifiable via the experiments and responsible disclosure process. The reconstruction serves as an input step whose accuracy is a standard limitation for closed-source systems, not a circularity mechanism.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The GPU-CC threat model accurately reflects the intended security boundaries.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We instrument the GPU kernel module... conduct a series of experiments to identify the security weaknesses... For certain components... we propose well-reasoned speculations about their inner working mechanisms."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 4 Pith papers
- GPUBreach: Privilege Escalation Attacks on GPUs using Rowhammer
  Unprivileged CUDA kernels can use Rowhammer to tamper with GPU page tables for targeted privilege escalation, leaking cryptographic keys and escalating to CPU root access by bypassing IOMMU.
- Revealing NVIDIA Closed-Source Driver Command Streams for CPU-GPU Runtime Behavior Insight
  A technique recovers complete GPU hardware command streams from NVIDIA's closed-source CUDA driver via kernel instrumentation and doorbell watchpoints, demonstrated on data movement and CUDA Graphs.
- When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI
  A survey providing a taxonomy of TEE platforms, an agent-centric threat model, and open challenges for applying confidential computing to secure agentic AI systems.
- When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI
  A structured survey of confidential computing for agentic AI that catalogs TEE platforms, agent-specific threats, transferable defenses, and remaining gaps in end-to-end frameworks.
Platform configurations (paper Appendix A)
Our host system is equipped with dual-socket AMD EPYC 9634 84-core processors, with SEV-SNP enabled, and an 8-GPU NVIDIA H100 SXM5 setup. Each GPU has 80 GB of GPU memory...