Blueprint, Bootstrap, and Bridge: A Security Look at NVIDIA GPU Confidential Computing

arXiv: 2507.02770 · v2 · submitted 2025-07-03 · cs.CR

Zhongshu Gu, Enriquillo Valdez, Salman Ahmed, Julian James Stephen, Michael Le, Hani Jamjoom, Shixuan Zhao, Zhiqiang Lin

Pith reviewed 2026-05-19 06:04 UTC · model grok-4.3

keywords: GPU Confidential Computing, NVIDIA GPU, Security Analysis, Confidential Computing, Data Protection, System Architecture, Bootstrap Process, AI Workload Security

The pith

NVIDIA GPU confidential computing keeps data transfers protected across the CPU-GPU bridge under its threat model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reconstructs a view of NVIDIA's proprietary GPU Confidential Computing (GPU-CC) system to examine its security architecture and test whether data stays protected during transfers. It breaks down the specialized hardware engines that enforce isolation, the bootstrap sequence that activates those protections, and targeted checks on data movement between the trusted CPU domain and the GPU. A reader would care because GPU-CC lets existing AI applications run securely with no code changes, yet its closed design leaves open questions about whether those protections actually hold for all transfer paths.

Core claim

By mapping the system's blueprint, bootstrap process, and bridge mechanisms, the authors establish that under the GPU-CC threat model data transfers along different paths remain protected when crossing between the trusted CPU and GPU domains.

What carries the argument

The GPU-CC bridge that coordinates hardware engines and software components to maintain isolation and protection for data moving between CPU and GPU domains.

If this is right

Existing AI workloads can continue to use GPU-CC without code changes while retaining the claimed protections.
Data movement between CPU and GPU domains stays isolated even across multiple transfer paths.
The bootstrap sequence successfully coordinates hardware and software to activate those protections before workloads run.
Security researchers gain a usable model for further inspection of the closed GPU-CC implementation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar reconstruction techniques could be applied to other proprietary confidential-computing stacks to check cross-domain data protection.
If the protections hold, GPU-CC could serve as a template for hardware vendors seeking to add secure acceleration to AI pipelines without requiring application rewrites.
The findings open the door to automated tools that monitor the bridge for deviations from the expected protected behavior in production clusters.

Load-bearing premise

The reconstruction of a coherent system view from proprietary components is accurate enough to support the security experiments and conclusions.

What would settle it

An experiment that demonstrates a data leak or exposure along any transfer path when running under the documented GPU-CC threat model and configuration.
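
One way to make such an experiment concrete is to check whether bytes captured from a CPU-GPU staging buffer look like ciphertext. The sketch below is an illustration only, not the authors' method: it uses Shannon entropy as a rough heuristic, and the function names and the 7.0 bits-per-byte threshold are assumptions chosen for this example.

```python
import math
from collections import Counter


def shannon_entropy(buf: bytes) -> float:
    """Return the Shannon entropy of buf in bits per byte (0.0 to 8.0)."""
    if not buf:
        return 0.0
    n = len(buf)
    counts = Counter(buf)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_like_plaintext(buf: bytes, threshold: float = 7.0) -> bool:
    """Flag a buffer whose entropy falls below the (assumed) threshold.

    Low entropy suggests structured, unencrypted data. High entropy is
    only consistent with encryption; compressed data scores high as well.
    """
    return shannon_entropy(buf) < threshold


if __name__ == "__main__":
    # Hypothetical captures: repetitive text versus a uniform byte spread.
    print(looks_like_plaintext(b"model weights " * 256))
    print(looks_like_plaintext(bytes(range(256)) * 16))
```

A flagged buffer would only be a starting point: confirming a leak still requires showing that the exposed bytes correspond to data the threat model says should be protected.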

Figures

Figures reproduced from arXiv: 2507.02770 by Enriquillo Valdez, Hani Jamjoom, Julian James Stephen, Michael Le, Salman Ahmed, Shixuan Zhao, Zhiqiang Lin, Zhongshu Gu.

Figure 2: The Software/Hardware Elements in GPU-CC

Figure 3: Classification of Data Read via GPU’s BAR0

Figure 4: Information Leakage in GSP-RM RPC. … queue elements, one entry for the RX queue’s header, and 63 entries for RX queue elements. If an attacker gains access to the physical address table, they can easily locate all elements in the TX/RX queues. For example, the second entry (0x000000016921f000) points to the TX/RX queue’s header, where the readPtr and writePtr fields indicate the next elements to read and wri…

Figure 5: Timing Channels in CPU-GSP Memory Transfers

Figure 6: The Interactions among SEC2, WLC, and LCIC

Figure 7: GPU’s Device and RIM File Certificate Chains
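
The text accompanying Figure 4 describes a physical address table whose entries point at a queue header and at the queue elements, with readPtr and writePtr stored in the header. The sketch below illustrates why exposure of that table matters: once the element addresses and the two pointers are known, the in-flight elements can be enumerated directly. The class and function names, the index-based pointer semantics, and the synthetic addresses are all assumptions made for illustration; this is not the GSP-RM layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QueueHeader:
    read_ptr: int   # assumed: index of the next element to read
    write_ptr: int  # assumed: index of the next element to write


def pending_element_addrs(element_pages: List[int], hdr: QueueHeader) -> List[int]:
    """List the addresses of elements written but not yet read.

    Treats the queue as a ring buffer, so the range from read_ptr up to
    (but excluding) write_ptr wraps around the end of element_pages.
    """
    n = len(element_pages)
    count = (hdr.write_ptr - hdr.read_ptr) % n
    return [element_pages[(hdr.read_ptr + i) % n] for i in range(count)]


if __name__ == "__main__":
    # Synthetic, page-aligned placeholder addresses for 63 queue elements.
    pages = [0x1000 * (i + 1) for i in range(63)]
    hdr = QueueHeader(read_ptr=61, write_ptr=2)
    print([hex(a) for a in pending_element_addrs(pages, hdr)])
```

With equal pointers the ring is treated as empty, which is a common convention but also an assumption here.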

Original abstract

NVIDIA GPU Confidential Computing (GPU-CC) aims to provide secure execution for AI workloads. For end users, enabling GPU-CC is seamless and requires no modifications to existing applications. However, this ease of adoption relies on a proprietary and highly complex system that is difficult to inspect, creating challenges for researchers seeking to understand its architecture and security landscape. In this work, we provide a security look at GPU-CC by reconstructing a coherent view of the system. We first examine the system's blueprint, focusing on the specialized architectural engines that support its security mechanisms. We then analyze the bootstrap process, which coordinates hardware and software components to establish these protections. Finally, we conduct targeted experiments to assess whether, under the GPU-CC threat model, data transfers along different paths remain protected across the bridge between trusted CPU and GPU domains. We responsibly disclosed all security findings presented in this paper to the NVIDIA Product Security Incident Response Team (PSIRT).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper reconstructs the NVIDIA GPU-CC architecture and bootstrap process, then runs targeted checks on protected data transfers across the CPU-GPU bridge, but everything rests on an unverified mapping of a proprietary system.

The main thing to know is that this is the first public attempt to map out the actual hardware engines, bootstrap sequence, and data paths in NVIDIA's GPU confidential computing. The authors describe specialized architectural components that enforce the protections and then test whether transfers stay safe under the GPU-CC threat model when crossing the bridge between CPU and GPU domains. That reconstruction and the bridge-specific experiments are what is new here compared with prior vendor summaries. The authors also did the right thing by disclosing findings to NVIDIA PSIRT.

The work gives a concrete picture that was not available before, which is useful for anyone who needs to reason about real deployments rather than just high-level specs. The experiments are presented as targeted probes rather than exhaustive testing, which fits the closed nature of the platform.

The central limitation is that the entire assessment depends on the accuracy and completeness of the authors' own reconstruction. Because the system is proprietary, an omitted engine or mis-mapped trust boundary could mean some transfer paths were not actually covered. The abstract gives only a high-level description of the tests, so it is hard to judge the exact methodology, coverage, or whether edge cases were considered. No formal model or external confirmation is provided to back the blueprint.

This paper is for security researchers and practitioners who work with confidential AI workloads on GPUs in data centers or clouds. A reader who wants a practical view of how these protections are actually built would get value from the architectural breakdown and the experimental approach.

I would send it to peer review. The topic is timely for deployed confidential computing, the reconstruction fills a real gap, and the experiments are a reasonable starting point even if they need more detail and independent checks in revision.

Referee Report

1 major / 2 minor

Summary. The paper reconstructs a coherent architectural view of NVIDIA GPU Confidential Computing (GPU-CC) by first detailing the blueprint of specialized hardware engines supporting its security mechanisms, then analyzing the bootstrap process that coordinates hardware and software to establish trusted domains, and finally reporting targeted experiments assessing whether data transfers along different paths remain protected across the CPU-GPU bridge under the stated GPU-CC threat model. All security findings were responsibly disclosed to NVIDIA PSIRT.

Significance. If the reconstruction accurately captures the closed-source system and the experiments comprehensively cover relevant paths, the work would provide a valuable public analysis of protections for AI workloads on widely deployed NVIDIA hardware. The responsible disclosure and focus on a practical threat model are strengths; however, the absence of independent corroboration (such as a machine-checked model or vendor diagram) limits the strength of the security conclusions.

major comments (1)

[Bridge / targeted experiments] The central claim that data transfers remain protected across the bridge depends on the completeness of the proprietary reconstruction described in the blueprint and bootstrap sections. Without additional validation (e.g., cross-checks against public documentation, a formal model, or explicit discussion of how omitted mechanisms were ruled out), it is difficult to confirm that all relevant data paths were examined in the targeted experiments.

minor comments (2)

[Abstract] The abstract describes the methodology but does not summarize the concrete outcomes of the targeted experiments (e.g., which paths were tested and what the results showed).
Notation for trust domains and data paths could be introduced earlier with a diagram to improve readability when discussing the bridge experiments.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive review and for recognizing the strengths of our responsible disclosure and practical threat model focus. We address the major comment below, revising the manuscript to improve transparency around our reconstruction methodology while remaining honest about inherent limitations of analyzing a closed-source system.

Referee: [Bridge / targeted experiments] The central claim that data transfers remain protected across the bridge depends on the completeness of the proprietary reconstruction described in the blueprint and bootstrap sections. Without additional validation (e.g., cross-checks against public documentation, a formal model, or explicit discussion of how omitted mechanisms were ruled out), it is difficult to confirm that all relevant data paths were examined in the targeted experiments.

Authors: We agree that the strength of our claims on protected data transfers rests on the accuracy and completeness of the blueprint and bootstrap reconstruction. Our analysis draws from public NVIDIA documentation on GPU-CC, hardware interface probing, and empirical experiments across multiple transfer paths under the stated threat model. In the revised manuscript, we have added an explicit subsection detailing our cross-checks against available public specifications, the criteria used to identify and rule out alternative mechanisms, and consistency verification across GPU configurations. These additions aim to make the coverage of relevant paths more transparent. A machine-checked formal model or internal vendor diagrams cannot be provided, as the system is proprietary and such artifacts are not accessible through public channels or the disclosure process. revision: partial

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

The paper reconstructs a coherent view of the proprietary GPU-CC system by examining its blueprint and bootstrap process, then performs targeted experiments to assess data transfer protections under the stated threat model. This chain relies on empirical observation and analysis rather than any self-referential definitions, fitted parameters renamed as predictions, or load-bearing self-citations. No equations or derivations are presented that reduce the central claim to its own inputs by construction; the assessment is externally falsifiable via the experiments and responsible disclosure process. The reconstruction serves as an input step whose accuracy is a standard limitation for closed-source systems, not a circularity mechanism.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The analysis depends on the vendor-defined threat model and the feasibility of reconstructing internal behavior from observable interfaces without full source access.

axioms (1)

domain assumption: The GPU-CC threat model accurately reflects the intended security boundaries.
All experiments and conclusions are conditioned on this model as stated by NVIDIA.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

We instrument the GPU kernel module... conduct a series of experiments to identify the security weaknesses... For certain components... we propose well-reasoned speculations about their inner working mechanisms.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

GPUBreach: Privilege Escalation Attacks on GPUs using Rowhammer
cs.CR · 2026-05 · unverdicted · novelty 8.0
Unprivileged CUDA kernels can use Rowhammer to tamper with GPU page tables for targeted privilege escalation, leaking cryptographic keys and escalating to CPU root access by bypassing IOMMU.

Revealing NVIDIA Closed-Source Driver Command Streams for CPU-GPU Runtime Behavior Insight
cs.PF · 2026-04 · conditional · novelty 7.0
A technique recovers complete GPU hardware command streams from NVIDIA's closed-source CUDA driver via kernel instrumentation and doorbell watchpoints, demonstrated on data movement and CUDA Graphs.

When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI
cs.CR · 2026-05 · unverdicted · novelty 5.0
A survey providing a taxonomy of TEE platforms, an agent-centric threat model, and open challenges for applying confidential computing to secure agentic AI systems.

When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI
cs.CR · 2026-05 · unverdicted · novelty 4.0
A structured survey of confidential computing for agentic AI that catalogs TEE platforms, agent-specific threats, transferable defenses, and remaining gaps in end-to-end frameworks.

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · cited by 3 Pith papers

[1] AMD. 2020. Strengthening VM isolation with integrity protection and more. AMD (2020)
[2] AMD. 2023. AMD SEV-TIO: Trusted I/O for Secure Encrypted Virtualization. AMD (2023)
[3] Phil Rogers and Antoine Delignat-Lavaud. 2023. Hopper Confidential Computing: How it Works under the Hood. https://www.nvidia.com/en-us/on-demand/session/gtcspring23-s51709/
[4, 5] Henk Birkholz, Dave Thaler, Michael Richardson, Ned Smith, and Wei Pan. Remote ATtestation procedureS (RATS) Architecture. RFC 9334. https://doi.org/10.17487/RFC9334
[6] Pau-Chen Cheng, Kevin Eykholt, Zhongshu Gu, Hani Jamjoom, KR Jayaram, Enriquillo Valdez, and Ashish Verma. 2024. Deta: Minimizing data leaks in federated learning via decentralized and trustworthy aggregation. In Proceedings of the nineteenth european conference on computer systems. 219–235
[7] Pau-Chen Cheng, Wojciech Ozga, Enriquillo Valdez, Salman Ahmed, Zhongshu Gu, Hani Jamjoom, Hubertus Franke, and James Bottomley. 2024. Intel tdx demystified: A top-down approach. Comput. Surveys 56, 9 (2024), 1–33
[8] Yunjie Deng, Chenxu Wang, Shunchang Yu, Shiqing Liu, Zhenyu Ning, Kevin Leach, Jin Li, Shoumeng Yan, Zhengyu He, Jiannong Cao, et al. 2022. Strongbox: A gpu tee on arm endpoints. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security. 769–783
[9] Gobikrishna Dhanuskodi, Sudeshna Guha, Vidhya Krishnan, Aruna Manjunatha, Rob Nertney, Michael O’Connor, and Phil Rogers. 2023. Creating the First Confidential GPUs. Commun. ACM 67, 1 (2023), 60–67
[10] Hubert Eichner, Daniel Ramage, Kallista Bonawitz, Dzmitry Huba, Tiziano Santoro, Brett McLarnon, Timon Van Overveldt, Nova Fallen, Peter Kairouz, Albert Cheu, et al. 2024. Confidential federated computations. arXiv preprint arXiv:2404.10764 (2024)
[11] Zhongshu Gu, Hani Jamjoom, Dong Su, Heqing Huang, Jialong Zhang, Tengfei Ma, Dimitrios Pendarakis, and Ian Molloy. 2019. Reaching data confidentiality and model accountability on the caltrain. In 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE, 336–348
[12] Jinnan Guo, Peter Pietzuch, Andrew Paverd, and Kapil Vaswani. 2024. Trustworthy AI using Confidential Federated Learning: Federated learning and confidential computing are not competing technologies. Queue 22, 2 (2024), 87–107
[13] Intel. 2023. Intel® TDX Connect Architecture Specification. https://cdrdv2.intel.com/v1/dl/getContent/773614. (2023)
[14] Intel. 2023. Intel® Trust Domain Extensions. https://cdrdv2.intel.com/v1/dl/getContent/690419. (2023)
[15] Andrei Ivanov, Benjamin Rothenberger, Arnaud Dethise, Marco Canini, Torsten Hoefler, and Adrian Perrig. 2023. SAGE: Software-based Attestation for GPU Execution. In 2023 USENIX Annual Technical Conference (USENIX ATC 23). 485–499
[16, 17] Insu Jang, Adrian Tang, Taehoon Kim, Simha Sethumadhavan, and Jaehyuk Huh. Heterogeneous isolated execution for commodity gpus. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems. 455–468
[18] Jianyu Jiang, Ji Qi, Tianxiang Shen, Xusheng Chen, Shixiong Zhao, Sen Wang, Li Chen, Gong Zhang, Xiapu Luo, and Heming Cui. 2022. CRONUS: Fault-isolated, secure and high-performance heterogeneous computing for trusted execution environment. In 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 124–143
[19] David Kaplan. 2017. Protecting vm register state with sev-es. AMD (2017)
[20] David Kaplan, Jeremy Powell, and Tom Woller. 2016. AMD memory encryption. AMD (2016)
[21] Sunho Lee, Jungwoo Kim, Seonjin Na, Jongse Park, and Jaehyuk Huh. 2022. Tnpu: Supporting trusted execution with tree-less integrity protection for neural processing unit. In 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 229–243
[22] Haohui Mai, Jiacheng Zhao, Hongren Zheng, Yiyang Zhao, Zibin Liu, Mingyu Gao, Cong Wang, Huimin Cui, Xiaobing Feng, and Christos Kozyrakis. 2023. Honeycomb: Secure and efficient GPU executions via static validation. In 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23). 155–172
[23] MICROCHIP. [n. d.]. CEC1736 Real-Time Platform Root of Trust Controller. https://www.microchip.com/en-us/product/cec1736
[24] Fan Mo, Hamed Haddadi, Kleomenis Katevas, Eduard Marin, Diego Perino, and Nicolas Kourtellis. 2022. Ppfl: Enhancing privacy in federated learning with confidential computing. GetMobile: Mobile Computing and Communications 25, 4 (2022), 35–38
[25] Apoorve Mohan, Mengmei Ye, Hubertus Franke, Mudhakar Srivatsa, Zhuoran Liu, and Nelson Mimura Gonzalez. 2024. Securing AI Inference in the Cloud: Is CPU-GPU Confidential Computing Ready?. In 2024 IEEE 17th International Conference on Cloud Computing (CLOUD). IEEE, 164–175
[26] Rob Nertney. 2025. Remote Attestation for NVIDIA Hopper and Blackwell GPUs, CPUs, and Beyond
[27] NVIDIA. 2024. Are the On-Die Root of Trust and SEC2 security microcontroller physically the same thing? https://forums.developer.nvidia.com/t/are-the-on-die-root-of-trust-and-sec2-security-microcontroller-physically-the-same-thing/307330
[28] NVIDIA. 2025. nvTrust: Ancillary Software for NVIDIA Trusted Computing Solutions. https://github.com/NVIDIA/nvtrust
[29] NVIDIA. 2025. OCSP Service API Documentation. https://docs.attestation.nvidia.com/OCSP/ocsp_api.html. (2025)
[30] NVIDIA. 2025. open-gpu-kernel-modules. https://github.com/NVIDIA/open-gpu-kernel-modules/blob/main/kernel-open/nvidia-uvm/uvm_gpu_non_replayable_faults.c#L38
[31] NVIDIA. 2025. RIM Service API Documentation. https://docs.nvidia.com/attestation/api-docs-rim/latest/index.html. (2025)
[32] Mark Overby. 2023. Attesting NVIDIA GPUs in a confidential computing environment. https://www.youtube.com/watch?v=CpUYXUBdRUA
[33] Do Le Quoc and Christof Fetzer. 2021. Secfl: Confidential federated learning using tees. arXiv preprint arXiv:2110.00981 (2021)
[34] Philip Rogers, Mark Overby, Vyas Venkataraman, Naveen Cherukuri, James Leroy Deming, Gobikrishna Dhanuskodi, Dwayne Swoboda, Lucien Dunning, Aruna Manjunatha, Aaron Jiricek, et al. 2023. Confidential computing using multi-instancing of parallel processors. US Patent App. 18/123,222
[35] Philip Rogers, Mark Overby, Vyas Venkataraman, Naveen Cherukuri, James Leroy Deming, Gobikrishna Dhanuskodi, Dwayne Swoboda, Lucien Dunning, Aruna Manjunatha, Aaron Jiricek, et al. 2023. Confidential computing using parallel processors with code and data protection. US Patent App. 18/185,654
[36] Philip John Rogers, Mark Overby, Michael Asbury Woodmansee, Vyas Venkataraman, Naveen Cherukuri, Gobikrishna Dhanuskodi, Dwayne Frank Swoboda, Lucien Burton Dunning, Mark Hairgrove, Sudeshna Guha, et al. 2025. Implementing trusted executing environments across multiple processor devices. US Patent 12,219,057
[37] Yifan Tan and Zeyu Mi. 2024. Performance Analysis and Optimization of Nvidia H100 Confidential Computing for AI Workloads. In 2024 IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA). IEEE, 1426–1432
[38] Yifan Tan, Cheng Tan, Zeyu Mi, and Haibo Chen. 2024. PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption. arXiv preprint arXiv:2411.03357 (2024)
[39] Kapil Vaswani, Stavros Volos, Cédric Fournet, Antonio Nino Diaz, Ken Gordon, Balaji Vembu, Sam Webster, David Chisnall, Saurabh Kulkarni, Graham Cunningham, et al. 2023. Confidential computing within an AI accelerator. In 2023 USENIX Annual Technical Conference (USENIX ATC 23). 501–518
[40] Stavros Volos, Kapil Vaswani, and Rodrigo Bruno. 2018. Graviton: Trusted execution environments on GPUs. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). 681–696
[41] Chenxu Wang, Fengwei Zhang, Yunjie Deng, Kevin Leach, Jiannong Cao, Zhenyu Ning, Shoumeng Yan, and Zhengyu He. 2024. Cage: Complementing arm cca with gpu extensions. In Network and Distributed System Security (NDSS) Symposium, Vol. 2024
[42] Qifan Wang and David Oswald. 2024. Confidential Computing on Heterogeneous CPU-GPU Systems: Survey and Future Directions. arXiv preprint arXiv:2408.11601 (2024)

A PLATFORM CONFIGURATIONS

Our host system is equipped with dual-socket AMD EPYC 9634 84-core processors, with SEV-SNP enabled, and an 8-GPU NVIDIA H100 SXM5 setup. Each GPU has 80 GB of GPU memory...
