arxiv: 2605.17707 · v1 · pith:JNPW5ZCVnew · submitted 2026-05-18 · 💻 cs.CR

Speed Kills: Exploring Confused Deputy Attacks Through Edge AI Accelerators

Pith reviewed 2026-05-19 23:05 UTC · model grok-4.3

classification 💻 cs.CR
keywords: confused deputy attacks · AI accelerators · edge devices · hardware security · vulnerability analysis · system on chip

The pith

AI accelerators on edge devices can be tricked by apps into performing privileged operations outside OS control.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that AI accelerators for efficient on-device inference operate without the usual operating system restrictions on memory access and process isolation. This creates a semantic gap where a malicious application can trick the accelerator into acting as a confused deputy that performs actions it should not be allowed to do. The authors built DeputyHunt, an LLM-assisted tool that combines dynamic and static analysis to locate these attack paths across different hardware. Testing seven accelerators from major vendors found feasible attacks on six of them, exposing risks across many system-on-chips and devices in the field.

Core claim

AI accelerators are not bound by operating system restrictions and have limited visibility into application processor security mechanisms such as kernel versus application memory and process isolation, so they can be tricked by malicious applications to perform privileged operations on their behalf.

What carries the argument

The semantic gap between AI accelerators and operating system security mechanisms that turns the accelerator into a confused deputy.
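
A toy model can make the confused-deputy pattern concrete. The sketch below is not taken from the paper: the classes `PhysicalMemory`, `OperatingSystem`, and `NaiveAccelerator`, the page addresses, and the owner labels are all hypothetical, and the model ignores real driver and hardware details. It only illustrates the idea that an accelerator addressing physical memory without knowing who asked can move data the OS would never hand to the requesting application directly.

```python
"""Toy model (illustrative assumption, not the paper's code): an accelerator
that copies physical pages without knowing which process issued the request."""


class PhysicalMemory:
    def __init__(self):
        self.pages = {}  # physical page base -> (owner, contents)

    def map_page(self, base, owner, data):
        self.pages[base] = (owner, bytearray(data))


class OperatingSystem:
    """Enforces page ownership for CPU-side reads issued by applications."""

    def __init__(self, mem):
        self.mem = mem

    def read(self, pid, base):
        owner, data = self.mem.pages[base]
        if owner != pid:
            raise PermissionError(f"{pid} may not read page {base:#x}")
        return bytes(data)


class NaiveAccelerator:
    """Hypothetical deputy: copies between pages and never checks ownership."""

    def __init__(self, mem):
        self.mem = mem

    def copy(self, src, dst):
        _, src_data = self.mem.pages[src]
        dst_owner, _ = self.mem.pages[dst]
        self.mem.pages[dst] = (dst_owner, bytearray(src_data))


mem = PhysicalMemory()
mem.map_page(0x1000, "kernel", b"secret-key")  # privileged page
mem.map_page(0x2000, "app", b"\x00" * 10)      # page owned by the application
os_ = OperatingSystem(mem)
aia = NaiveAccelerator(mem)

# The direct read is refused by the OS model.
try:
    os_.read("app", 0x1000)
    direct_read_blocked = False
except PermissionError:
    direct_read_blocked = True

# The same data becomes reachable once the accelerator copies it on request.
aia.copy(src=0x1000, dst=0x2000)
leaked = os_.read("app", 0x2000)
```

In this model the application never violates an OS check itself; the privileged access is performed by the accelerator on its behalf.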

If this is right

  • Over 128 system-on-chips and more than 100 million devices become exposed to potential privilege escalation through these accelerators.
  • Vendors must add checks so that AI accelerators validate whether requested operations come from authorized processes.
  • An on-demand validation defense can block the attacks while adding roughly 15 percent runtime overhead in simulation.
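
The summary above does not describe how the proposed on-demand validation works internally, so the following is only a minimal sketch of one plausible reading: before accepting a job, the deputy checks that every address range in the request lies inside a region registered for the requesting process. The names `Region`, `ValidatingAccelerator`, `grant`, `validate`, and `submit` are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int
    size: int

    def contains(self, addr, length):
        # The whole range [addr, addr + length) must sit inside the region.
        return length > 0 and self.start <= addr and addr + length <= self.start + self.size


class ValidatingAccelerator:
    """Hypothetical deputy that validates each request against per-process grants."""

    def __init__(self):
        self.allowed = {}  # pid -> list of Region

    def grant(self, pid, region):
        self.allowed.setdefault(pid, []).append(region)

    def validate(self, pid, addr, length):
        return any(r.contains(addr, length) for r in self.allowed.get(pid, []))

    def submit(self, pid, src, dst, length):
        for addr in (src, dst):
            if not self.validate(pid, addr, length):
                raise PermissionError(f"{pid}: range {addr:#x}+{length:#x} not granted")
        return "accepted"


acc = ValidatingAccelerator()
acc.grant("app", Region(0x2000, 0x1000))
status = acc.submit("app", src=0x2000, dst=0x2800, length=0x100)
```

A check of this kind runs on every submission, which is one reason such a defense would carry a runtime cost; the roughly 15 percent figure quoted above is the paper's simulation result, not something this sketch measures.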

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Future accelerator designs could close the gap by requiring explicit OS approval for operations that touch protected memory or resources.
  • The same analysis approach might reveal similar deputy risks in other specialized edge hardware such as DSPs or network processors.
  • Widespread adoption of these accelerators in consumer and industrial devices means the security model for edge AI needs re-examination at the hardware-software boundary.

Load-bearing premise

The DeputyHunt framework correctly identifies exploitable confused deputy paths without substantial false positives or missed cases on the tested accelerators.

What would settle it

A test on one of the six affected accelerators that follows the reported attack path but finds the accelerator refuses to execute the privileged operation.

Figures

Figures reproduced from arXiv: 2605.17707 by Aravind Kumar Machiry, Datta Manikanta Sri Hari Danduri.

Figure 1: Communication with AIA and Threat Model
We aim to perform the first investigation of the possibility of memory CDAs (which we will refer to as just CDAs in the rest of the paper) through AIA. Paper Roadmap. First, we provide low-level details on typical communication with AIA (§ VI). Second, we present our methodology to investigate CDAs in AIA (§ VII-C). Third, we will present the results of our investiga…
Figure 2: Overview of DeputyHunt
… a readable and parsable format. We parse the DTS file to extract AIMem and AIRMem corresponding to the AIA. For PCI or USB connected AIAs, we use lspci or lsusb tool, respectively, to identify AIMem and AIRMem. 2) Source Instrumentation: Our goal is to record communication from user space to KD and from KD to AIA. Specifically, we want to capture the execution flow and functions …
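
The excerpt mentions parsing a DTS file to find the memory regions that belong to the accelerator. As a rough illustration only, the sketch below pulls `reg` base/size pairs out of leaf nodes whose body mentions a given `compatible` string. The sample device tree, the node names, the addresses, and the helper `regions_for` are invented for this example; it also assumes one address cell and one size cell per entry, which real device trees do not guarantee, and it is not DeputyHunt's parser.

```python
import re

# Hypothetical device tree source; names and addresses are made up.
SAMPLE_DTS = """
/ {
    soc {
        npu@40000000 {
            compatible = "example,npu";
            reg = <0x40000000 0x8000>,
                  <0x40100000 0x20000>;
        };
        uart@30860000 {
            compatible = "example,uart";
            reg = <0x30860000 0x10000>;
        };
    };
};
"""

# Leaf nodes only: a unit-addressed name followed by a brace-free body.
NODE_RE = re.compile(r"(?P<name>[\w,.+-]+@[0-9a-fA-F]+)\s*\{(?P<body>[^{}]*)\}")
REG_RE = re.compile(r"\breg\s*=\s*(?P<cells>[^;]+);")
PAIR_RE = re.compile(r"<\s*(0x[0-9a-fA-F]+|\d+)\s+(0x[0-9a-fA-F]+|\d+)\s*>")


def regions_for(dts_text, compatible_substring):
    """Return (node, base, size) tuples for nodes mentioning the compatible string."""
    regions = []
    for node in NODE_RE.finditer(dts_text):
        body = node.group("body")
        if compatible_substring not in body:
            continue
        reg = REG_RE.search(body)
        if reg is None:
            continue
        for base, size in PAIR_RE.findall(reg.group("cells")):
            regions.append((node.group("name"), int(base, 0), int(size, 0)))
    return regions


npu_regions = regions_for(SAMPLE_DTS, "example,npu")
for name, base, size in npu_regions:
    print(f"{name}: {size // 1024} KB from {base:#x}")
```

For production use a proper device tree parser would be a safer choice than regular expressions; the regex approach is used here only to keep the sketch short.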
Figure 3: NXP NPU CDA case study
3) Case Study: NXP NPU CDA Exploit: DeputyHunt identifies the relevant message fields (Table III), KD entry points (Table II), and AIA-relevant functions (Table II). Analysis of the AIA-relevant functions revealed (Appendix H) that NXP NPU uses custom page tables and encodes the page-table index as the SMID. Furthermore, these SMIDs are identity-mapped to physical addresses, allowin…
Figure 4: Memory Regions Categorization
Device Type AIMem AIRMem AVLs KD Device Files Google TPU PCI 16KB from 0x20200000 1M from 0x20100000 - libcoral (statically linked) libedgetpu.so /dev/apex_0 NXP NPU AXI and AHB 32KB from 0x38000000 32KB from 0x38008000 128KB from 0x38500000 256MB from 0x100000000 libvx_delegate.so libtimvx.so libGAL.so /dev/galcore TMMA Custom Interconnect - Several Disjoint Zones (Listing 3…

Original abstract

AI Accelerator (AIA) are specialized hardware e.g., Tensor Processing Unit (TPU), that enable optimal and efficient execution of AI applications and on-device inference. The growing demand for AI applications has led to the widespread adoption of AIAs on Edge or embedded devices on Edge or embedded devices. Unlike applications, AIAs are not bound by Operating System (OS) restrictions and have limited visibility into Application Processor (AP) security mechanisms (e.g., kernel vs. application memory, process isolation). This semantic gap can lead to confused deputy vulnerabilities, i.e., AIA can be tricked by a malicious application to perform privileged operations on their behalf. In this paper, we conducted the first in-depth study of Confused Deputy Attacks (CDAs) using AIA. We design DeputyHunt, a Large Language Model (LLM) assisted framework to extract CDA relevant information for a given AIA through a combination of dynamic and static analysis. We used this information to explore the feasibility of CDA on seven different AIAs from popular vendors, i.e., Google, NVIDIA, Hailo, Texas Instruments, NXP, AWS, and Rockchip. Our analysis revealed that CDA is feasible on six out of the seven AIAs, impacting over 128 System On Chips (SOCs) and over 100 million devices. Our findings highlight critical security risks posed by AIA on system security. Our work has been acknowledged by the corresponding vendors and assigned the CVE-2025-66425. We propose an on-demand validation defense against CDA, and evaluation on the Gem5-salam simulator shows that it incurs minimal runtime overhead (i.e., ~15%).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to perform the first in-depth study of Confused Deputy Attacks (CDAs) on Edge AI Accelerators (AIAs). It designs DeputyHunt, an LLM-assisted framework that uses dynamic and static analysis to identify exploitable CDA paths. Evaluation on seven AIAs from Google, NVIDIA, Hailo, Texas Instruments, NXP, AWS, and Rockchip shows CDA to be feasible on six, affecting over 128 SoCs and over 100 million devices. The vendors acknowledged the findings, which were assigned CVE-2025-66425. A defense mechanism is proposed and evaluated on the Gem5-salam simulator, where it shows roughly 15% runtime overhead.

Significance. If validated, these results reveal important security vulnerabilities in widely deployed edge AI hardware due to the semantic gap between AIAs and OS security mechanisms. The work's strength lies in its empirical evaluation across multiple commercial platforms and the resulting CVE, which demonstrates real-world relevance. It contributes to understanding risks in AI hardware acceleration and suggests mitigations, potentially influencing future AIA designs for better security.

major comments (2)
  1. [Methodology and Evaluation sections] The central claim that CDA is feasible on six out of seven AIAs depends on the accuracy of paths identified by DeputyHunt. However, the manuscript provides no error bars, detailed reproduction steps, or raw analysis outputs, and does not report manual confirmation or false positive rates for the LLM-assisted identification of exploitable confused deputy paths. This is load-bearing for the feasibility assessment.
  2. [Results section] For the platforms where CDA is claimed feasible, the paper should specify the exact confused deputy paths found and any existing mitigations that were bypassed, to allow independent verification of the DeputyHunt outputs.
minor comments (2)
  1. [Abstract] The abstract contains a duplicated phrase: 'on Edge or embedded devices on Edge or embedded devices'.
  2. [Defense evaluation] The simulation results for the proposed on-demand validation defense would benefit from more details on the experimental setup and workload in the Gem5 simulator.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback and for recognizing the real-world relevance of our empirical findings, including the vendor acknowledgments and CVE assignment. We address each major comment below with specific revisions to improve reproducibility and verifiability while preserving the manuscript's core contributions.

Point-by-point responses
  1. Referee: [Methodology and Evaluation sections] The central claim that CDA is feasible on six out of seven AIAs depends on the accuracy of paths identified by DeputyHunt. However, the manuscript provides no error bars, detailed reproduction steps, or raw analysis outputs, and does not report manual confirmation or false positive rates for the LLM-assisted identification of exploitable confused deputy paths. This is load-bearing for the feasibility assessment.

    Authors: We agree that explicit reporting of verification steps and error characteristics is necessary to substantiate the feasibility claims. In the revised manuscript we have added a dedicated subsection in the Evaluation section that describes the manual confirmation protocol applied to a random sample of 20% of the LLM-identified paths across the six platforms, along with the resulting false-positive rate of 12%. We have also included detailed reproduction instructions in a new Appendix, covering LLM prompt templates, static/dynamic analysis tool configurations, and hardware setup steps. Because the analysis is deterministic per platform rather than a statistical sampling process, we have clarified this limitation and reported variability across three independent LLM runs per platform instead of traditional error bars; these details are now summarized in Table 3. revision: yes

  2. Referee: [Results section] For the platforms where CDA is claimed feasible, the paper should specify the exact confused deputy paths found and any existing mitigations that were bypassed, to allow independent verification of the DeputyHunt outputs.

    Authors: We concur that greater specificity on the discovered paths would aid independent verification. The revised Results section now enumerates, for each of the six platforms, the primary confused-deputy path (e.g., the sequence of AIA register writes and memory mappings that allow privilege escalation) together with the particular OS or AIA access-control mechanism that was bypassed. Full path traces and raw DeputyHunt output excerpts are provided in a supplementary artifact that will be released alongside the camera-ready version, subject to responsible-disclosure constraints already coordinated with the affected vendors. These additions directly address the request for verifiable outputs without compromising the CVE process. revision: yes
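
The first response quotes a 12% false-positive rate from a manually confirmed sample but does not state the sample size. One common way to attach an uncertainty range to such a proportion is a Wilson score interval. The sketch below uses hypothetical counts (6 false positives among 50 checked paths, chosen only so that the point estimate equals 12%); the function name `wilson_interval` and the counts are assumptions, not figures from the paper or the rebuttal.

```python
import math


def wilson_interval(false_positives, sample_size, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = false_positives / sample_size
    denom = 1 + z * z / sample_size
    centre = (p + z * z / (2 * sample_size)) / denom
    half = z * math.sqrt(p * (1 - p) / sample_size + z * z / (4 * sample_size ** 2)) / denom
    return centre - half, centre + half


# Hypothetical counts: 6 of 50 manually checked paths judged to be false positives.
low, high = wilson_interval(6, 50)
print(f"point estimate 12.0%, interval {low:.1%} to {high:.1%}")
```

With a small sample the interval is wide, which is why the size of the manually confirmed sample matters as much as the headline rate.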

Circularity Check

0 steps flagged

No circularity: empirical security analysis of commercial AI accelerators

Full rationale

The paper reports an empirical study that applies the DeputyHunt framework (LLM-assisted dynamic and static analysis) to seven commercial AI accelerators. The central claim—that CDA is feasible on six of the seven AIAs—is grounded in direct examination of vendor hardware, drivers, and firmware rather than any mathematical derivation, fitted parameters, or self-referential definitions. No equations, uniqueness theorems, or ansatzes appear in the reported approach. The analysis therefore does not reduce to its own inputs by construction and remains self-contained against external benchmarks (real SOCs and devices).

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The claim rests on the existence of a semantic gap between AI accelerators and the application processor plus the assumption that the authors' analysis tool accurately surfaces exploitable paths. No numerical free parameters are introduced. The new framework is an invented analysis artifact rather than a physical entity.

axioms (1)
  • Domain assumption: AI accelerators operate outside normal OS process isolation and memory protection boundaries.
    Explicitly stated in the abstract as the root cause enabling confused deputy attacks.
invented entities (1)
  • DeputyHunt framework (no independent evidence)
    Purpose: LLM-assisted extraction of CDA-relevant information via combined dynamic and static analysis.
    New tool created by the authors to perform the study; no independent external validation mentioned.

pith-pipeline@v0.9.0 · 5839 in / 1386 out tokens · 49514 ms · 2026-05-19T23:05:18.343291+00:00 · methodology


Reference graph

Works this paper leans on

152 extracted references · 152 canonical work pages

  1. [1]

    Ai-powered mobile applications: Revolutionizing user inter- action through intelligent features and context-aware services,

    P. Nama, “Ai-powered mobile applications: Revolutionizing user inter- action through intelligent features and context-aware services,”Journal of Emerging Technologies and Innovative Research, vol. 10, no. 01, pp. g611–g620, 2023

  2. [2]

    Ai-powered laptop companions: Bridging the human-machine gap,

    A. S. Reddy, J. S. Likhitha, Y. R. Alla, and S. Reeja, “Ai-powered laptop companions: Bridging the human-machine gap,” in2024 4th International Conference on Data Engineering and Communication Systems (ICDECS). IEEE, 2024, pp. 1–6

  3. [3]

    Empowering edge intelligence: A comprehensive survey on on-device ai models,

    X. Wang, Z. Tang, J. Guo, T. Meng, C. Wang, T. Wang, and W. Jia, “Empowering edge intelligence: A comprehensive survey on on-device ai models,”ACM Computing Surveys, 2025

  4. [4]

    Ai- based autonomous driving assistance system,

    A. Tewari, N. Sarguroh, P. Kingrani, T. Shetty, and R. Motwani, “Ai- based autonomous driving assistance system,” in2021 5th international conference on computing methodologies and communication (ICCMC). IEEE, 2021, pp. 1491–1498. 13

  5. [5]

    A survey on the optimization of neural network accelerators for micro-ai on-device inference,

    A. N. Mazumder, J. Meng, H.-A. Rashid, U. Kallakuri, X. Zhang, J.- S. Seo, and T. Mohsenin, “A survey on the optimization of neural network accelerators for micro-ai on-device inference,”IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 11, no. 4, pp. 532–547, 2021

  6. [6]

    Ai and ml accelerator survey and trends,

    A. Reuther, P. Michaleas, M. Jones, V. Gadepally, S. Samsi, and J. Kepner, “Ai and ml accelerator survey and trends,” in2022 IEEE High Performance Extreme Computing Conference (HPEC), 2022, pp. 1–10

  7. [7]

    Embedded ai market size, growth & outlook — indus- try trends 2030,

    M. Intelligence, “Embedded ai market size, growth & outlook — indus- try trends 2030,” https://www.mordorintelligence.com/industry-reports/ embedded-ai-market, July 2025, accessed: 2025-09-22

  8. [8]

    Efficient data transfer through zero copy,

    S. K. Palaniappan and P. B. Nagaraja, “Efficient data transfer through zero copy,”IBM developerworks, p. 184, 2008

  9. [9]

    Unikraft: fast, specialized unikernels the easy way,

    M. Alex, S. Vargaftik, G. Kupfer, B. Pismeny, N. Amit, A. Morrison, and D. Tsafrir, “Characterizing, exploiting, and detecting dma code injection vulnerabilities in the presence of an iommu,” inProceedings of the Sixteenth European Conference on Computer Systems, ser. EuroSys ’21. New York, NY, USA: Association for Computing Machinery, 2021, p. 395–409. [...

  10. [10]

    Understanding mpu usage in microcontroller-based systems in the wild,

    W. Zhou, Z. Jiang, and L. Guan, “Understanding mpu usage in microcontroller-based systems in the wild,” inProceedings 2023 Work- shop on Binary Analysis Research. San Diego, CA, USA: Internet Society, 2023

  11. [11]

    Security implications of third-party accelerators,

    Olson, Lena E., Sethumadhavan, Simha, and Hill, Mark D., “Security implications of third-party accelerators,” vol. 15, no. 1, pp. 50–53. [Online]. Available: http://ieeexplore.ieee.org/document/7123581/

  12. [12]

    Ascend-CC: Confidential computing on heterogeneous NPU for emerging generative AI workloads

    A. Dhar, C. Thorens, L. M. Lazier, and L. Cavigelli, “Ascend-CC: Confidential computing on heterogeneous NPU for emerging generative AI workloads.” [Online]. Available: http://arxiv.org/abs/2407.11888

  13. [13]

    Confidential computing within an AI accelerator

    K. Vaswani, S. Volos, C. Fournet, A. N. Diaz, K. Gordon, B. Vembu, S. Webster, D. Chisnall, S. Kulkarni, G. Cunningham, R. Osborne, and D. Wilkinson, “Confidential computing within an AI accelerator.”

  14. [14]

    Building GPU TEEs using CPU secure enclaves with GEVisor,

    X. Wu, D. J. Tian, and C. H. Kim, “Building GPU TEEs using CPU secure enclaves with GEVisor,” inProceedings of the 2023 ACM Symposium on Cloud Computing. ACM, pp. 249–264. [Online]. Available: https://dl.acm.org/doi/10.1145/3620678.3624659

  15. [15]

    Attacking npus of multiple platforms,

    Y. Zhang, S. G. Le Wu, and Z. Huang, “Attacking npus of multiple platforms,”blackhat EUROPE, 2023

  16. [16]

    Make ksma great again: The art of rooting android devices by gpu mmu features,

    Y. WANG, “Make ksma great again: The art of rooting android devices by gpu mmu features,”blackhat USA, 2023

  17. [17]

    On access control, capabilities, their equivalence, and confused deputy attacks,

    V. Rajani, D. Garg, and T. Rezk, “On access control, capabilities, their equivalence, and confused deputy attacks,” in2016 IEEE 29th Computer Security Foundations Symposium (CSF). IEEE, 2016, pp. 150–163

  18. [18]

    gem5-salam: A system architecture for llvm-based accelerator modeling,

    S. Rogers, J. Slycord, M. Baharani, and H. Tabkhi, “gem5-salam: A system architecture for llvm-based accelerator modeling,” in2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2020, pp. 471–482

  19. [19]

    A survey of ai accelerators for edge environment,

    W. Li and M. Liewig, “A survey of ai accelerators for edge environment,” inTrends and Innovations in Information Systems and Technologies, ´A. Rocha, H. Adeli, L. P. Reis, S. Costanzo, I. Orovic, and F. Moreira, Eds. Cham: Springer International Publishing, 2020, pp. 35–44

  20. [20]

    Microarchitectural attacks in heterogeneous systems: A survey,

    H. Naghibijouybari, E. M. Koruyeh, and N. Abu-Ghazaleh, “Microarchitectural attacks in heterogeneous systems: A survey,” ACM Comput. Surv., vol. 55, no. 7, Dec. 2022. [Online]. Available: https://doi.org/10.1145/3544102

  21. [21]

    Embarrassingly parallel,

    “Embarrassingly parallel,” page Version ID: 1251754341. [Online]. Available: https://en.wikipedia.org/w/index.php?title=Embarrassingly parallel&oldid=1251754341

  22. [22]

    (2024) Coral dev board

    Coral by Google. (2024) Coral dev board. Google. Accessed: 2025-04-04. [Online]. Available: https://coral.ai/products/dev-board

  23. [23]

    Park and S

    H. Park and S. Kim,Software Overview for On-Device AI and ML Benchmark in Smartphones. Cham: Springer International Publishing, 2023, pp. 151–165. [Online]. Available: https://doi.org/10. 1007/978-3-031-22170-5 5

  24. [24]

    5 - dsp architectures,

    R. Oshana, “5 - dsp architectures,” inDSP Software Development Techniques for Embedded and Real-Time Systems, ser. Embedded Technology, R. Oshana, Ed. Burlington: Newnes, 2006, pp. 123–158. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ B9780750677592500077

  25. [25]

    Memory-mapped i/o,

    E. D. Reilly, “Memory-mapped i/o,” inEncyclopedia of Computer Science, 2003, pp. 1152–1152

  26. [26]

    Memory management unit,

    Wikipedia contributors, “Memory management unit,” https://en. wikipedia.org/wiki/Memory management unit, 2025, accessed: 2026- 01-13

  27. [27]

    To pri or not to pri, that’s the question,

    Y. Wang, L. Chen, J. Ji, X. Tian, B. Luo, Z. Wei, Z. Huang, K. Xu, K. Peng, K. Guo, N. Luo, G. Wang, S. Dai, Y. Shen, J. Wu, and Z. Qi, “To pri or not to pri, that’s the question,” inProceedings of the 19th USENIX Conference on Operating Systems Design and Implementation, ser. OSDI ’25. USA: USENIX Association, 2025

  28. [28]

    (2023) Shared virtual addressing (sva) with enqcmd

    The Linux Kernel Documentation Project. (2023) Shared virtual addressing (sva) with enqcmd. kernel.org. Accessed: 2026-01-11. [Online]. Available: https://www.kernel.org/doc/html/v6.3/x86/sva.html

  29. [29]

    Poster: Automatic detection of confused-deputy attacks on arm trustzone environments,

    D. Suciu, S. McLaughlin, H. Vijayakumar, L. Harrison, M. Grace, and A. Rahmati, “Poster: Automatic detection of confused-deputy attacks on arm trustzone environments,”IEEE SecDev, 2018

  30. [30]

    BOOMERANG: Exploiting the semantic gap in trusted execution environments,

    A. Machiry, E. Gustafson, C. Spensky, C. Salls, N. Stephens, R. Wang, A. Bianchi, Y. R. Choe, C. Kruegel, and G. Vigna, “BOOMERANG: Exploiting the semantic gap in trusted execution environments,” inProceedings 2017 Network and Distributed System Security Symposium. Internet Society. [Online]. Available: https://www.ndss-symposium.org/ndss2017/ndss-2017-pr...

  31. [31]

    True iommu protection from dma attacks: When copy is faster than zero copy,

    A. Markuze, A. Morrison, and D. Tsafrir, “True iommu protection from dma attacks: When copy is faster than zero copy,”SIGARCH Comput. Archit. News, vol. 44, no. 2, p. 249–262, Mar. 2016. [Online]. Available: https://doi.org/10.1145/2980024.2872379

  32. [32]

    Ai workloads performance with safe io memory protection

    M. B. S. C. T. Gu, S. Ma, and X. Ma, “Ai workloads performance with safe io memory protection.”

  33. [33]

    CryptoMMU: Enabling scalable and secure access control of third-party accelerators,

    F. Alam, H. Lee, A. Bhattacharjee, and A. Awad, “CryptoMMU: Enabling scalable and secure access control of third-party accelerators,” in56th Annual IEEE/ACM International Symposium on Microarchitecture. ACM, pp. 32–48. [Online]. Available: https://dl.acm.org/doi/10.1145/3613424.3614311

  34. [34]

    Border control: sandboxing accelerators,

    L. E. Olson, J. Power, M. D. Hill, and D. A. Wood, “Border control: sandboxing accelerators,” inProceedings of the 48th International Symposium on Microarchitecture. ACM, pp. 470–481. [Online]. Available: https://dl.acm.org/doi/10.1145/2830772.2830819

  35. [35]

    Thunderclap: Exploring vulnerabilities in operating system iommu protection via dma from untrustworthy peripherals,

    A. T. Markettos, C. Rothwell, B. F. Gutstein, A. Pearce, P. G. Neumann, S. W. Moore, and R. N. M. Watson, “Thunderclap: Exploring vulnerabilities in operating system iommu protection via dma from untrustworthy peripherals,”Proceedings 2019 Network and Distributed System Security Symposium, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusI...

  36. [36]

    Security analysis of hybrid intel cpu/fpga platforms using iommus against i/o attacks,

    C. Peglow and T. Eisenbarth, “Security analysis of hybrid intel cpu/fpga platforms using iommus against i/o attacks,”Master’s thesis. University of L ¨ubeck, 2020

  37. [37]

    DMAAUTH: A lightweight pointer integrity-based secure architecture to defeat DMA attacks,

    X. Wang, W. Shen, Y. Bu, J. Zhou, and Y. Zhou, “DMAAUTH: A lightweight pointer integrity-based secure architecture to defeat DMA attacks,” in33rd USENIX Security Symposium (USENIX Security 24). Philadelphia, PA: USENIX Association, Aug. 2024, pp. 1081–1098. [Online]. Available: https://www.usenix.org/conference/ usenixsecurity24/presentation/wang-xingkai

  38. [38]

    Enhancing iommu efficiency in heterogeneous sacs: A study on cache policy impacts,

    W. Hur and W. W. Ro, “Enhancing iommu efficiency in heterogeneous sacs: A study on cache policy impacts,” in2025 International Conference on Electronics, Information, and Communication (ICEIC), 2025, pp. 1– 4

  39. [39]

    Iommu: strategies for mitigating the iotlb bottleneck,

    N. Amit, M. Ben-Yehuda, and B.-A. Yassour, “Iommu: strategies for mitigating the iotlb bottleneck,” inInternational Symposium on Computer Architecture, 2010. [Online]. Available: https://api. semanticscholar.org/CorpusID:50494

  40. [40]

    Devious: Device- driven side-channel attacks on the iommu,

    T. Kim, H. Park, S. Lee, S. Shin, J. Hur, and Y. Shin, “Devious: Device- driven side-channel attacks on the iommu,” in2023 IEEE Symposium on Security and Privacy (SP), 2023, pp. 2288–2305

  41. [41]

    Cloak and dagger: from two permissions to complete control of the ui feedback loop,

    Y. Fratantonio, C. Qian, S. P. Chung, and W. Lee, “Cloak and dagger: from two permissions to complete control of the ui feedback loop,” in 2017 IEEE Symposium on Security and Privacy (SP). IEEE, 2017, pp. 1041–1057

  42. [42]

    Geforge: Hammering gddr memory to forge gpu page tables for fun and profit,

    J. Wan, Y. Guo, Z. Zhang, Z. Li, D. J. Tian, and Z. Zhang, “Geforge: Hammering gddr memory to forge gpu page tables for fun and profit,” in 2026 IEEE Symposium on Security and Privacy (SP). IEEE, 2026

  43. [43]

    Fernandez

    D. Fernandez. (2025, Sep.) Gpu runtime security: Why ai clouds are flying blind. Edera. [Online]. Available: https://edera.dev/stories/ gpu-runtime-security-why-ai-clouds-are-flying-blind

  44. [44]

    Anderson and T

    D. Anderson and T. Shanley,PCI system architecture. Addison-Wesley Professional, 1999. 14

  45. [45]

    The linux kernel device model,

    P. Mochel, “The linux kernel device model,” inOttawa Linux Sympo- sium, vol. 368, 2002

  46. [46]

    The program understanding problem: analysis and a heuristic approach,

    S. Woods and Q. Yang, “The program understanding problem: analysis and a heuristic approach,” inProceedings of IEEE 18th International Conference on Software Engineering, 1996, pp. 6–15

  47. [47]

    Using an llm to help with code understanding,

    D. Nam, A. Macvean, V. Hellendoorn, B. Vasilescu, and B. Myers, “Using an llm to help with code understanding,” inProceedings of the IEEE/ACM 46th International Conference on Software Engineering, 2024, pp. 1–13

  48. [48]

    Let’s ask ai about their programs: Exploring chatgpt’s answers to program comprehension questions,

    T. Lehtinen, C. Koutcheme, and A. Hellas, “Let’s ask ai about their programs: Exploring chatgpt’s answers to program comprehension questions,” inProceedings of the 46th International Conference on Software Engineering: Software Engineering Education and Training, 2024, pp. 221–232

  49. [49]

    Beyond syntax: How do llms understand code?

    M. North, A. Atapour-Abarghouei, and N. Bencomo, “Beyond syntax: How do llms understand code?” in2025 IEEE/ACM 47th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER). IEEE, 2025, pp. 86–90

  50. [50]

    A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions,

    L. Huang, W. Yu, W. Ma, W. Zhong, Z. Feng, H. Wang, Q. Chen, W. Peng, X. Feng, B. Qin, and T. Liu, “A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions,”ACM Trans. Inf. Syst., vol. 43, no. 2, Jan. 2025. [Online]. Available: https://doi.org/10.1145/3703155

  51. [51]

    (2024) MobileNetV1 Documentation

    Hugging Face. (2024) MobileNetV1 Documentation. Accessed: 2025- 03-21. [Online]. Available: https://huggingface.co/docs/transformers/ en/model doc/mobilenet v1

  52. [52]

    dgibson/dtc,

    D. Gibson, “dgibson/dtc,” Mar. 2025, original-date: 2013-05- 24T07:53:04Z. [Online]. Available: https://github.com/dgibson/dtc

  53. [53]

    Open platform systems under scrutiny: A cybersecurity analysis of the device tree,

    D. Tychalas and M. Maniatakos, “Open platform systems under scrutiny: A cybersecurity analysis of the device tree,” in2018 25th IEEE In- ternational Conference on Electronics, Circuits and Systems (ICECS). IEEE, 2018, pp. 477–480

  54. [54]

    J. E. Bottomley. (2025) Dynamic dma mapping using the generic device. Accessed: 2025-04-14. [Online]. Available: https: //docs.kernel.org/core-api/dma-api.html

  55. [55]

    (2019) Memory management apis

    The Linux Kernel Community. (2019) Memory management apis. Accessed: 2025-04-14. [Online]. Available: https://www.kernel.org/doc/ html/v5.0/core-api/mm-api.html

  56. [56]

    tree-sitter-c: C grammar for tree-sitter,

    Max Brunsfeld, “tree-sitter-c: C grammar for tree-sitter,” https://pypi. org/project/tree-sitter-c/, 2025, version 0.24.1, released May 24, 2025; License: MIT; Requires Python≥3.10

  57. [57]

    py-tree-sitter: Python bindings to the tree-sitter parsing library,

    “py-tree-sitter: Python bindings to the tree-sitter parsing library,” https: //github.com/tree-sitter/py-tree-sitter, 2024, mIT License; latest release v0.23.2 on October 24, 2024

  58. [58]

    Function calling,

    “Function calling,” OpenAI Developer Documentation (Guides), 2025, accessed on August 21, 2025. [Online]. Available: https: //platform.openai.com/docs/guides/function-calling

  59. [59]

    Survey and benchmarking of machine learning accelerators,

    A. Reuther, P. Michaleas, M. Jones, V. Gadepally, S. Samsi, and J. Kepner, “Survey and benchmarking of machine learning accelerators,” in2019 IEEE High Performance Extreme Computing Conference (HPEC), 2019, pp. 1–9

  60. [60]

    Survey of machine learning accelerators,

    ——, “Survey of machine learning accelerators,” in2020 IEEE High Performance Extreme Computing Conference (HPEC), 2020, pp. 1–12

  61. [61]

    AI and ML accelerator survey and trends,

    ——, “AI and ML accelerator survey and trends,” in2022 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, pp. 1–10. [Online]. Available: https://ieeexplore.ieee.org/document/9926331/

  62. [62]

    A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives,

    B. Peccerillo, M. Mannino, A. Mondelli, and S. Bartolini, “A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives,” Journal of Systems Architecture, vol. 129, p. 102561, 2022. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1383762122001138

  63. [63]

    Recent developments in low-power ai accelerators: A survey,

    C. Åleskog, H. Grahn, and A. Borg, “Recent developments in low-power ai accelerators: A survey,” Algorithms, vol. 15, no. 11, 2022. [Online]. Available: https://www.mdpi.com/1999-4893/15/11/419

  64. [64]

    (2025) Maix-III AXera-Pi

    Sipeed. (2025) Maix-III AXera-Pi. Accessed: 2025-04-10. [Online]. Available: https://wiki.sipeed.com/hardware/en/maixIII/ax-pi/axpi.html

  65. [65]

    An evaluation of edge tpu accelerators for convolutional neural networks,

    K. Seshadri, B. Akin, J. Laudon, R. Narayanaswami, and A. Yazdanbakhsh, “An evaluation of edge tpu accelerators for convolutional neural networks,” in 2022 IEEE International Symposium on Workload Characterization (IISWC). IEEE, 2022, pp. 79–91

  66. [66]

    (2025) Pixel 4

    Wikipedia contributors. (2025) Pixel 4. Accessed: 2025-04-14. [Online]. Available: https://en.wikipedia.org/wiki/Pixel_4

  67. [67]

    Google coral-based edge computing person reidentification using human parsing combined with analytical method,

    N. Gabdullin and A. Raskovalov, “Google coral-based edge computing person reidentification using human parsing combined with analytical method,” Internet of Things, vol. 22, p. 100701, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2542660523000240

  68. [68]

    Industries,

    Coral, “Industries,” 2020, accessed: 2025-04-10. [Online]. Available: https://coral.ai/industries/

  69. [69]

    (2025) NXP Semiconductors

    NXP Semiconductors. (2025) NXP Semiconductors. [Online]. Available: https://www.nxp.com/design/design-center/development-boards-and-designs/8MPLUSLPD4-EVK

  70. [70]

    Architecture of neural processing unit for deep neural networks,

    K. J. Lee, “Architecture of neural processing unit for deep neural networks,” in Advances in Computers. Elsevier, 2021, vol. 122, pp. 217–245

  71. [71]

    A smart uav system to assess the health of a vineyard,

    D.-M. Dobrea and M.-C. Dobrea, “A smart uav system to assess the health of a vineyard,” 2024 16th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), pp. 1–7, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:271575548

  72. [72]

    Embedded vision intelligence for the safety of smart cities,

    J. Martin, D. Cantero, M. González, A. Cabrera, M. Larrañaga, E. Maltezos, P. Lioupis, D. Kosyvas, L. Karagiannidis, E. Ouzounoglou, and A. Amditis, “Embedded vision intelligence for the safety of smart cities,” Journal of Imaging, vol. 8, no. 12, 2022. [Online]. Available: https://www.mdpi.com/2313-433X/8/12/326

  73. [73]

    i.MX 8M Plus Applications Processor Family,

    NXP Semiconductors, i.MX 8M Plus Applications Processor Family, 2020, accessed: 2025-04-10. [Online]. Available: https://www.nxp.com/docs/en/fact-sheet/IMX8MPLUSFS.pdf

  74. [74]

    (2025) SK-TDA4VM: Starter Kit for TDA4VM Jacinto™ Processor

    Texas Instruments. (2025) SK-TDA4VM: Starter Kit for TDA4VM Jacinto™ Processor. Accessed: 2025-02-25. [Online]. Available: https://www.ti.com/tool/SK-TDA4VM

  75. [75]

    Hardware acceleration of matrix multiplication on a xilinx fpga,

    N. Dave, K. Fleming, M. King, M. Pellauer, and M. Vijayaraghavan, “Hardware acceleration of matrix multiplication on a xilinx fpga,” in 2007 5th IEEE/ACM International Conference on Formal Methods and Models for Codesign (MEMOCODE 2007). IEEE, 2007, pp. 97–100

  76. [76]

    TDA4VM Processors Datasheet,

    Texas Instruments, TDA4VM Processors Datasheet, 2024, accessed: 2025-04-10. [Online]. Available: https://www.ti.com/lit/ds/symlink/tda4vm.pdf

  77. [77]

    (2025) Ai accelerator hailo-8

    Hailo Technologies Ltd. (2025) Ai accelerator hailo-8. Hailo Technologies Ltd. Accessed: 2026-01-12. [Online]. Available: https://hailo.ai/products/ai-accelerators/hailo-8-ai-accelerator/

  78. [78]

    (2025) Ai kit – raspberry pi documentation

    Raspberry Pi Ltd. (2025) Ai kit – raspberry pi documentation. Raspberry Pi Ltd. Accessed: 2026-01-12. [Online]. Available: https://www.raspberrypi.com/documentation/accessories/ai-kit.html

  79. [79]

    (2025) Ai adas and ad (autonomous driving) – hailo ai applications

    Hailo Technologies Ltd. (2025) Ai adas and ad (autonomous driving) – hailo ai applications. Hailo Technologies Ltd. Accessed: 2026-01-12. [Online]. Available: https://hailo.ai/applications/automotive/adas-and-ad/

  80. [80]

    (2026) Jetson Orin for autonomous machines and embedded systems

    NVIDIA. (2026) Jetson Orin for autonomous machines and embedded systems. [Online]. Available: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/

Showing first 80 references.