Speed Kills: Exploring Confused Deputy Attacks Through Edge AI Accelerators

Aravind Kumar Machiry; Datta Manikanta Sri Hari Danduri

arxiv: 2605.17707 · v1 · pith:JNPW5ZCVnew · submitted 2026-05-18 · 💻 cs.CR

Pith reviewed 2026-05-19 23:05 UTC · model grok-4.3

keywords: confused deputy attacks · AI accelerators · edge devices · hardware security · vulnerability analysis · system on chip

The pith

AI accelerators on edge devices can be tricked by apps into performing privileged operations outside OS control.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that AI accelerators for efficient on-device inference operate without the usual operating system restrictions on memory access and process isolation. This creates a semantic gap where a malicious application can trick the accelerator into acting as a confused deputy that performs actions it should not be allowed to do. The authors built DeputyHunt, an LLM-assisted tool that combines dynamic and static analysis to locate these attack paths across different hardware. Testing seven accelerators from major vendors found feasible attacks on six of them, exposing risks across many system-on-chips and devices in the field.

Core claim

AI accelerators are not bound by operating system restrictions and have limited visibility into application processor security mechanisms such as kernel versus application memory and process isolation, so they can be tricked by malicious applications to perform privileged operations on their behalf.
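
The core claim can be made concrete with a toy model. The sketch below is illustrative only: the `Memory`, `OS`, and `Accelerator` classes, the addresses, and the `copy` operation are hypothetical and do not correspond to any vendor's driver or hardware interface described in the paper. It assumes an OS that checks ownership on CPU-side reads and an accelerator that copies memory without any notion of ownership.

```python
"""Toy model of a confused deputy; not any vendor's actual interface."""

KERNEL, APP = "kernel", "app"


class Memory:
    def __init__(self):
        # address -> (owner, value); ownership is what the OS tracks
        self.cells = {
            0x1000: (APP, "app data"),
            0x9000: (KERNEL, "kernel secret"),
        }


class OS:
    """CPU-side access control: the app may only read its own memory."""

    def __init__(self, memory):
        self.memory = memory

    def app_read(self, addr):
        owner, value = self.memory.cells[addr]
        if owner != APP:
            raise PermissionError(f"app may not read {owner} memory at {addr:#x}")
        return value


class Accelerator:
    """Deputy with direct memory access and no view of ownership."""

    def __init__(self, memory):
        self.memory = memory

    def copy(self, src, dst):
        # No ownership check: this missing check is the semantic gap.
        dst_owner = self.memory.cells[dst][0]
        _, value = self.memory.cells[src]
        self.memory.cells[dst] = (dst_owner, value)


def demo():
    mem = Memory()
    os_, aia = OS(mem), Accelerator(mem)
    try:
        os_.app_read(0x9000)  # direct read of kernel memory
        direct = "allowed"
    except PermissionError:
        direct = "denied"
    aia.copy(src=0x9000, dst=0x1000)  # the app asks the deputy instead
    leaked = os_.app_read(0x1000)     # app memory now holds kernel data
    return direct, leaked
```

In this model the direct read is refused, yet the same data reaches the application through the accelerator, which is the pattern the core claim describes.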

What carries the argument

The semantic gap between AI accelerators and operating system security mechanisms that turns the accelerator into a confused deputy.

If this is right

Over 128 system-on-chips and more than 100 million devices become exposed to potential privilege escalation through these accelerators.
Vendors must add checks so that AI accelerators validate whether requested operations come from authorized processes.
An on-demand validation defense can block the attacks while adding roughly 15 percent runtime overhead in simulation.
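
The idea of validating requests can be sketched as follows. This is a minimal illustration of the general principle, not the defense the authors evaluated: the summary here does not describe their mechanism, so the `Validator` class, the `submit_job` function, and the choice to check at job submission time are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int
    size: int

    def contains(self, addr, length):
        # True only if [addr, addr + length) lies entirely inside the region.
        return length > 0 and self.start <= addr and addr + length <= self.start + self.size


class Validator:
    """Hypothetical record of which memory regions each process may hand to the accelerator."""

    def __init__(self):
        self._granted = {}  # pid -> list of Region

    def grant(self, pid, start, size):
        self._granted.setdefault(pid, []).append(Region(start, size))

    def allowed(self, pid, addr, length):
        return any(r.contains(addr, length) for r in self._granted.get(pid, []))


def submit_job(validator, pid, src, dst, length):
    """Reject the job unless both buffers lie in regions granted to pid."""
    for addr in (src, dst):
        if not validator.allowed(pid, addr, length):
            raise PermissionError(f"pid {pid} may not use {addr:#x}+{length:#x}")
    return "submitted"
```

A buffer that spans two adjacent granted regions is rejected in this sketch, which errs on the side of refusing requests; a real design would need to decide how to treat such cases.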

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Future accelerator designs could close the gap by requiring explicit OS approval for operations that touch protected memory or resources.
The same analysis approach might reveal similar deputy risks in other specialized edge hardware such as DSPs or network processors.
Widespread adoption of these accelerators in consumer and industrial devices means the security model for edge AI needs re-examination at the hardware-software boundary.

Load-bearing premise

The DeputyHunt framework correctly identifies exploitable confused deputy paths without substantial false positives or missed cases on the tested accelerators.

What would settle it

A test on one of the six affected accelerators that follows the reported attack path but finds the accelerator refuses to execute the privileged operation.

Figures

Figures reproduced from arXiv: 2605.17707 by Aravind Kumar Machiry, Datta Manikanta Sri Hari Danduri.

**Figure 1.** Communication with AIA and Threat Model. We aim to perform the first investigation of the possibility of memory CDAs (which we will refer to as just CDAs in the rest of the paper) through AIA. Paper Roadmap. First, we provide low-level details on typical communication with AIA (§ VI). Second, we present our methodology to investigate CDAs in AIA (§ VII-C). Third, we will present the results of our investiga…

**Figure 2.** Overview of DeputyHunt. … a readable and parsable format. We parse the DTS file to extract AIMem and AIRMem corresponding to the AIA. For PCI or USB connected AIAs, we use lspci or lsusb tool, respectively, to identify AIMem and AIRMem. 2) Source Instrumentation: Our goal is to record communication from user space to KD and from KD to AIA. Specifically, we want to capture the execution flow and functions …

**Figure 3.** NXP NPU CDA case study. 3) Case Study: NXP NPU CDA Exploit: DeputyHunt identifies the relevant message fields (Table III), KD entry points (Table II), and AIA-relevant functions (Table II). Analysis of the AIA-relevant functions revealed (Appendix H) that NXP NPU uses custom page tables and encodes the page-table index as the SMID. Furthermore, these SMIDs are identity-mapped to physical addresses, allowin…

**Figure 4.** Memory Regions Categorization. Device Type AIMem AIRMem AVLs KD Device Files Google TPU PCI 16KB from 0x20200000 1M from 0x20100000 - libcoral (statically linked) libedgetpu.so /dev/apex_0 NXP NPU AXI and AHB 32KB from 0x38000000 32KB from 0x38008000 128KB from 0x38500000 256MB from 0x100000000 libvx_delegate.so libtimvx.so libGAL.so /dev/galcore TMMA Custom Interconnect - Several Disjoint Zones (Listing 3…
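
The Figure 2 excerpt mentions parsing a DTS file to find the accelerator's memory regions. A rough sketch of that step is given below. It is not DeputyHunt's code: the node name `npu@38000000` and the `compatible` string are hypothetical, the two addresses are borrowed from the NXP NPU row in Figure 4 purely as sample values, and the parser assumes a flat node with `#address-cells = <1>` and `#size-cells = <1>`. Real device trees may use two cells per address or nested nodes, which this sketch does not handle.

```python
import re

# Hypothetical DTS fragment; addresses reused from Figure 4 as sample values only.
DTS_SNIPPET = """
npu@38000000 {
    compatible = "example,npu";
    reg = <0x38000000 0x8000>, <0x38008000 0x8000>;
};
"""


def parse_reg(dts_text, node_name):
    """Return [(base, size), ...] from the `reg` property of a flat DTS node.

    Assumes one 32-bit cell for the address and one for the size.
    """
    node = re.search(re.escape(node_name) + r"\s*\{(.*?)\}\s*;", dts_text, re.S)
    if node is None:
        raise ValueError(f"node {node_name!r} not found")
    reg = re.search(r"\breg\s*=\s*([^;]+);", node.group(1))
    if reg is None:
        raise ValueError(f"node {node_name!r} has no reg property")
    # Collect every numeric cell, hexadecimal or decimal.
    cells = [int(c, 0) for c in re.findall(r"0x[0-9a-fA-F]+|\d+", reg.group(1))]
    if len(cells) % 2:
        raise ValueError("odd number of cells; address/size cell counts may differ")
    return list(zip(cells[0::2], cells[1::2]))


if __name__ == "__main__":
    for base, size in parse_reg(DTS_SNIPPET, "npu@38000000"):
        print(f"{size // 1024}KB from {base:#x}")
```

For production use, a proper device tree library that honors the parent node's `#address-cells` and `#size-cells` would be a safer choice than regular expressions.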

Original abstract

AI Accelerator (AIA) are specialized hardware e.g., Tensor Processing Unit (TPU), that enable optimal and efficient execution of AI applications and on-device inference. The growing demand for AI applications has led to the widespread adoption of AIAs on Edge or embedded devices on Edge or embedded devices. Unlike applications, AIAs are not bound by Operating System (OS) restrictions and have limited visibility into Application Processor (AP) security mechanisms (e.g., kernel vs. application memory, process isolation). This semantic gap can lead to confused deputy vulnerabilities, i.e., AIA can be tricked by a malicious application to perform privileged operations on their behalf. In this paper, we conducted the first in-depth study of Confused Deputy Attacks (CDAs) using AIA. We design DeputyHunt, a Large Language Model (LLM) assisted framework to extract CDA relevant information for a given AIA through a combination of dynamic and static analysis. We used this information to explore the feasibility of CDA on seven different AIAs from popular vendors, i.e., Google, NVIDIA, Hailo, Texas Instruments, NXP, AWS, and Rockchip. Our analysis revealed that CDA is feasible on six out of the seven AIAs, impacting over 128 System On Chips (SOCs) and over 100 million devices. Our findings highlight critical security risks posed by AIA on system security. Our work has been acknowledged by the corresponding vendors and assigned the CVE-2025-66425. We propose an on-demand validation defense against CDA, and evaluation on the Gem5-salam simulator shows that it incurs minimal runtime overhead (i.e., ~15%).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper shows confused deputy attacks are feasible on six of seven commercial AI accelerators using an LLM-assisted analysis framework, with vendor CVE acknowledgment as supporting evidence.

The main takeaway is that this work finds concrete ways untrusted apps could misuse AI accelerators on edge devices to escalate privileges, and the authors report this holds on most of the platforms they checked with some external confirmation from vendors. They built DeputyHunt to pull out relevant details from hardware interfaces by mixing static analysis, dynamic runs, and LLM assistance, then applied it to seven accelerators from Google, NVIDIA, Hailo, Texas Instruments, NXP, AWS, and Rockchip. The result is a claim of feasibility on six, touching over 128 SOCs and 100 million devices, plus a proposed on-demand validation defense that adds roughly 15% overhead in simulation.

This is new as a targeted study of confused deputy issues on AI accelerators rather than general hardware. The empirical focus on real commercial platforms and the CVE assignment give the findings some weight beyond pure speculation. The approach avoids heavy self-referential math and instead relies on direct hardware checks, which is a plus for reproducibility in principle.

The soft spot is the heavy use of LLMs in the analysis pipeline, which can misread driver or firmware details and create false paths that look exploitable but are not. The abstract and reported results give limited reproduction steps or raw outputs, so it is hard to judge how much manual confirmation happened after the automated steps. If the identified paths were not all validated by hand or with open artifacts, the feasibility numbers could be overstated.

This is worth attention for researchers working on embedded AI security or hardware isolation. A reader who cares about attack surfaces in accelerators would find the platform coverage and defense idea useful even if the exact attack details need more scrutiny. I would send it to peer review because the scale of the claimed impact and the multi-vendor testing make it worth referee time, though the authors should add clearer validation evidence for the LLM outputs.

Referee Report

2 major / 2 minor

Summary. The paper claims to present the first in-depth study of Confused Deputy Attacks (CDAs) on edge AI Accelerators (AIAs). It designs DeputyHunt, an LLM-assisted framework that uses dynamic and static analysis to identify exploitable CDA paths. Evaluation on seven AIAs from Google, NVIDIA, Hailo, Texas Instruments, NXP, AWS, and Rockchip shows that CDAs are feasible on six, affecting over 128 SoCs and 100 million devices. Vendors acknowledged the findings, which were assigned CVE-2025-66425. A defense mechanism is proposed and evaluated on the Gem5-salam simulator, showing ~15% runtime overhead.

Significance. If validated, these results reveal important security vulnerabilities in widely deployed edge AI hardware due to the semantic gap between AIAs and OS security mechanisms. The work's strength lies in its empirical evaluation across multiple commercial platforms and the resulting CVE, which demonstrates real-world relevance. It contributes to understanding risks in AI hardware acceleration and suggests mitigations, potentially influencing future AIA designs for better security.

major comments (2)

[Methodology and Evaluation sections] The central claim that CDA is feasible on six out of seven AIAs depends on the accuracy of paths identified by DeputyHunt. However, the manuscript provides no error bars, detailed reproduction steps, or raw analysis outputs, and does not report manual confirmation or false positive rates for the LLM-assisted identification of exploitable confused deputy paths. This is load-bearing for the feasibility assessment.
[Results section] For the platforms where CDA is claimed feasible, the paper should specify the exact confused deputy paths found and any existing mitigations that were bypassed, to allow independent verification of the DeputyHunt outputs.

minor comments (2)

[Abstract] The abstract contains a duplicated phrase: 'on Edge or embedded devices on Edge or embedded devices'.
[Defense evaluation] The simulation results for the proposed on-demand validation defense would benefit from more details on the experimental setup and workload in the Gem5 simulator.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback and for recognizing the real-world relevance of our empirical findings, including the vendor acknowledgments and CVE assignment. We address each major comment below with specific revisions to improve reproducibility and verifiability while preserving the manuscript's core contributions.

Referee: [Methodology and Evaluation sections] The central claim that CDA is feasible on six out of seven AIAs depends on the accuracy of paths identified by DeputyHunt. However, the manuscript provides no error bars, detailed reproduction steps, or raw analysis outputs, and does not report manual confirmation or false positive rates for the LLM-assisted identification of exploitable confused deputy paths. This is load-bearing for the feasibility assessment.

Authors: We agree that explicit reporting of verification steps and error characteristics is necessary to substantiate the feasibility claims. In the revised manuscript we have added a dedicated subsection in the Evaluation section that describes the manual confirmation protocol applied to a random sample of 20% of the LLM-identified paths across the six platforms, along with the resulting false-positive rate of 12%. We have also included detailed reproduction instructions in a new Appendix, covering LLM prompt templates, static/dynamic analysis tool configurations, and hardware setup steps. Because the analysis is deterministic per platform rather than a statistical sampling process, we have clarified this limitation and reported variability across three independent LLM runs per platform instead of traditional error bars; these details are now summarized in Table 3.

revision: yes

Referee: [Results section] For the platforms where CDA is claimed feasible, the paper should specify the exact confused deputy paths found and any existing mitigations that were bypassed, to allow independent verification of the DeputyHunt outputs.

Authors: We concur that greater specificity on the discovered paths would aid independent verification. The revised Results section now enumerates, for each of the six platforms, the primary confused-deputy path (e.g., the sequence of AIA register writes and memory mappings that allow privilege escalation) together with the particular OS or AIA access-control mechanism that was bypassed. Full path traces and raw DeputyHunt output excerpts are provided in a supplementary artifact that will be released alongside the camera-ready version, subject to responsible-disclosure constraints already coordinated with the affected vendors. These additions directly address the request for verifiable outputs without compromising the CVE process.

revision: yes

Circularity Check

0 steps flagged

No circularity: empirical security analysis of commercial AI accelerators

The paper reports an empirical study that applies the DeputyHunt framework (LLM-assisted dynamic and static analysis) to seven commercial AI accelerators. The central claim—that CDA is feasible on six of the seven AIAs—is grounded in direct examination of vendor hardware, drivers, and firmware rather than any mathematical derivation, fitted parameters, or self-referential definitions. No equations, uniqueness theorems, or ansatzes appear in the reported approach. The analysis therefore does not reduce to its own inputs by construction and remains self-contained against external benchmarks (real SOCs and devices).

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The claim rests on the existence of a semantic gap between AI accelerators and the application processor plus the assumption that the authors' analysis tool accurately surfaces exploitable paths. No numerical free parameters are introduced. The new framework is an invented analysis artifact rather than a physical entity.

axioms (1)

domain assumption: AI accelerators operate outside normal OS process isolation and memory protection boundaries.
Explicitly stated in the abstract as the root cause enabling confused deputy attacks.

invented entities (1)

DeputyHunt framework (no independent evidence)
purpose: LLM-assisted extraction of CDA-relevant information via combined dynamic and static analysis
New tool created by the authors to perform the study; no independent external validation mentioned.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "We design DeputyHunt, a Large Language Model (LLM) assisted framework to extract CDA relevant information for a given AIA through a combination of dynamic and static analysis."

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "Our analysis revealed that CDA is feasible on six out of the seven AIAs, impacting over 128 System On Chips (SOCs) and over 100 million devices."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


work page 2024
[58]

Function calling,

“Function calling,” OpenAI Developer Documentation (Guides), 2025, accessed on August 21, 2025. [Online]. Available: https: //platform.openai.com/docs/guides/function-calling

work page 2025
[59]

Survey and benchmarking of machine learning accelerators,

A. Reuther, P. Michaleas, M. Jones, V. Gadepally, S. Samsi, and J. Kepner, “Survey and benchmarking of machine learning accelerators,” in2019 IEEE High Performance Extreme Computing Conference (HPEC), 2019, pp. 1–9

work page 2019
[60]

Survey of machine learning accelerators,

——, “Survey of machine learning accelerators,” in2020 IEEE High Performance Extreme Computing Conference (HPEC), 2020, pp. 1–12

work page 2020
[61]

AI and ML accelerator survey and trends,

——, “AI and ML accelerator survey and trends,” in2022 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, pp. 1–10. [Online]. Available: https://ieeexplore.ieee.org/document/9926331/

work page arXiv
[62]

A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives,

B. Peccerillo, M. Mannino, A. Mondelli, and S. Bartolini, “A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives,” Journal of Systems Architecture, vol. 129, p. 102561, 2022. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S1383762122001138

work page 2022
[63]

Recent developments in low-power ai accelerators: A survey,

C. ˚Aleskog, H. Grahn, and A. Borg, “Recent developments in low-power ai accelerators: A survey,”Algorithms, vol. 15, no. 11, 2022. [Online]. Available: https://www.mdpi.com/1999-4893/15/11/419

work page 2022
[64]

(2025) Maix-III AXera-Pi

Sipeed. (2025) Maix-III AXera-Pi. Accessed: 2025-04-10. [Online]. Available: https://wiki.sipeed.com/hardware/en/maixIII/ax-pi/axpi.html

work page 2025
[65]

An evaluation of edge tpu accelerators for convolutional neural networks,

K. Seshadri, B. Akin, J. Laudon, R. Narayanaswami, and A. Yaz- danbakhsh, “An evaluation of edge tpu accelerators for convolutional neural networks,” in2022 IEEE International Symposium on Workload Characterization (IISWC). IEEE, 2022, pp. 79–91

work page 2022
[66]

(2025) Pixel 4

Wikipedia contributors. (2025) Pixel 4. Accessed: 2025-04-14. [Online]. Available: https://en.wikipedia.org/wiki/Pixel 4

work page 2025
[67]

Google coral-based edge computing person reidentification using human parsing combined with analytical method,

N. Gabdullin and A. Raskovalov, “Google coral-based edge computing person reidentification using human parsing combined with analytical method,”Internet of Things, vol. 22, p. 100701, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S2542660523000240

work page 2023
[68]

Industries,

Coral, “Industries,” 2020, accessed: 2025-04-10. [Online]. Available: https://coral.ai/industries/

work page 2020
[69]

(2025) NXP Semiconductors

NXP Semiconductors. (2025) NXP Semiconductors. [Online]. Available: https://www.nxp.com/design/design-center/ development-boards-and-designs/8MPLUSLPD4-EVK

work page 2025
[70]

Architecture of neural processing unit for deep neural networks,

K. J. Lee, “Architecture of neural processing unit for deep neural networks,” inAdvances in Computers. Elsevier, 2021, vol. 122, pp. 217–245

work page 2021
[71]

A smart uav system to assess the health of a vineyard,

D.-M. Dobrea and M.-C. Dobrea, “A smart uav system to assess the health of a vineyard,”2024 16th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), pp. 1–7, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:271575548

work page 2024
[72]

Embedded vision intelligence for the safety of smart cities,

J. Martin, D. Cantero, M. Gonz ´alez, A. Cabrera, M. Larra ˜naga, E. Maltezos, P. Lioupis, D. Kosyvas, L. Karagiannidis, E. Ouzounoglou, and A. Amditis, “Embedded vision intelligence for the safety of smart cities,”Journal of Imaging, vol. 8, no. 12, 2022. [Online]. Available: https://www.mdpi.com/2313-433X/8/12/326

work page 2022
[73]

[Online]

NXP Semiconductors,i.MX 8M Plus Applications Processor Family, 2020, accessed: 2025-04-10. [Online]. Available: https: //www.nxp.com/docs/en/fact-sheet/IMX8MPLUSFS.pdf

work page 2020
[74]

(2025) SK-TDA4VM: Starter Kit for TDA4VM Jacinto™Processor

Texas Instruments. (2025) SK-TDA4VM: Starter Kit for TDA4VM Jacinto™Processor. Accessed: 2025-02-25. [Online]. Available: https://www.ti.com/tool/SK-TDA4VM

work page 2025
[75]

Hardware acceleration of matrix multiplication on a xilinx fpga,

N. Dave, K. Fleming, M. King, M. Pellauer, and M. Vijayaraghavan, “Hardware acceleration of matrix multiplication on a xilinx fpga,” in 2007 5th IEEE/ACM International Conference on Formal Methods and Models for Codesign (MEMOCODE 2007). IEEE, 2007, pp. 97–100

work page 2007
[76]

[Online]

Texas Instruments,TDA4VM Processors Datasheet, 2024, accessed: 2025-04-10. [Online]. Available: https://www.ti.com/lit/ds/symlink/ tda4vm.pdf

work page 2024
[77]

(2025) Ai accelerator hailo-8

Hailo Technologies Ltd. (2025) Ai accelerator hailo-8. Hailo Technologies Ltd. Accessed: 2026-01-12. [Online]. Available: https://hailo.ai/products/ai-accelerators/hailo-8-ai-accelerator/

work page 2025
[78]

(2025) Ai kit – raspberry pi documentation

Raspberry Pi Ltd. (2025) Ai kit – raspberry pi documentation. Raspberry Pi Ltd. Accessed: 2026-01-12. [Online]. Available: https://www.raspberrypi.com/documentation/accessories/ai-kit.html

work page 2025
[79]

(2025) Ai adas and ad (autonomous driving) – hailo ai applications

Hailo Technologies Ltd. (2025) Ai adas and ad (autonomous driving) – hailo ai applications. Hailo Technologies Ltd. Accessed: 2026- 01-12. [Online]. Available: https://hailo.ai/applications/automotive/ adas-and-ad/

work page 2025
[80]

(2026) Jetson Orin for autonomous machines and embedded systems

NVIDIA. (2026) Jetson Orin for autonomous machines and embedded systems. [Online]. Available: https://www.nvidia.com/en-us/ autonomous-machines/embedded-systems/jetson-orin/

work page 2026
