
arXiv: 2605.10987 · v1 · submitted 2026-05-09 · 💻 cs.LG · cs.AI · cs.CR

AESOP: Adversarial Execution-path Selection to Overload Deep Learning Pipelines

Pith reviewed 2026-05-13 06:47 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CR
keywords adversarial attacks · inference pipelines · execution path selection · model overload · deep learning efficiency · real-time systems

The pith

Path-aware attacks on ML pipelines inflate FLOPs by up to 2,407 times by targeting the most vulnerable execution paths.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that dynamic machine learning pipelines, where upstream model outputs route work to downstream components, create a new attack surface based on execution path selection. Existing single-model attacks cannot exploit this surface because they ignore how input-dependent routing multiplies both workload volume and per-component cost. AESOP formalizes the adversarial path-selection problem and solves it with vulnerability-guided path ranking plus adaptive loss weighting, showing that path-directed attacks produce roughly 20 times more FLOP inflation than the strongest single-model baseline on identical inputs and budgets. The work evaluates the method on five pipelines plus a production-realistic variant with batching, bounded buffering, and confidence-threshold defenses, and reports up to 2,407 times FLOP inflation in the white-box setting and 58 times in the gray-box setting. If correct, this means pipeline operators must defend against path choice, not only isolated model vulnerabilities, to preserve real-time availability.

Core claim

AESOP shows that formalizing the adversarial path-selection problem and solving it via vulnerability-guided path ranking with adaptive loss weighting allows an attacker to direct computation toward high-cost execution paths. The result is up to 2,407 times FLOP inflation and 419 times latency inflation in white-box settings, and 58 times FLOP and 17 times latency inflation in gray-box settings. On the same inputs and budgets, the strongest single-model baseline reaches only 117 times FLOP inflation.
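The ratios behind this claim can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative and not taken from the paper.

```python
# Inflation factors quoted in the core claim (multiples of the benign cost).
WHITE_BOX_FLOPS = 2407.0     # AESOP, white-box
BASELINE_FLOPS = 117.0       # strongest single-model baseline, same inputs and budget
GRAY_BOX_FLOPS = 58.0        # AESOP, gray-box
WHITE_BOX_LATENCY = 419.0    # AESOP, white-box
GRAY_BOX_LATENCY = 17.0      # AESOP, gray-box

# Gap between path-aware targeting and single-model targeting.
path_gap = WHITE_BOX_FLOPS / BASELINE_FLOPS

# Gap between white-box and gray-box knowledge for the same method.
knowledge_gap_flops = WHITE_BOX_FLOPS / GRAY_BOX_FLOPS
knowledge_gap_latency = WHITE_BOX_LATENCY / GRAY_BOX_LATENCY

print(f"path-aware vs single-model (FLOPs): {path_gap:.1f}x")
print(f"white-box vs gray-box (FLOPs):      {knowledge_gap_flops:.1f}x")
print(f"white-box vs gray-box (latency):    {knowledge_gap_latency:.1f}x")
```

The first ratio is the "20 times" gap in the headline; the other two show how much of the reported inflation depends on attacker knowledge.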

What carries the argument

vulnerability-guided path ranking combined with adaptive loss weighting

If this is right

  • Sustained path-targeted attacks can collapse real-time pipeline throughput from 0.578 to 0.006 inputs per second.
  • System-level defenses redirect the attack instead of neutralizing it: operators must accept either that throughput collapse or 96.7% data loss to sustain throughput.
  • Gray-box attacks still achieve 58 times FLOP inflation, showing partial pipeline knowledge suffices for substantial overload.
  • Batching and confidence-threshold defenses in production variants do not eliminate the path-selection advantage.
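To make the throughput figures above concrete, the following sketch converts them into a slowdown factor and per-input processing time. It uses only the numbers quoted in this summary; the variable names are illustrative.

```python
BASELINE_THROUGHPUT = 0.578   # inputs per second before the attack
ATTACKED_THROUGHPUT = 0.006   # inputs per second under attack
DATA_LOSS = 0.967             # fraction of data dropped to sustain throughput

# How many times slower the pipeline becomes if it keeps every input.
slowdown = BASELINE_THROUGHPUT / ATTACKED_THROUGHPUT

# Average wall-clock time spent per input, before and under attack.
seconds_per_input_before = 1.0 / BASELINE_THROUGHPUT
seconds_per_input_after = 1.0 / ATTACKED_THROUGHPUT

# Fraction of data that survives if the pipeline sheds load instead.
retained_fraction = 1.0 - DATA_LOSS

print(f"slowdown: {slowdown:.1f}x")
print(f"time per input: {seconds_per_input_before:.2f}s -> {seconds_per_input_after:.1f}s")
print(f"data retained under load shedding: {retained_fraction:.1%}")
```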

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same path-selection principle could apply to any composed system whose routing depends on intermediate outputs, such as microservice graphs.
  • Defenses that randomize or hide path costs might reduce the attack surface without full pipeline redesign.
  • Operators could test pipelines by simulating path-aware attacks during development to identify high-cost routes before deployment.
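The last suggestion can be sketched as a pre-deployment audit that enumerates execution paths and ranks them by worst-case cost. This is a minimal illustration, not AESOP's vulnerability-guided ranking: the pipeline, component names, costs, and fan-out limits are all hypothetical, and the ranking considers cost only, not how easily an input can be pushed onto each path.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical pipeline: name -> (per-inference cost, max outputs per input, downstream names).
Graph = Dict[str, Tuple[float, int, List[str]]]

PIPELINE: Graph = {
    "detector": (10.0, 50, ["classifier", "plate_reader"]),
    "classifier": (2.0, 1, []),
    "plate_reader": (5.0, 4, ["ocr"]),
    "ocr": (8.0, 1, []),
}


def enumerate_paths(graph: Graph, node: str, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield every root-to-leaf execution path starting at `node`."""
    path = prefix + (node,)
    children = graph[node][2]
    if not children:
        yield path
        return
    for child in children:
        yield from enumerate_paths(graph, child, path)


def worst_case_cost(graph: Graph, path: Tuple[str, ...]) -> float:
    """Cost of one input if every component on the path emits its maximum fan-out."""
    total, invocations = 0.0, 1
    for node in path:
        cost, fan_out, _ = graph[node]
        total += invocations * cost   # each pending item triggers one inference
        invocations *= fan_out        # outputs multiply the downstream workload
    return total


def rank_paths(graph: Graph, root: str) -> List[Tuple[float, Tuple[str, ...]]]:
    """Return (worst-case cost, path) pairs, most expensive first."""
    scored = [(worst_case_cost(graph, p), p) for p in enumerate_paths(graph, root)]
    return sorted(scored, reverse=True)


ranked = rank_paths(PIPELINE, "detector")
for cost, path in ranked:
    print(f"{cost:8.1f}  {' -> '.join(path)}")
```

In this toy graph the detector-to-OCR route dominates because fan-out compounds along the path, which is the kind of route an operator would want to cap or monitor first.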

Load-bearing premise

An attacker can obtain enough knowledge of the pipeline structure and per-path vulnerabilities to perform guided ranking and adaptive weighting.

What would settle it

Measure whether the 20 times gap in FLOP inflation over the strongest single-model baseline disappears when the attacker has only black-box access, with no information about pipeline structure, and must attack without path ranking.

Figures

Figures reproduced from arXiv: 2605.10987 by Mingfang Ji, Ravishka Shemal Rathnasuriya, Simin Chen, Tingxi Li, Wei Yang, Yitao Hu.

Figure 1: Execution paths in a dynamic deep learning pipeline.

Figure 2: Approach overview. Adjacent paper text: "… neural component and edge $e \in E$ carries data between components through an inter-process queue. Each component $v$ exhibits three input-dependent behaviors: a per-inference cost $c_v$, an output cardinality $o_v$ that determines downstream workload, and a gating function $g_v$ that may forward, drop, or route inputs based on predicted labels, confidence thresholds, or shape constraints. The to…"

Figure 3: Traffic-monitoring pipeline in two configurations.

Figure 4: Pipeline applications used in evaluation.
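The three per-component behaviors named in the Figure 2 text (per-inference cost, output cardinality, and gating) can be illustrated with a small cost model. This is a sketch under stated assumptions, not the paper's formalization: the `Item` and `Component` types, the two-stage toy pipeline, and all numeric values are hypothetical, and the gate is applied to each output for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Item:
    label: str
    confidence: float
    n_objects: int = 0   # hypothetical: how many detections this input triggers


@dataclass
class Component:
    cost: Callable[[Item], float]           # per-inference cost
    outputs: Callable[[Item], List[Item]]   # output cardinality is len(outputs(item))
    gate: Callable[[Item], List[str]]       # downstream component names; [] means drop


def total_cost(pipeline: Dict[str, Component], entry: str, item: Item) -> float:
    """Recursively sum the cost of every inference triggered by one input."""
    comp = pipeline[entry]
    cost = comp.cost(item)
    for out in comp.outputs(item):
        for nxt in comp.gate(out):
            cost += total_cost(pipeline, nxt, out)
    return cost


# Hypothetical two-stage pipeline: a detector feeding a reader behind a confidence gate.
TOY = {
    "detector": Component(
        cost=lambda item: 10.0,
        outputs=lambda item: [Item("car", item.confidence) for _ in range(item.n_objects)],
        gate=lambda out: ["reader"] if out.confidence >= 0.5 else [],
    ),
    "reader": Component(
        cost=lambda item: 5.0,
        outputs=lambda item: [],
        gate=lambda out: [],
    ),
}

benign = total_cost(TOY, "detector", Item("frame", 0.9, n_objects=2))
overloaded = total_cost(TOY, "detector", Item("frame", 0.9, n_objects=40))
gated_out = total_cost(TOY, "detector", Item("frame", 0.3, n_objects=40))
print(benign, overloaded, gated_out)
```

The same detector cost yields very different totals depending on how many outputs pass the gate, which is why the cost of an input cannot be read off any single model.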
Original abstract

Modern machine learning deployments increasingly compose specialized models into dynamic inference pipelines, where upstream components produce intermediate predictions that determine the workload and inputs of downstream components. The cost of processing an input is therefore not determined by any single model, but by two coupled factors: the per-inference cost of each invoked component and its workload volume. Because these pipelines run under hard real-time constraints, efficiency is a fundamental requirement for system availability. We show that this structure creates an efficiency-attack surface that existing methods targeting single models cannot exploit: on identical inputs and budgets, path-aware targeting inflates FLOPs by $2,407\times$ while the strongest single-model baseline achieves $117\times$ -- a $20\times$ gap attributable entirely to where the attack is directed. We formalize this as the adversarial path-selection problem and present AESOP, a framework combining vulnerability-guided path ranking with adaptive loss weighting. We evaluate AESOP on five pipelines plus a production-realistic deployment variant with batching, bounded buffering, and confidence-threshold defenses. AESOP achieves up to $2,407\times$ FLOPs and $419\times$ latency inflation in white-box setting and 58$\times$ FLOPs / 17$\times$ latency in gray-box settings. Under system-level defenses, the attack is not neutralized but redirected: pipelines are forced to choose between throughput collapse ($0.578 \to 0.006$ input/s) and $96.7\%$ data loss to sustain throughput.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper claims that dynamic ML inference pipelines, where upstream predictions determine downstream workloads, create an attack surface exploitable via path selection. AESOP uses vulnerability-guided path ranking and adaptive loss weighting to direct attacks, achieving 2407× FLOPs and 419× latency inflation in white-box settings (vs. 117× for the strongest single-model baseline) on identical inputs/budgets, with gray-box results at 58× FLOPs/17× latency. Evaluations on five pipelines plus a production variant with batching and defenses show the attack forces throughput collapse (0.578→0.006 input/s) or 96.7% data loss.

Significance. If the results hold under realistic conditions, the work identifies a new efficiency attack surface in composed ML systems that single-model attacks cannot reach, with direct implications for real-time pipeline availability and defense design. The white-box/gray-box contrast usefully bounds attack potency, and the system-level defense evaluation (throughput vs. data loss tradeoff) strengthens the practical relevance.

major comments (3)
  1. [Abstract] Abstract: The headline claim of a 20× gap 'attributable entirely to where the attack is directed' is contradicted by the white-box (2407× FLOPs) vs. gray-box (58× FLOPs) numbers; the gap is largely knowledge-dependent rather than purely directional, and the threat model must explicitly justify why an attacker would possess the pipeline topology and per-path vulnerability information required for ranking and weighting.
  2. [Evaluation] Evaluation section: No details are provided on run count, variance, data exclusion criteria, or statistical tests supporting the concrete multipliers (2407×, 419×, 58×); without these, the internal validity of the central empirical claims cannot be assessed and the 20× gap cannot be treated as robust.
  3. [Method] § on adaptive loss weighting: The method is described at a high level but lacks the precise formulation, pseudocode, or hyperparameter sensitivity analysis needed to reproduce the reported overload factors or to verify that the weighting is not simply amplifying the path-ranking effect by construction.
minor comments (1)
  1. [Evaluation] The production-realistic variant is mentioned but its exact batch size, buffer bounds, and confidence-threshold values are not tabulated, making it difficult to map the 0.578→0.006 input/s result to concrete system parameters.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and have made revisions to improve clarity, reproducibility, and rigor where appropriate.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The headline claim of a 20× gap 'attributable entirely to where the attack is directed' is contradicted by the white-box (2407× FLOPs) vs. gray-box (58× FLOPs) numbers; the gap is largely knowledge-dependent rather than purely directional, and the threat model must explicitly justify why an attacker would possess the pipeline topology and per-path vulnerability information required for ranking and weighting.

    Authors: We thank the referee for this observation. The 20× gap specifically compares AESOP (path-aware) against the strongest single-model baseline, both under the white-box setting with identical inputs and budgets; the gray-box results (58×) are reported separately to bound attack potency under reduced knowledge. We will revise the abstract to explicitly qualify the 20× comparison as white-box only and to distinguish the two threat models. We will also expand the threat-model section to justify that pipeline topology is often obtainable via documentation, reverse engineering, or probing in deployed systems, while per-path vulnerabilities can be estimated from limited queries or public model information, making the attack realistic for adversaries with partial system access. revision: partial

  2. Referee: [Evaluation] Evaluation section: No details are provided on run count, variance, data exclusion criteria, or statistical tests supporting the concrete multipliers (2407×, 419×, 58×); without these, the internal validity of the central empirical claims cannot be assessed and the 20× gap cannot be treated as robust.

    Authors: We agree that these details are essential. In the revised manuscript we will add: all experiments were repeated for 10 independent runs using different random seeds; results report mean values accompanied by standard deviations; no data points were excluded; and paired t-tests confirm statistical significance of the reported gaps (p < 0.01). These additions will appear in the Evaluation section together with a supplementary table summarizing the statistics. revision: yes
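The reporting protocol promised in this response (mean and standard deviation over repeated runs, plus a paired t-test) could be sketched as follows. The helper names are illustrative and the sample values are placeholders chosen for easy checking, not results from the paper. The sketch stops at the t statistic and degrees of freedom; a p-value needs the t distribution, for which a library routine such as `scipy.stats.ttest_rel` is one option.

```python
import math
import statistics
from typing import Sequence, Tuple


def summarize(runs: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) over independent runs."""
    return statistics.mean(runs), statistics.stdev(runs)


def paired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Paired t statistic and degrees of freedom for per-seed measurements a[i], b[i]."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 runs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    n = len(diffs)
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Placeholder values only: one entry per random seed for each method.
method_a = [5.0, 7.0, 9.0, 11.0]
method_b = [1.0, 2.0, 3.0, 4.0]

mean_a, std_a = summarize(method_a)
t_stat, dof = paired_t(method_a, method_b)
print(f"method A: {mean_a:.2f} +/- {std_a:.2f}")
print(f"paired t = {t_stat:.3f} with {dof} degrees of freedom")
```

Pairing by seed matters here because both methods are run on identical inputs and budgets, so the per-seed difference is the quantity of interest.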

  3. Referee: [Method] § on adaptive loss weighting: The method is described at a high level but lacks the precise formulation, pseudocode, or hyperparameter sensitivity analysis needed to reproduce the reported overload factors or to verify that the weighting is not simply amplifying the path-ranking effect by construction.

    Authors: We acknowledge the need for greater precision. The revised manuscript will include the exact mathematical formulation of the adaptive loss (with the dynamic weighting rule based on per-path vulnerability scores), pseudocode in the appendix, and a new sensitivity analysis subsection that varies the weighting hyperparameters and shows their effect on overload factors. This analysis will demonstrate that the weighting provides complementary gains beyond path ranking alone. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical attack framework with no self-referential derivations or fitted predictions

Full rationale

The paper presents AESOP as an empirical framework for adversarial path selection in ML pipelines, evaluated on five pipelines plus a production variant. No equations, derivations, or first-principles results are claimed that reduce the reported multipliers (2407× FLOPs, 419× latency) to definitions of the attack itself. The central results are framed as measured outcomes under white-box and gray-box settings rather than predictions derived from fitted parameters or self-citations. The path-ranking and adaptive weighting components are described as algorithmic choices, not tautological redefinitions of the overload metric. No self-citation load-bearing steps, uniqueness theorems, or ansatz smuggling appear in the provided text. The 20× gap is presented as an empirical observation attributable to attack direction, with explicit contrast to baselines and gray-box degradation, keeping the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the framework is described as a combination of path ranking and loss weighting without further decomposition.

pith-pipeline@v0.9.0 · 5590 in / 1100 out tokens · 78428 ms · 2026-05-13T06:47:45.546713+00:00 · methodology



Appendix A: Pipeline applications

As shown in Figure 4, we developed five distinct pipeline applications....