Variational Autoencoder-Based Black-Box Adversarial Attack on Collaborative DNN Inference
Pith reviewed 2026-05-19 00:55 UTC · model grok-4.3
The pith
AdVAR-DNN launches black-box adversarial attacks on collaborative DNN inference by detecting partitioning exchanges and using a VAE to generate misclassifying samples.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AdVAR-DNN is a black-box adversarial attack that uses classifiers to detect model partitioning information from network exchanges and a variational autoencoder to generate untraceable manipulated samples. The paper reports that it compromises collaborative DNN inference with a high attack success rate and little to no probability of detection on popular object classification DNNs evaluated on the CIFAR-100 dataset.
What carries the argument
AdVAR-DNN, which combines partitioning-information classifiers with a variational autoencoder to produce adversarial samples that exploit unsecured exchanges in collaborative inference.
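This page does not reproduce the paper's VAE architecture or loss. As a minimal sketch of two generic ingredients that VAEs commonly use, assuming a diagonal-Gaussian encoder and a standard-normal prior (both assumptions, not details taken from the paper), the reparameterized sampling step and the closed-form KL term could look like this. The function names and the example encoder outputs are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) in closed form."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


# Hypothetical encoder outputs for a 3-dimensional latent space.
mu = [0.5, -1.0, 0.0]
log_var = [0.0, -2.0, 0.1]

z = reparameterize(mu, log_var, random.Random(0))
kl = kl_to_standard_normal(mu, log_var)
```

A full training loss would add a reconstruction term to the KL term. How the attack steers decoded samples toward misclassification is not described on this page, so the sketch stops short of that step.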
If this is right
- The attack succeeds without prior knowledge of the DNN model or its partitioning.
- It produces high misclassification rates on popular object classification DNNs.
- Detection probability stays low or zero during evaluation on CIFAR-100.
- The approach applies to dynamic partitioning scenarios common in IoT collaborative inference.
Where Pith is reading between the lines
- Protecting the channel that carries partitioning data would directly block the first stage of the attack.
- The same detection-plus-VAE pattern could be tested against other distributed inference or edge AI setups.
- Measuring attack performance under realistic network latency and packet loss would show how well the method survives real IoT conditions.
Load-bearing premise
Partitioning information can be reliably detected from unsecured network exchanges and a VAE trained on that information can generate effective adversarial samples without any prior knowledge of the DNN model.
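The page does not list the traffic features the paper's classifiers use. As a purely illustrative sketch, assume an eavesdropper observes only the byte size of the intermediate payload and that each candidate cut layer produces a characteristic payload size. A nearest-centroid rule over that single hypothetical feature could then look like this; the layer names and sizes are made up for the example.

```python
from statistics import mean


def fit_centroids(samples):
    """Average the observed payload size for each labeled cut layer.

    samples: iterable of (payload_bytes, cut_layer) pairs.
    """
    by_layer = {}
    for size, layer in samples:
        by_layer.setdefault(layer, []).append(size)
    return {layer: mean(sizes) for layer, sizes in by_layer.items()}


def predict_cut_layer(centroids, payload_bytes):
    """Return the cut layer whose mean payload size is closest."""
    return min(centroids,
               key=lambda layer: abs(centroids[layer] - payload_bytes))


# Made-up observations: (payload size in bytes, cut layer label).
observed = [
    (65_536, "conv2"), (65_600, "conv2"),
    (16_384, "conv5"), (16_500, "conv5"),
    (4_096, "fc1"), (4_200, "fc1"),
]
centroids = fit_centroids(observed)
guess = predict_cut_layer(centroids, 16_450)
```

The sketch needs labeled observations to fit the centroids. Whether such labels can be obtained with no prior knowledge of the model is part of what the premise leaves open.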
What would settle it
Running AdVAR-DNN against an actual collaborative inference system on CIFAR-100 and measuring both the misclassification rate and any detection rate by the participants.
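One way to make that measurement concrete is sketched below. It assumes attack success is counted only over inputs the model classified correctly before the attack, and that detection is a per-sample binary flag. Both definitions are assumptions, since the page does not give the paper's exact ones, and the helper names and toy data are hypothetical.

```python
from statistics import mean, stdev


def attack_success_rate(true_labels, clean_preds, adv_preds):
    """Fraction of originally correct predictions that the attack flips."""
    flipped = [adv != y
               for y, clean, adv in zip(true_labels, clean_preds, adv_preds)
               if clean == y]
    return sum(flipped) / len(flipped) if flipped else 0.0


def detection_rate(flags):
    """Fraction of adversarial samples flagged by a detector."""
    return sum(flags) / len(flags) if flags else 0.0


def summarize(values):
    """Mean and sample standard deviation across repeated trials."""
    return mean(values), (stdev(values) if len(values) > 1 else 0.0)


# Toy data for one trial (labels are arbitrary class indices).
y_true = [3, 7, 7, 1, 4]
y_clean = [3, 7, 2, 1, 4]   # the third sample was already wrong
y_adv = [5, 7, 2, 9, 0]

asr = attack_success_rate(y_true, y_clean, y_adv)
det = detection_rate([False, False, True, False])
asr_mean, asr_std = summarize([0.75, 0.80, 0.70])
```

Reporting the mean together with the spread over repeated trials is what would let a reader judge whether "high" and "little to no" are stable outcomes or single-run results.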
Original abstract
In recent years, Deep Neural Networks (DNNs) have become increasingly integral to IoT-based environments, enabling realtime visual computing. However, the limited computational capacity of these devices has motivated the adoption of collaborative DNN inference, where the IoT device offloads part of the inference-related computation to a remote server. Such offloading often requires dynamic DNN partitioning information to be exchanged among the participants over an unsecured network or via relays/hops, leading to novel privacy vulnerabilities. In this paper, we propose AdVAR-DNN, an adversarial variational autoencoder (VAE)-based misclassification attack, leveraging classifiers to detect model information and a VAE to generate untraceable manipulated samples, specifically designed to compromise the collaborative inference process. AdVAR-DNN attack uses the sensitive information exchange vulnerability of collaborative DNN inference and is black-box in nature in terms of having no prior knowledge about the DNN model and how it is partitioned. Our evaluation using the most popular object classification DNNs on the CIFAR-100 dataset demonstrates the effectiveness of AdVAR-DNN in terms of high attack success rate with little to no probability of detection.
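To make the setting concrete, here is a toy sketch of partitioned inference. A model is treated as an ordered list of layer functions, the device runs the layers before a cut index, and the server runs the rest on the transmitted intermediate features. The layers, the cut index, and the tampering function are all hypothetical stand-ins, not the paper's models or its attack.

```python
def relu(xs):
    return [max(0.0, x) for x in xs]


def scale(factor):
    return lambda xs: [factor * x for x in xs]


def shift(offset):
    return lambda xs: [x + offset for x in xs]


def run_layers(layers, xs):
    """Apply each layer in order."""
    for layer in layers:
        xs = layer(xs)
    return xs


def collaborative_inference(layers, cut, xs, in_transit=None):
    """Device runs layers[:cut]; server runs layers[cut:].

    in_transit optionally models a party on the network path that
    rewrites the intermediate features before the server sees them.
    """
    features = run_layers(layers[:cut], xs)      # computed on the device
    if in_transit is not None:
        features = in_transit(features)          # unsecured exchange
    return run_layers(layers[cut:], features)    # computed on the server


model = [scale(2.0), shift(-1.0), relu, scale(3.0)]
x = [0.2, 1.0, -0.5]

full = run_layers(model, x)
split = collaborative_inference(model, cut=2, xs=x)
tampered = collaborative_inference(model, cut=2, xs=x,
                                   in_transit=lambda f: [-v for v in f])
```

The split and unsplit runs agree, while rewriting the features in transit changes which output scores highest. That is the kind of integrity exposure the abstract attributes to unsecured exchanges.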
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes AdVAR-DNN, a variational autoencoder (VAE)-based black-box adversarial attack targeting collaborative DNN inference in IoT settings. It exploits the exchange of dynamic DNN partitioning information over unsecured networks by employing classifiers to extract model details and a VAE to generate manipulated input samples that induce misclassification while remaining difficult to detect. The attack is presented as requiring no prior knowledge of the target DNN architecture or partitioning strategy. Evaluation is claimed to demonstrate high attack success rates with little to no detection probability when tested on popular object-classification DNNs using the CIFAR-100 dataset.
Significance. If the empirical results hold under rigorous scrutiny, the work would identify a concrete privacy and integrity vulnerability in collaborative inference pipelines that rely on dynamic partitioning over open networks. The combination of traffic-based model inference with VAE-generated adversarial examples could inform defenses for edge-cloud DNN deployments. The black-box framing, if substantiated with reproducible attack pipelines and cross-partition robustness tests, would strengthen the contribution beyond white-box assumptions common in adversarial ML literature.
major comments (2)
- [Abstract] Abstract: The central claim of 'high attack success rate with little to no probability of detection' is asserted without any quantitative metrics, baseline comparisons, error bars, or description of how success rate and detection probability were measured. This absence prevents assessment of whether the reported effectiveness supports the black-box attack narrative.
- [Abstract] Abstract: The black-box claim rests on the assumption that partitioning information can be reliably extracted from unsecured network exchanges without prior model knowledge. No details are supplied on classifier input features, training regime for the VAE, or robustness under realistic conditions (packet loss, timing jitter, concurrent inferences), which are load-bearing for both the attack success and low-detection results.
minor comments (1)
- Clarify the exact network traffic features used by the partitioning classifier and provide pseudocode or a diagram of the end-to-end attack pipeline.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive comments on our manuscript. We address each major comment in detail below, providing clarifications from the full paper and indicating where revisions will be made to strengthen the presentation.
Point-by-point responses
- Referee: [Abstract] Abstract: The central claim of 'high attack success rate with little to no probability of detection' is asserted without any quantitative metrics, baseline comparisons, error bars, or description of how success rate and detection probability were measured. This absence prevents assessment of whether the reported effectiveness supports the black-box attack narrative.
  Authors: We agree that the abstract, being a concise overview, does not embed the specific numerical results. The full manuscript presents these details in the Evaluation section, including attack success rates, detection probabilities measured via standard classifiers, comparisons against adapted baseline attacks, and error bars derived from repeated trials. To address the concern directly, we will revise the abstract to incorporate key quantitative indicators and a brief statement on the measurement approach.
  Revision: yes
- Referee: [Abstract] Abstract: The black-box claim rests on the assumption that partitioning information can be reliably extracted from unsecured network exchanges without prior model knowledge. No details are supplied on classifier input features, training regime for the VAE, or robustness under realistic conditions (packet loss, timing jitter, concurrent inferences), which are load-bearing for both the attack success and low-detection results.
  Authors: The Methodology section describes the classifier's input features drawn from observable network traffic metadata and the VAE training procedure using standard variational objectives on the CIFAR-100 dataset. The black-box property holds because the attack infers partitioning solely from traffic without accessing model weights or architecture. We acknowledge that the evaluation focused on controlled network conditions and did not extensively test robustness to packet loss, jitter, or concurrent flows; we will add an explicit discussion of these assumptions and their implications as a limitation in the revised manuscript.
  Revision: partial
Circularity Check
Empirical attack proposal and evaluation contain no circular derivation
The paper introduces AdVAR-DNN as a VAE-based black-box attack that detects partitioning via classifiers and generates adversarial samples, then reports empirical success rates on CIFAR-100 with standard DNNs. No derivation chain, equations, or first-principles claims are present that reduce by construction to fitted parameters, self-definitions, or self-citations; the central results are direct experimental outcomes rather than predictions forced by the method's own inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Collaborative DNN inference requires exchanging dynamic partitioning information over unsecured networks or relays, creating detectable vulnerabilities.
invented entities (1)
- AdVAR-DNN: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "AdVAR-DNN attack uses the sensitive information exchange vulnerability of collaborative DNN inference and is black-box in nature in terms of having no prior knowledge about the DNN model and how it is partitioned."
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean: J_uniquely_calibrated_via_higher_derivative (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "we propose AdVAR-DNN, an adversarial variational autoencoder (VAE)-based misclassification attack"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.