Accelerating Targeted Hard-Label Adversarial Attacks in Low-Query Black-Box Settings
Pith reviewed 2026-05-22 14:20 UTC · model grok-4.3
The pith
Using edge information from the target image allows targeted adversarial attacks to succeed with far fewer queries in black-box settings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that perturbing the target image using its own edge information produces an adversarial example that lies closer to the source image and still reaches the target class, all while requiring substantially fewer queries than existing methods that focus only on geometric properties of the decision boundary.
What carries the argument
Targeted Edge-informed Attack (TEA), which extracts edge information from the target image to carefully perturb it toward the source image.
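One way to picture the edge-informed perturbation is the minimal sketch below. It is an illustration under stated assumptions, not the paper's algorithm: the Sobel operator, the per-pixel weight `1 - edge`, the fixed `step`, and the names `sobel_magnitude`, `edge_weights`, `blend_toward_source`, and `edge_informed_search` are all hypothetical choices made here. The idea shown is to pull the target image toward the source mostly in flat regions, leave the strongest target edges intact, and spend one hard-label query per candidate.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # grayscale image with values in [0, 1]


def sobel_magnitude(img: Image) -> Image:
    """Sobel gradient magnitude with replicated borders, scaled to [0, 1]."""
    h, w = len(img), len(img[0])

    def px(r: int, c: int) -> float:
        # Clamp indices so border pixels reuse their nearest neighbour.
        return img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    mag = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            gx = (px(r - 1, c + 1) + 2 * px(r, c + 1) + px(r + 1, c + 1)
                  - px(r - 1, c - 1) - 2 * px(r, c - 1) - px(r + 1, c - 1))
            gy = (px(r + 1, c - 1) + 2 * px(r + 1, c) + px(r + 1, c + 1)
                  - px(r - 1, c - 1) - 2 * px(r - 1, c) - px(r - 1, c + 1))
            mag[r][c] = (gx * gx + gy * gy) ** 0.5
    peak = max(max(row) for row in mag)
    if peak == 0:
        return mag  # flat image: no edges anywhere
    return [[v / peak for v in row] for row in mag]


def edge_weights(target: Image) -> Image:
    """Assumed weighting: 1 in flat regions, 0 on the strongest edge."""
    return [[1.0 - e for e in row] for row in sobel_magnitude(target)]


def blend_toward_source(source: Image, current: Image,
                        weights: Image, step: float) -> Image:
    """Move each pixel toward the source, scaled by its edge weight."""
    return [
        [c + step * w * (s - c) for s, c, w in zip(srow, crow, wrow)]
        for srow, crow, wrow in zip(source, current, weights)
    ]


def edge_informed_search(source: Image, target: Image,
                         is_target_class: Callable[[Image], bool],
                         step: float = 0.25,
                         max_queries: int = 20) -> Tuple[Image, int]:
    """Blend toward the source while the hard-label oracle still says 'target'."""
    weights = edge_weights(target)  # computed locally, costs no queries
    current, queries = target, 0
    while queries < max_queries:
        candidate = blend_toward_source(source, current, weights, step)
        queries += 1  # one hard-label query per candidate
        if not is_target_class(candidate):
            break  # candidate left the target region; keep the last accepted image
        current = candidate
    return current, queries
```

In this sketch the edge map is computed once from the target image on the attacker's side, so only the accept/reject checks consume queries. Whether the actual method weights pixels this way is not stated in the text above.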
If this is right
- TEA consistently outperforms state-of-the-art methods with nearly 70% fewer queries across different models in low-query black-box settings.
- The generated adversarial examples remain closer to the source image.
- TEA provides an improved target initialization for established geometry-based attacks.
Where Pith is reading between the lines
- Edge-based guidance could be combined with other image features like color gradients to further reduce queries.
- Defenses against adversarial attacks might need to incorporate edge-preserving mechanisms to counter this approach.
- Testing TEA on video or 3D data could reveal whether the edge principle extends beyond static images.
Load-bearing premise
That incorporating edge information extracted from the target image will systematically produce an adversarial example that remains closer to the source image while still crossing into the desired target decision region, without introducing new failure modes or query overhead in the edge extraction step itself.
What would settle it
Experiments on standard image datasets in which TEA, compared against the same baselines, either needs more queries than current methods or fails to reach the target classification in a majority of cases.
Original abstract
Deep neural networks for image classification remain vulnerable to adversarial examples -- small, imperceptible perturbations that induce misclassifications. In black-box settings, where only the final prediction is accessible, crafting targeted attacks that aim to misclassify into a specific target class is particularly challenging due to narrow decision regions. Current state-of-the-art methods often exploit the geometric properties of the decision boundary separating a source image and a target image rather than incorporating information from the images themselves. In contrast, we propose Targeted Edge-informed Attack (TEA), a novel attack that utilizes edge information from the target image to carefully perturb it, thereby producing an adversarial image that is closer to the source image while still achieving the desired target classification. Our approach consistently outperforms current state-of-the-art methods across different models in low query settings (nearly 70% fewer queries are used), a scenario especially relevant in real-world applications with limited queries and black-box access. Furthermore, by efficiently generating a suitable adversarial example, TEA provides an improved target initialization for established geometry-based attacks.
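The hard-label, limited-query setting described in the abstract can be made concrete with a small wrapper. The sketch below is illustrative only: `HardLabelOracle`, `QueryBudgetExceeded`, and `toy_model` are hypothetical names, and the toy classifier stands in for a real network. It shows the two constraints an attack of this kind works under: only the predicted label is returned, and every call counts against a fixed budget.

```python
from typing import Callable, Sequence


class QueryBudgetExceeded(RuntimeError):
    """Raised when an attack asks for more labels than the budget allows."""


class HardLabelOracle:
    """Exposes only the top-1 label of a classifier and counts every query."""

    def __init__(self, predict_label: Callable[[Sequence[float]], int],
                 budget: int) -> None:
        self._predict_label = predict_label
        self.budget = budget
        self.used = 0

    def query(self, x: Sequence[float]) -> int:
        if self.used >= self.budget:
            raise QueryBudgetExceeded(f"budget of {self.budget} queries exhausted")
        self.used += 1
        return self._predict_label(x)

    def is_target(self, x: Sequence[float], target_label: int) -> bool:
        # A targeted attack succeeds only if the label equals the chosen class.
        return self.query(x) == target_label


def toy_model(x: Sequence[float]) -> int:
    """Toy stand-in for a real model: class 1 if the mean intensity exceeds 0.5."""
    return int(sum(x) / len(x) > 0.5)
```

Routing every model call through one counter like this makes query totals comparable across attacks, since no method can obtain labels without being charged for them.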
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Targeted Edge-informed Attack (TEA), a method for targeted hard-label black-box adversarial attacks that extracts edge information from a target-class image to generate an initial perturbation closer to the source image while preserving the target classification. The approach is claimed to reduce query counts by nearly 70% relative to prior state-of-the-art methods in low-query regimes and to supply an improved starting point for subsequent geometry-based attacks.
Significance. If the reported query reductions and reliability of the edge-based initializer are substantiated, the work would provide a practical advance in query-efficient targeted attacks, a setting directly relevant to real-world black-box access with strict query budgets. The shift from purely geometric boundary exploitation to incorporation of image-derived structural cues (edges) represents a distinct design choice that could influence follow-on research.
major comments (2)
- [§3] TEA construction: The central claim that edge information extracted from the target image produces a perturbation that remains inside the target decision region while moving closer to the source, without incurring extra queries or new failure modes, is load-bearing for the 70% query-reduction result. The manuscript provides no formal argument or ablation demonstrating that the edge-guided direction aligns with the unknown decision-boundary geometry sufficiently often to avoid additional boundary searches or rejections in the hard-label setting.
- [Experimental evaluation] Results section and tables: The abstract and introduction state a 'nearly 70% fewer queries' improvement, yet the reported numbers lack accompanying standard deviations, number of independent trials, statistical tests against baselines, and a per-phase query breakdown (edge extraction versus subsequent optimization). Without these, it is impossible to confirm that the savings are net and reproducible rather than an artifact of initialization variance.
minor comments (2)
- [Preliminaries] The notation used for the edge-extraction operator and the precise definition of the perturbation step should be introduced with a short equation or pseudocode block in the preliminaries to improve readability.
- [Figures] Figure captions should explicitly state the query budget and model/dataset combination shown, rather than leaving these details to the main text.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of rigor and reproducibility that we will address in the revision. Below we respond point-by-point to the major comments.
- Referee: [§3] TEA construction: The central claim that edge information extracted from the target image produces a perturbation that remains inside the target decision region while moving closer to the source, without incurring extra queries or new failure modes, is load-bearing for the 70% query-reduction result. The manuscript provides no formal argument or ablation demonstrating that the edge-guided direction aligns with the unknown decision-boundary geometry sufficiently often to avoid additional boundary searches or rejections in the hard-label setting.
  Authors: We agree that a formal proof is difficult in the hard-label black-box setting because the decision boundary is unknown. Our design choice is motivated by the observation that edges capture structural information shared between the target-class image and the source image, providing an initialization that is more likely to lie inside the target region. While the current manuscript relies on extensive empirical validation across models and datasets, we will add an ablation study that isolates the contribution of the edge initializer (comparing it to random perturbations and purely geometric initializations) and include a discussion of observed failure modes and their frequency.
  Revision: yes
- Referee: [Experimental evaluation] Results section and tables: The abstract and introduction state a 'nearly 70% fewer queries' improvement, yet the reported numbers lack accompanying standard deviations, number of independent trials, statistical tests against baselines, and a per-phase query breakdown (edge extraction versus subsequent optimization). Without these, it is impossible to confirm that the savings are net and reproducible rather than an artifact of initialization variance.
  Authors: We acknowledge the need for statistical reporting. In the revised version we will (i) report mean query counts with standard deviations over at least five independent trials per setting, (ii) state the exact number of trials, (iii) include paired statistical significance tests against the strongest baselines, and (iv) provide an explicit per-phase query breakdown showing that edge extraction adds only a negligible constant number of queries while the subsequent optimization phase accounts for the reported savings.
  Revision: yes
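The reporting promised in this response can be sketched with the Python standard library alone. Everything specific below is an assumption made for illustration: the per-trial query counts are invented, not taken from the paper, and the exact sign-flip permutation test is one possible choice of paired test, since the rebuttal does not name one. The names `relative_reduction` and `paired_sign_flip_pvalue` are hypothetical.

```python
from itertools import product
from statistics import mean, stdev
from typing import Sequence


def relative_reduction(baseline: Sequence[float], method: Sequence[float]) -> float:
    """Fraction of queries saved, computed from the per-trial means."""
    return 1.0 - mean(method) / mean(baseline)


def paired_sign_flip_pvalue(baseline: Sequence[float],
                            method: Sequence[float]) -> float:
    """Exact two-sided paired permutation test on per-trial differences."""
    diffs = [b - m for b, m in zip(baseline, method)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Under the null hypothesis each difference is equally likely to flip sign,
    # so enumerate all 2**n sign patterns (feasible for a handful of trials).
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical per-trial query counts, invented for illustration only.
baseline_queries = [1000, 1100, 900, 1050, 950]
tea_edge_phase = [20, 20, 20, 20, 20]        # assumed cost of the edge-guided phase
tea_optim_phase = [280, 310, 260, 290, 270]  # assumed cost of later optimization
tea_total = [e + o for e, o in zip(tea_edge_phase, tea_optim_phase)]

print(f"baseline: {mean(baseline_queries):.1f} +/- {stdev(baseline_queries):.1f}")
print(f"TEA total: {mean(tea_total):.1f} +/- {stdev(tea_total):.1f}")
print(f"edge phase share: {mean(tea_edge_phase) / mean(tea_total):.1%}")
print(f"reduction: {relative_reduction(baseline_queries, tea_total):.1%}")
print(f"paired p-value: {paired_sign_flip_pvalue(baseline_queries, tea_total):.4f}")
```

With only five paired trials, the smallest two-sided p-value this exact test can return is 2/32 = 0.0625, so a design aiming at the usual 0.05 threshold would need more trials or a different test.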
Circularity Check
No circularity: empirical heuristic without derivation chain
The paper introduces TEA as an algorithmic construction that extracts edges from a target-class image and uses them to initialize perturbations toward a source image while seeking the target label. No equations, closed-form derivations, or parameter-fitting steps are described that would reduce the reported query savings or success rate to a self-referential definition or to a fitted quantity renamed as a prediction. Performance claims rest on empirical comparisons against prior geometry-based attacks rather than on any mathematical identity or self-citation that is load-bearing for the central result. The method is therefore self-contained as an engineering proposal whose validity is external to its own formulation.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: utilizes edge information from the target image to carefully perturb it... global edge-informed search... patch-based edge-informed search
- IndisputableMonolith/Foundation/AlexanderDuality.lean, theorem alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: TEA provides an improved target initialization for established geometry-based attacks
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Empirical Evidence for Simply Connected Decision Regions in Image Classifiers: Empirical tests with quad-mesh filling indicate that decision regions in modern image classifiers are simply connected.