Black-box Adversarial Attacks with Limited Queries and Information

Ilyas A · 2018 · cs.CV · arXiv 1804.08598

4 Pith papers cite this work. Polarity classification is still indexing.

abstract

Current neural network-based classifiers are susceptible to adversarial examples even in the black-box setting, where the attacker only has query access to the model. In practice, the threat model for real-world systems is often more restrictive than the typical black-box model where the adversary can observe the full output of the network on arbitrarily many chosen inputs. We define three realistic threat models that more accurately characterize many real-world classifiers: the query-limited setting, the partial-information setting, and the label-only setting. We develop new attacks that fool classifiers under these more restrictive threat models, where previous methods would be impractical or ineffective. We demonstrate that our methods are effective against an ImageNet classifier under our proposed threat models. We also demonstrate a targeted black-box attack against a commercial classifier, overcoming the challenges of limited query access, partial information, and other practical issues to break the Google Cloud Vision API.
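The three threat models named in the abstract can be pictured as restrictions on what a query to the classifier returns. The sketch below is only an illustration under stated assumptions, not the paper's code: `RestrictedOracle`, `QueryBudgetExceeded`, and `toy_model` are hypothetical names, and the reading of "partial information" as top-k scores and "label-only" as an ordered top-k label list is one plausible interpretation of the abstract.

```python
import numpy as np


class QueryBudgetExceeded(RuntimeError):
    """Raised when the attacker has used up its allowed queries."""


class RestrictedOracle:
    """Hypothetical wrapper that limits what an attacker sees from a model.

    predict_proba: callable mapping an input array to class probabilities.
    max_queries:   query-limited setting (None means unlimited).
    top_k:         partial-information setting (None means all classes).
    label_only:    label-only setting (hide the scores, keep the ranking).
    """

    def __init__(self, predict_proba, max_queries=None, top_k=None,
                 label_only=False):
        self._predict_proba = predict_proba
        self.max_queries = max_queries
        self.top_k = top_k
        self.label_only = label_only
        self.queries_used = 0

    def query(self, x):
        # Enforce the query budget before touching the model.
        if (self.max_queries is not None
                and self.queries_used >= self.max_queries):
            raise QueryBudgetExceeded("query budget exhausted")
        self.queries_used += 1

        probs = np.asarray(self._predict_proba(x), dtype=float)
        k = probs.size if self.top_k is None else self.top_k
        # Class indices sorted by descending probability, truncated to k.
        top = np.argsort(-probs, kind="stable")[:k]

        if self.label_only:
            return [int(i) for i in top]
        return [(int(i), float(probs[i])) for i in top]


def toy_model(x):
    """Stand-in three-class softmax classifier (an assumption for the demo)."""
    logits = np.array([x.sum(), 1.0, -x.sum()])
    e = np.exp(logits - logits.max())
    return e / e.sum()


if __name__ == "__main__":
    oracle = RestrictedOracle(toy_model, max_queries=2, top_k=1,
                              label_only=True)
    print(oracle.query(np.zeros(4)))
    print(oracle.queries_used)
```

An attack written against this interface has to work from whatever `query` returns, which is the practical difference between these settings and the usual black-box model with full output and unlimited queries.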

citation-role summary

background 1

citation-polarity summary

unclear 1

representative citing papers

Amnesia: A Stealthy Replay Attack on Continual Learning Dreams

cs.CR · 2026-06-10 · unverdicted · novelty 6.0

Amnesia is a replay composition attack on continual learning that tilts class distributions under visibility (delta) and mass (f) budgets to reduce accuracy while evading audits.

Fast Adversarial Attacks with Gradient Prediction

cs.LG · 2026-05-14 · unverdicted · novelty 6.0

Gradient prediction via linear regression on hidden states recovers most FGSM attack strength at 532% higher throughput by avoiding backward passes.

Hiding Faces in Plain Sight: Disrupting AI Face Synthesis with Adversarial Perturbations

cs.CV · 2019-06-21 · unverdicted · novelty 6.0

Adversarial perturbations disrupt DNN-based face detectors under white-box, gray-box, and black-box settings to sabotage training data for AI face synthesis.

When AI reviews science: Can we trust the referee?

cs.AI · 2026-04-26 · unverdicted · novelty 6.0

AI peer review systems are vulnerable to prompt injections, prestige biases, assertion strength effects, and contextual poisoning, as demonstrated by a new attack taxonomy and causal experiments on real conference submissions.

citing papers explorer

Showing 1 of 1 citing paper after filters.

When AI reviews science: Can we trust the referee?

cs.AI · 2026-04-26 · unverdicted · none · ref 67

AI peer review systems are vulnerable to prompt injections, prestige biases, assertion strength effects, and contextual poisoning, as demonstrated by a new attack taxonomy and causal experiments on real conference submissions.
