Adversarial examples in the physical world
Most existing machine learning classifiers are highly vulnerable to adversarial examples. An adversarial example is a sample of input data which has been modified very slightly in a way that is intended to cause a machine learning classifier to misclassify it. In many cases, these modifications can be so subtle that a human observer does not notice them at all, yet the classifier still makes a mistake. Adversarial examples pose security concerns because they could be used to attack machine learning systems, even if the adversary has no access to the underlying model. Up to now, all previous work has assumed a threat model in which the adversary can feed data directly into the machine learning classifier. This is not always the case for systems operating in the physical world, for example those that use signals from cameras and other sensors as input. This paper shows that even in such physical-world scenarios, machine learning systems are vulnerable to adversarial examples. We demonstrate this by feeding adversarial images obtained from a cell-phone camera to an ImageNet Inception classifier and measuring the classification accuracy of the system. We find that a large fraction of adversarial examples are classified incorrectly even when perceived through the camera.
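To make the idea concrete, here is a minimal sketch of how a small, bounded perturbation can flip a classifier's decision, and how one might check whether the flip survives a lossy capture step. The abstract does not say how the perturbations are computed, so the sketch assumes a one-step fast gradient sign attack on a toy linear classifier. The weights, the input, the `eps` budget, and the `simulate_camera` function (plain intensity quantization standing in for a real print-and-photograph pipeline) are all hypothetical choices for illustration, not the paper's setup.

```python
import math

def score(w, b, x):
    """Linear decision score; the toy classifier predicts +1 when it is positive."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    return 1 if score(w, b, x) > 0 else -1

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One fast-gradient-sign step against logistic loss, true label y in {+1, -1}."""
    # Gradient of log(1 + exp(-y * score)) w.r.t. x is -y * sigmoid(-y * score) * w.
    margin = y * score(w, b, x)
    coeff = -y / (1.0 + math.exp(margin))
    grad = [coeff * wi for wi in w]
    # Move each feature by eps in the loss-increasing direction, then clip to [0, 1].
    return [min(1.0, max(0.0, xi + eps * sign(g))) for xi, g in zip(x, grad)]

def simulate_camera(x, levels=16):
    """Hypothetical stand-in for a camera: coarse intensity quantization only."""
    return [round(xi * (levels - 1)) / (levels - 1) for xi in x]

# Toy model and input (assumed values, chosen so the clean input is classified +1).
w, b = [0.5, -0.3, 0.8, -0.6], -0.1
x, y = [0.6, 0.4, 0.55, 0.45], 1

x_adv = fgsm(w, b, x, y, eps=0.15)   # each feature changes by at most 0.15
x_seen = simulate_camera(x_adv)      # what the classifier sees after "capture"

print(predict(w, b, x), predict(w, b, x_adv), predict(w, b, x_seen))
# prints: 1 -1 -1
```

In this toy case the misclassification persists after quantization. A real camera also introduces blur, noise, and changes in lighting and viewing angle, so a check like this is only a rough proxy for the physical-world experiment the abstract describes.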
Forward citations
Cited by 21 Pith papers
-
Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning
Injecting around 50 poisoned samples with a stealthy trigger creates backdoors in deep learning models achieving over 90% attack success under a weak threat model with no model or data knowledge required.
-
Attention Hijacking: Response Manipulation Across Queries in Vision-Language Models
Attention Hijacking is a new attack that improves cross-query transferability in VLMs by explicitly steering internal attention to a persistent image-dominant pattern.
-
Beyond Defenses: Manifold-Aligned Regularization for Intrinsic 3D Point Cloud Robustness
MAPR improves adversarial robustness in 3D point cloud networks by aligning latent predictions with intrinsic manifold geometry via curvature/diffusion features and a consistency loss.
-
Beyond Defenses: Manifold-Aligned Regularization for Intrinsic 3D Point Cloud Robustness
MAPR aligns latent and intrinsic geometries in 3D point cloud models via regularization on curvature and diffusion features plus consistency loss, yielding +20% average robustness gains on ModelNet40 without adversari...
-
SyncBreaker: Stage-Aware Multimodal Adversarial Attacks on Audio-Driven Talking Head Generation
SyncBreaker jointly attacks image and audio streams with Multi-Interval Sampling and Cross-Attention Fooling to degrade speech-driven talking head generation more than single-modality baselines.
-
GCP: Guarded Collaborative Perception with Spatial-Temporal Aware Malicious Agent Detection
GCP detects malicious agents in collaborative perception using spatial-temporal aware methods with a confidence-scaled loss and historical BEV flow reconstruction, achieving up to 34.69% AP@0.5 gains under a new BAC attack.
-
Interpretability Beyond Classification Output: Semantic Bottleneck Networks
Semantic Bottleneck Networks add interpretable semantic concept layers to deep networks, recovering SOTA segmentation performance with drastic channel reduction and enabling failure interpretation at over 99% accuracy...
-
Open DNN Box by Power Side-Channel Attack
Power side-channel analysis recovers DNN architecture and parameters at 96.5% average accuracy on real embedded devices.
-
Adversarial Objects Against LiDAR-Based Autonomous Driving Systems
LiDAR-Adv generates adversarial objects to fool LiDAR-based autonomous driving detection systems, tested on Baidu Apollo and with physical 3D prints.
-
Robust Synthesis of Adversarial Visual Examples Using a Deep Image Prior
A DIP-based optimization produces adversarial perturbations and patches that are more robust to affine transformations than standard high-frequency noise while staying imperceptible.
-
Explaining Deep Learning Models with Constrained Adversarial Examples
Introduces CADEX to generate domain-constrained counterfactual explanations for ML models using adversarial perturbations.
-
Towards Universal Physical Adversarial Attacks via a Joint Multi-Objective and Multi-Model Optimization Framework
JMOF is a new optimization framework for physical adversarial attacks that improves cross-model transferability and enables simultaneous attacks on multiple vision tasks such as object detection and semantic segmentation.
-
AGC: Adaptive Geodesic Correction for Adversarial Robustness on Vision-Language Models
AGC is a training-free inference-time defense for CLIP that adaptively corrects features along geodesics to robust augmentations, claiming 44.4% higher average robust accuracy and 10x lower latency than prior baseline...
-
Why Blocking Targeted Adversarial Perturbations Impairs the Ability to Learn
Defensive distillation blocks non-targeted adversarial attacks but cannot block targeted ones without preventing the network from learning via its input gradient.
-
Affine Disentangled GAN for Interpretable and Robust AV Perception
ADIS-GAN disentangles affine transformations in a GAN to achieve over 98% classification accuracy on MNIST within 30 degrees rotation and over 90% under FGSM and PGD attacks while generating rotation and scaling factors.
-
Learning to Cope with Adversarial Attacks
MLAH agent in deep RL demonstrates hierarchical coping mechanisms and improved reward maintenance under spaced adversarial attacks, at the expense of stability.
-
Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness
Invariance-inducing regularization using worst-case transformations reduces relative error by 20% on CIFAR10 transformed examples, improves standard accuracy on SVHN, outperforms equivariant networks, and proves no ac...
-
Using Intuition from Empirical Properties to Simplify Adversarial Training Defense
Modifications to single-step adversarial training based on empirical properties of iterative methods improve accuracy by up to 16.93% against iterative attacks while reducing training cost by 28.75%.
-
Measuring the Transferability of Adversarial Examples
Empirical measurement of adversarial example transferability between VGG and Inception model classes with methodological refinements to attack strength selection, perturbation clipping, and evaluation via SSIM.
-
Machine learning and behavioral economics for personalized choice architecture
Machine learning can support personalized choice architecture in behavioral economics by using individual trait and psychological data to design targeted interventions.
-
Short-term Electric Load Forecasting Using TensorFlow and Deep Auto-Encoders
A TensorFlow-based deep auto-encoder model is proposed for short-term electric load forecasting and claimed to outperform traditional neural networks in accuracy and stability.