Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality

arxiv: 1801.02613 · v3 · pith:VXB4MLFD · submitted 2018-01-08 · 💻 cs.LG · cs.CR · cs.CV

Xingjun Ma, Bo Li, Yisen Wang, Sarah M. Erfani, Sudanthi Wijewickrema, Grant Schoenebeck, Dawn Song, Michael E. Houle, James Bailey

classification 💻 cs.LG cs.CR cs.CV

keywords adversarial, examples, regions, attacks, dnns, better, characteristic, characterizing

Deep Neural Networks (DNNs) have recently been shown to be vulnerable to adversarial examples, which are carefully crafted instances that can mislead DNNs into making errors during prediction. To better understand such attacks, a characterization is needed of the properties of regions (the so-called 'adversarial subspaces') in which adversarial examples lie. We tackle this challenge by characterizing the dimensional properties of adversarial regions, via the use of Local Intrinsic Dimensionality (LID). LID assesses the space-filling capability of the region surrounding a reference example, based on the distance distribution of the example to its neighbors. We first provide explanations about how adversarial perturbation can affect the LID characteristic of adversarial regions, and then show empirically that LID characteristics can facilitate the distinction of adversarial examples generated using state-of-the-art attacks. As a proof-of-concept, we show that a potential application of LID is to distinguish adversarial examples, and the preliminary results show that it can outperform several state-of-the-art detection measures by large margins for five attack strategies considered in this paper across three benchmark datasets. Our analysis of the LID characteristic for adversarial regions not only motivates new directions of effective adversarial defense, but also opens up more challenges for developing new attacks to better understand the vulnerabilities of DNNs.
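
To make the phrase "based on the distance distribution of the example to its neighbors" concrete, here is a minimal sketch of one common way to estimate LID from neighbor distances. The abstract does not state which estimator is used, so treat the choice below (a maximum-likelihood estimate over the k nearest neighbors) as an assumption; the function name `lid_mle`, the parameter `k`, and the use of Euclidean distance are hypothetical choices made for illustration only.

```python
import numpy as np


def lid_mle(reference, batch, k=20):
    """Estimate LID at `reference` from its k nearest neighbors in `batch`.

    Assumed estimator (not taken from the abstract):
        LID ~= -1 / mean_i( log(r_i / r_k) ),  i = 1..k,
    where r_1 <= ... <= r_k are the distances to the k nearest neighbors.
    """
    reference = np.asarray(reference, dtype=float)
    batch = np.asarray(batch, dtype=float)

    # Euclidean distance from the reference to every candidate neighbor.
    dists = np.linalg.norm(batch - reference, axis=1)

    # Drop zero distances (copies of the reference) and keep the k smallest.
    dists = np.sort(dists[dists > 0])[:k]
    if dists.size < 2:
        raise ValueError("need at least two neighbors at nonzero distance")

    # Log of each distance relative to the largest of the k distances.
    mean_log_ratio = np.log(dists / dists[-1]).mean()
    if mean_log_ratio == 0.0:
        raise ValueError("all neighbor distances are equal; estimate undefined")

    return -1.0 / mean_log_ratio


if __name__ == "__main__":
    # Toy check: points on a 2-D plane embedded in a 10-D space.
    rng = np.random.default_rng(0)
    plane = np.zeros((5000, 10))
    plane[:, :2] = rng.uniform(-1.0, 1.0, size=(5000, 2))
    print(lid_mle(np.zeros(10), plane, k=100))
```

On the toy data, the estimate should land near 2, the dimension of the plane the points actually occupy, rather than 10, the dimension of the ambient space. The abstract's claim is that examples in adversarial regions show LID characteristics that differ from those of normal examples, which is what makes a statistic like this usable as a detection feature.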

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

INTARG: Informed Real-Time Adversarial Attack Generation for Time-Series Regression
cs.LG 2026-04 unverdicted novelty 7.0

INTARG generates effective real-time adversarial attacks on time-series regression models by selectively targeting high-confidence high-error steps in a bounded-buffer online setting, increasing prediction error up to...

Intermediate Representations are Strong AI-Generated Image Detectors
cs.CV 2026-05 unverdicted novelty 6.0

Intermediate layer embedding sensitivity to perturbations distinguishes AI-generated images from real ones, yielding higher AUROC on GenImage and Forensics Small benchmarks than prior methods.

Margin-Adaptive Confidence Ranking for Reliable LLM Judgement
cs.LG 2026-05 unverdicted novelty 5.0

Introduces a margin-adaptive confidence ranking method that learns an estimator from simulated diversity and derives margin-dependent generalization bounds for use in fixed-sequence testing of LLM-human agreement.

Insider Attacks in Multi-Agent LLM Consensus Systems
cs.MA 2026-05 unverdicted novelty 5.0

A malicious agent in multi-agent LLM consensus systems can be trained via a surrogate world model and RL to reduce consensus rates and prolong disagreement more effectively than direct prompt attacks.

NeuroTrace: Inference Provenance-Based Detection of Adversarial Examples
cs.CR 2026-04 unverdicted novelty 5.0

NeuroTrace framework builds heterogeneous graphs of inference provenance to detect adversarial examples in DNNs, showing strong transferable performance across attack families in vision and malware domains.

Geometric Analysis of Neural Regression Collapse via Intrinsic Dimension
cs.LG 2025-10 unverdicted novelty 5.0

Neural regression collapse occurs when last-layer feature intrinsic dimension falls below target intrinsic dimension, creating over-compressed and under-compressed regimes that govern generalization based on data quan...