DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations

Amin Ghiasi; Arjun Gupta; Eitan Borgnia; Furong Huang; Jonas Geiping; Liam Fowl; Micah Goldblum; Tom Goldstein; Valeriia Cherepanova

arxiv: 2103.02079 · v2 · pith:UM3GFLDZnew · submitted 2021-03-02 · 💻 cs.LG · cs.CR · cs.CV


Eitan Borgnia, Jonas Geiping, Valeriia Cherepanova, Liam Fowl, Arjun Gupta, Amin Ghiasi, Furong Huang, Micah Goldblum, Tom Goldstein


classification: cs.LG, cs.CR, cs.CV

keywords: mixup, attacks, training, data, dp-instahide, model, noise, performance


Data poisoning and backdoor attacks manipulate training data to induce security breaches in a victim model. These attacks can be provably deflected using differentially private (DP) training methods, although this comes with a sharp decrease in model performance. The InstaHide method has recently been proposed as an alternative to DP training that leverages supposed privacy properties of the mixup augmentation, although without rigorous guarantees. In this work, we show that strong data augmentations, such as mixup and random additive noise, nullify poison attacks while enduring only a small accuracy trade-off. To explain these findings, we propose a training method, DP-InstaHide, which combines the mixup regularizer with additive noise. A rigorous analysis of DP-InstaHide shows that mixup does indeed have privacy advantages, and that training with k-way mixup provably yields at least k times stronger DP guarantees than a naive DP mechanism. Because mixup (as opposed to noise) is beneficial to model performance, DP-InstaHide provides a mechanism for achieving stronger empirical performance against poisoning attacks than other known DP methods.
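As a rough illustration of the augmentation the abstract describes, the sketch below averages k training samples (k-way mixup) and then adds random noise to the mixed input. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `kway_mixup_with_noise`, the uniform 1/k mixing weights, the Laplace noise distribution, and the `noise_scale` value are all hypothetical choices that the abstract does not specify.

```python
import numpy as np

def kway_mixup_with_noise(x, y, k=4, noise_scale=0.05, rng=None):
    """Mix k samples per output and add noise (illustrative sketch only).

    x: float array of shape (n, ...), e.g. a batch of images.
    y: float array of shape (n, c), one-hot labels.
    Assumptions (not taken from the abstract): uniform 1/k weights,
    Laplace-distributed additive noise with scale `noise_scale`.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]

    # Column j is a random permutation of the batch, so row i lists the
    # k source samples for output i. A row may repeat an index by chance.
    idx = np.stack([rng.permutation(n) for _ in range(k)], axis=1)  # (n, k)

    # Uniform convex weights; each row sums to 1.
    w = np.full((n, k), 1.0 / k)

    # Broadcast the weights over the trailing data dimensions of x.
    w_x = w.reshape((n, k) + (1,) * (x.ndim - 1))
    mixed_x = (w_x * x[idx]).sum(axis=1)          # (n, ...)
    mixed_y = (w[:, :, None] * y[idx]).sum(axis=1)  # (n, c)

    # Additive noise on the mixed inputs only; labels stay noise-free here.
    noisy_x = mixed_x + rng.laplace(0.0, noise_scale, size=mixed_x.shape)
    return noisy_x, mixed_y
```

A call such as `kway_mixup_with_noise(images, one_hot_labels, k=4)` would be applied to each batch before the forward pass, with the soft labels used in a cross-entropy loss. The privacy and robustness guarantees discussed in the abstract depend on the paper's specific noise calibration, which this sketch does not reproduce.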


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

FML-bench: A Controlled Study of AI Research Agent Strategies from the Perspective of Search Dynamics
cs.LG 2026-05 unverdicted novelty 7.0

FML-Bench shows a simple greedy hill-climber nearly matches tree search on dense-opportunity tasks while an adaptive agent that broadens search on stagnation outperforms six baselines across 18 tasks.
FML-bench: A Controlled Study of AI Research Agent Strategies from the Perspective of Search Dynamics
cs.LG 2026-05 accept novelty 6.0

FML-Bench shows that a simple greedy hill-climber performs nearly as well as complex tree-search agents on ML research tasks, with an adaptive strategy that switches exploration modes outperforming all tested agents.