Causal Evaluation of Membership Inference Attacks

Aurélien Bellet; Clément Berenfeld; Julie Josse; Linus Bleistein; Mathieu Even; Tudor Cebere

arxiv: 2602.02819 · v4 · pith:WVDSM3O6new · submitted 2026-02-02 · 💻 cs.LG · stat.ML

keywords evaluation · causal · data · inference · one-run · training · zero-run · attacks

Membership Inference Attacks (MIAs) aim to distinguish training points (members) from unseen data (non-members), and are widely used to quantify memorization and assess privacy risks. Standard MIA evaluation requires repeated retraining, which is computationally costly for large models. One-run (single training with randomized data inclusion) and zero-run (post hoc evaluation) methods are often used instead, but their statistical validity remains unclear. We address this gap by framing MIA evaluation as a causal inference problem, defining memorization as the causal effect of including a data point in the training set. This novel formulation reveals and formalizes key sources of bias in existing protocols: one-run methods suffer from interference between jointly included points, while zero-run evaluations are additionally confounded by distribution shift between member and non-member evaluation data. We derive causal analogues of standard MIA metrics and propose practical estimators for multi-run, one-run, and zero-run regimes with non-asymptotic consistency guarantees. We validate our approach in several settings, including pretrained and fine-tuned LLMs, showing that it enables reliable measurement of MIA performance without retraining and under distribution shift. Overall, our framework provides a principled foundation for privacy evaluation in modern AI systems.
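To make the distribution-shift confounding mentioned in the abstract concrete, here is a minimal toy simulation. It is not the paper's estimator. It assumes a hypothetical additive score model in which an attack score equals a covariate `x`, plus a true inclusion effect `TAU` for members, plus noise. It also ignores interference between jointly included points, which the abstract identifies as a separate source of bias in one-run methods. All names (`simulate_scores`, `difference_in_means`, `TAU`, `SHIFT`) are illustrative assumptions.

```python
import random
import statistics


def simulate_scores(n, tau, shift, randomized, seed=0):
    """Draw toy attack scores for members and non-members.

    Hypothetical model (an assumption, not from the paper):
        score = x + tau * is_member + small noise
    When `randomized` is True, x has the same distribution in both groups,
    mimicking randomized inclusion. When False, members are drawn from a
    distribution whose mean is moved by `shift`, mimicking a post hoc
    comparison between differently distributed member/non-member data.
    """
    rng = random.Random(seed)
    member_scores, non_member_scores = [], []
    for _ in range(n):
        is_member = rng.random() < 0.5
        mean_x = shift if (is_member and not randomized) else 0.0
        x = rng.gauss(mean_x, 1.0)
        score = x + tau * is_member + rng.gauss(0.0, 0.1)
        (member_scores if is_member else non_member_scores).append(score)
    return member_scores, non_member_scores


def difference_in_means(member_scores, non_member_scores):
    """Naive estimate of the inclusion effect on the attack score."""
    return statistics.fmean(member_scores) - statistics.fmean(non_member_scores)


TAU = 0.5    # true causal effect of inclusion in this toy model
SHIFT = 1.0  # covariate shift between members and non-members

randomized_estimate = difference_in_means(
    *simulate_scores(20000, TAU, SHIFT, randomized=True)
)
shifted_estimate = difference_in_means(
    *simulate_scores(20000, TAU, SHIFT, randomized=False)
)

print(f"randomized inclusion: {randomized_estimate:.3f}")
print(f"shifted populations:  {shifted_estimate:.3f}")
```

In this toy setting the randomized comparison should land close to `TAU`, while the shifted comparison should land close to `TAU + SHIFT`: the naive difference mixes the effect of inclusion with the pre-existing difference between the two populations.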

Forward citations

Cited by 1 Pith paper

Privacy Auditing with Zero (0) Training Run
cs.CR 2026-05 unverdicted novelty 8.0

Zero-Run auditing supplies valid lower bounds on differential privacy parameters from fixed member and non-member datasets by modeling and correcting distribution-shift confounding via causal-inference techniques.