ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models

Ahmed Salem, Yang Zhang, Mathias Humbert, Pascal Berrang, Mario Fritz, Michael Backes

classification cs.CR cs.AI cs.LG

keywords attacks model data inference learning machine membership training

Machine learning (ML) has become a core component of many real-world applications and training data is a key factor that drives current progress. This huge success has led Internet companies to deploy machine learning as a service (MLaaS). Recently, the first membership inference attack has shown that extraction of information on the training set is possible in such MLaaS settings, which has severe security and privacy implications. However, the early demonstrations of the feasibility of such attacks make many assumptions about the adversary, such as using multiple so-called shadow models, knowledge of the target model structure, and having a dataset from the same distribution as the target model's training data. We relax all these key assumptions, thereby showing that such attacks are very broadly applicable at low cost and pose a more severe risk than previously thought. We present the most comprehensive study so far on this emerging and developing threat using eight diverse datasets which show the viability of the proposed attacks across domains. In addition, we propose the first effective defense mechanisms against such a broader class of membership inference attacks that maintain a high level of utility of the ML model.

Forward citations

Cited by 7 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

DistractMIA: Black-Box Membership Inference on Vision-Language Models via Semantic Distraction
cs.CV 2026-05 unverdicted novelty 7.0

DistractMIA performs output-only black-box membership inference on vision-language models by inserting semantic distractors and measuring shifts in generated text responses.

Revisiting Privacy Leakage in Machine Unlearning: Membership Inference Beyond the Forgotten Set
cs.CR 2026-05 unverdicted novelty 7.0

Unlearning increases privacy leakage for the retain set, and a new tri-class membership inference attack distinguishes forget, retain, and unseen data using pre- and post-unlearning model outputs.

On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference
cs.CR 2026-05 conditional novelty 6.0

An attack aligns differently shuffled intermediate activations from secure Transformer inference queries to recover model weights with low error using roughly one dollar of queries.

A Dual Perspective on Synthetic Trajectory Generators: Utility Framework and Privacy Vulnerabilities
cs.AI 2026-04 unverdicted novelty 6.0

A new framework evaluates utility of synthetic mobility trajectories while a membership inference attack reveals privacy vulnerabilities in generative models thought to be safe.

Towards Reliable Testing of Machine Unlearning
cs.LG 2026-04 unverdicted novelty 6.0

Causal fuzzing with budgeted interventions can detect residual direct and indirect influence of unlearned data that standard attribution methods miss due to proxies, cancellations, and masking.

Jellyfish: Zero-Shot Federated Unlearning Scheme with Knowledge Disentanglement
cs.CR 2026-04 unverdicted novelty 6.0

Jellyfish enables zero-shot federated unlearning through synthetic proxy data generation, channel-restricted knowledge disentanglement, and a composite loss with repair to forget target data while retaining model utility.

Forgetting to Witness: Efficient Federated Unlearning and Its Visible Evaluation
cs.LG 2026-04 unverdicted novelty 5.0

A complete pipeline for federated unlearning via knowledge distillation for efficient removal and a GAN-integrated classifier for visual evaluation of forgetting capacity.