Sequential Membership Inference Attacks

Debabrota Basu; Emilie Kaufmann; Thomas Michel

arXiv: 2602.16596 · v2 · submitted 2026-02-18 · cs.LG · cs.CR · math.ST · stat.ML · stat.TH

Pith reviewed 2026-05-15 21:14 UTC · model grok-4.3

classification: cs.LG · cs.CR · math.ST · stat.ML · stat.TH

keywords: membership inference · sequential attacks · privacy auditing · differential privacy · SGD · model updates

The pith

Accessing the full sequence of model updates yields stronger membership inference attacks than analyzing only the final model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that sequential membership inference attacks, which leverage the history of model updates and a controlled insertion of a target example, can detect membership more effectively than traditional attacks on a single model snapshot. By developing an optimal attack called SeMI* for empirical mean computation, it shows that the power of the attack depends specifically on the statistics just before and after the target's insertion. This approach provides tighter privacy audits for models trained with SGD or DP-SGD by exploiting both control over insertion timing and observations across multiple model versions. Experiments across datasets confirm higher attack success rates compared to snapshot-independent methods.

Core claim

SeMI* is an optimal sequential membership inference attack for empirical mean computation that identifies the presence of a target inserted at a specific step. Its power depends only on the statistics obtained right before and after insertion (the isolation property), and it exceeds the power of attacks that scrutinise only the final model.

What carries the argument

The isolation property of SeMI*, where attack power depends only on statistics obtained right before and after insertion of the target canary.
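
A worked identity may help make the isolation property concrete. The notation here is an assumption for illustration and is not taken from the paper: $x_i$ is the $i$-th inserted example, $\hat\mu_t$ is the empirical mean after $t$ insertions, and $\tau$ is the insertion step. The setting is the noiseless empirical mean only.

```latex
\hat\mu_t = \frac{1}{t}\sum_{i=1}^{t} x_i
\quad\Longrightarrow\quad
\tau\,\hat\mu_\tau - (\tau-1)\,\hat\mu_{\tau-1} = x_\tau .
```

In this simplified setting the two means adjacent to step $\tau$ already determine the inserted point, so the other means in the sequence carry no further information about it. The paper's statement concerns the power of SeMI*, whose exact form is not reproduced on this page.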

If this is right

Practical white-box and black-box SeMI attacks can be developed against (DP-)SGD trained models.
SeMI attacks achieve higher powers than snapshot-independent baselines.
Control over insertion time and observations across the model sequence yield tighter privacy audits.
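
The white-box and black-box ideas in the list above can be sketched on a toy problem. The following is a minimal illustration, not the paper's attack: it uses plain SGD (no clipping or noise, so it is not DP-SGD) on synthetic logistic regression, and every name in it (`sgd_checkpoints`, `black_box_score`, `white_box_score`, `mean_scores`, the alternating-sign canary) is hypothetical. The black-box score is assumed to be the drop in the canary's loss across the insertion step, and the white-box score is assumed to be the inner product between the applied update and the canary's gradient.

```python
import numpy as np

def logistic_loss(theta, x, y):
    """Logistic loss for one example with label y in {-1, +1}."""
    return float(np.logaddexp(0.0, -y * (x @ theta)))

def logistic_grad(theta, x, y):
    """Gradient of logistic_loss with respect to theta."""
    margin = y * (x @ theta)
    return -y * x / (1.0 + np.exp(margin))

def sgd_checkpoints(rng, canary, insert_step, steps, batch_size, lr, member):
    """Run plain SGD on synthetic data; return iterates theta_0..theta_steps."""
    x_c, y_c = canary
    dim = x_c.shape[0]
    w_true = np.ones(dim) / np.sqrt(dim)  # assumed ground-truth direction
    theta = np.zeros(dim)
    checkpoints = [theta.copy()]
    for step in range(1, steps + 1):
        xs = rng.standard_normal((batch_size, dim))
        noise = 0.5 * rng.standard_normal(batch_size)
        ys = np.where(xs @ w_true + noise > 0, 1.0, -1.0)
        if member and step == insert_step:
            xs[0], ys[0] = x_c, y_c  # canary replaces one batch element
        grads = [logistic_grad(theta, x, y) for x, y in zip(xs, ys)]
        theta = theta - lr * np.mean(grads, axis=0)
        checkpoints.append(theta.copy())
    return checkpoints

def black_box_score(checkpoints, canary, insert_step):
    """Loss-only score: drop in the canary's loss across the insertion step."""
    x_c, y_c = canary
    before, after = checkpoints[insert_step - 1], checkpoints[insert_step]
    return logistic_loss(before, x_c, y_c) - logistic_loss(after, x_c, y_c)

def white_box_score(checkpoints, canary, insert_step, lr):
    """Gradient score: alignment of the applied update with the canary gradient."""
    x_c, y_c = canary
    before, after = checkpoints[insert_step - 1], checkpoints[insert_step]
    applied_grad = (before - after) / lr
    return float(applied_grad @ logistic_grad(before, x_c, y_c))

def mean_scores(trials=200, dim=10, insert_step=5, steps=10, batch_size=8, lr=0.5):
    """Average both scores with and without the canary (dim must be even)."""
    canary = (np.tile([1.0, -1.0], dim // 2), 1.0)  # hypothetical canary
    out = {}
    for member in (True, False):
        bb, wb = [], []
        for seed in range(trials):
            rng = np.random.default_rng(seed)  # same data stream in both worlds
            cps = sgd_checkpoints(rng, canary, insert_step, steps, batch_size, lr, member)
            bb.append(black_box_score(cps, canary, insert_step))
            wb.append(white_box_score(cps, canary, insert_step, lr))
        out[member] = (float(np.mean(bb)), float(np.mean(wb)))
    return out

if __name__ == "__main__":
    for member, (bb, wb) in mean_scores().items():
        print(f"member={member}: black-box={bb:.3f}, white-box={wb:.3f}")
```

Both scores read only the two checkpoints adjacent to the insertion step, which mirrors the isolation idea; thresholding them to obtain an actual test, and handling DP-SGD clipping and noise, are left out of this sketch.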

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Dynamic models that update frequently may require new privacy protections beyond static analysis.
Attackers with partial sequence access might still gain advantages if insertion timing can be inferred.
Extending this to other training paradigms could reveal similar sequential vulnerabilities.

Load-bearing premise

The attacker can control the exact insertion time of a target canary and observe the full sequence of model updates.

What would settle it

An experiment where the attacker has no control over when the target is inserted and only observes the final model, checking if the sequential advantage disappears.
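
A toy Monte Carlo version of such a check, restricted to the empirical mean, could look like the sketch below. It is not the paper's experiment. It assumes standard Gaussian data, independent Gaussian noise added to each released mean, and two illustrative scores: one that uses only the means adjacent to a known insertion step, and one that uses only the final mean. All function names and parameter values are hypothetical.

```python
import numpy as np

def released_means(rng, canary, insert_step, total_steps, noise_std, member):
    """Return noisy running means; row t-1 is the mean after t insertions."""
    dim = canary.shape[0]
    data = rng.standard_normal((total_steps, dim))
    if member:
        data[insert_step - 1] = canary  # canary enters at the chosen step
    counts = np.arange(1, total_steps + 1)[:, None]
    means = np.cumsum(data, axis=0) / counts
    return means + noise_std * rng.standard_normal(means.shape)

def sequential_score(means, canary, insert_step):
    """Use only the means right before and after the insertion step (step >= 2)."""
    t = insert_step
    increment = t * means[t - 1] - (t - 1) * means[t - 2]
    return float(increment @ canary)

def final_score(means, canary):
    """Use only the last released mean."""
    return float(means[-1] @ canary)

def auc(member_scores, non_member_scores):
    """Fraction of (member, non-member) pairs ranked correctly."""
    pos = np.asarray(member_scores)[:, None]
    neg = np.asarray(non_member_scores)[None, :]
    return float((pos > neg).mean())

def compare(seed=0, trials=400, dim=20, insert_step=5, total_steps=50, noise_std=0.05):
    """Return (AUC of the sequential score, AUC of the final-only score)."""
    rng = np.random.default_rng(seed)
    canary = rng.standard_normal(dim)
    seq = {True: [], False: []}
    fin = {True: [], False: []}
    for member in (True, False):
        for _ in range(trials):
            means = released_means(rng, canary, insert_step,
                                   total_steps, noise_std, member)
            seq[member].append(sequential_score(means, canary, insert_step))
            fin[member].append(final_score(means, canary))
    return auc(seq[True], seq[False]), auc(fin[True], fin[False])

if __name__ == "__main__":
    auc_seq, auc_final = compare()
    print(f"sequential AUC={auc_seq:.3f}, final-only AUC={auc_final:.3f}")
```

In this toy setting the final-only score has to separate the canary from the accumulated contribution of every other example, while the sequential score faces only one competing example plus release noise. Whether the same gap appears for the paper's attacks when insertion timing is unknown is exactly what the proposed experiment would test.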

Original abstract

Modern AI models are not static. They go through multiple updates in their lifecycles. We propose to design Sequential Membership Inference (SeMI) attacks leading to tighter privacy audits by exploiting the sequence of models and injecting a target canary at a controlled insertion time. First, for empirical mean computation, we develop SeMI*, an optimal SeMI attack to identify the presence of a target inserted at a specific insertion step. We derive the power of SeMI* to show that accessing the model sequence yields more powerful MI attacks than scrutinising only the final model. SeMI* exhibits an isolation property -- its power depends on the statistics obtained right before and after insertion of the target. Leveraging this insight, we develop practical white-box (accessing model gradients) and black-box (accessing loss) SeMI attacks against models trained with (DP-)SGD. Across datasets and models trained with (DP-)SGD, our experiments show that SeMI attacks achieve higher powers than snapshot-independent baselines, and yield tighter privacy audits thanks to (a) control over the insertion time and (b) observations across the model sequence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

SeMI gains power from model sequences only when the attacker controls exact insertion timing, which is a real but narrow condition that limits how much tighter the audits actually get.

The main point is that an attacker who can choose when a target point enters the training stream, and who sees the models right before and after that step, can extract more membership signal than from the final model alone. The authors formalize this for the empirical mean with SeMI*, derive its power, and identify an isolation property that pins the advantage to those two adjacent statistics. They then build white-box (gradient) and black-box (loss) versions for (DP-)SGD and run experiments showing higher attack power than snapshot baselines across datasets and models.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes Sequential Membership Inference (SeMI) attacks that exploit sequences of model updates together with controlled insertion of a target canary at a known step. For the empirical-mean case it derives an optimal SeMI* attack whose power is shown to exceed that of any final-snapshot attack via an isolation property depending only on the pre- and post-insertion statistics; the same insight is used to construct practical white-box (gradient) and black-box (loss) attacks against (DP-)SGD training. Experiments across datasets report higher attack power and therefore tighter privacy audits than snapshot baselines.

Significance. If the derivations and experiments hold, the work supplies a theoretically grounded method for strengthening membership-inference audits of sequentially updated models. The isolation property offers a clean first-principles explanation for when sequence information is useful, and the empirical gains on (DP-)SGD models indicate immediate applicability to privacy auditing of realistic training pipelines.

major comments (2)

[Abstract and threat model] Abstract and threat-model section: the central claim that SeMI yields strictly more powerful attacks and tighter audits rests on the attacker both choosing and knowing the precise insertion step; the isolation-property derivation therefore only produces the reported gain when this control is present. The manuscript should explicitly delineate the threat model in which such control is realistic versus standard black-box update streams where timing is opaque.
[Power derivation and SGD extension] Power derivation for SeMI* (empirical mean) and its extension to (DP-)SGD: the isolation property is stated to depend only on pre- and post-insertion statistics, yet the practical attacks appear to require the attacker to designate the insertion index in advance. If the experiments rely on post-hoc selection of the insertion step, the reported power advantage is conditional and should be re-stated as such to avoid over-generalization.

minor comments (2)

Notation for SeMI versus SeMI* should be introduced once in the introduction and used consistently thereafter.
All power expressions and isolation statements should be numbered and cross-referenced in the text.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and insightful comments. We believe the suggested clarifications will strengthen the manuscript and address them point by point below.

Referee: [Abstract and threat model] Abstract and threat-model section: the central claim that SeMI yields strictly more powerful attacks and tighter audits rests on the attacker both choosing and knowing the precise insertion step; the isolation-property derivation therefore only produces the reported gain when this control is present. The manuscript should explicitly delineate the threat model in which such control is realistic versus standard black-box update streams where timing is opaque.

Authors: We agree with the referee that the threat model requires clearer delineation. Our SeMI framework is intended for auditing scenarios in which the auditor (attacker) can control the insertion time of the canary example, for instance by injecting it at a chosen training step in a controlled environment. This is realistic for privacy audits of (DP-)SGD pipelines, where one can decide the timing to maximize detection power. In contrast, for passive observation of opaque update streams, timing would indeed be unknown and the advantage would not apply. We will revise the abstract and add an explicit paragraph in the threat model section to distinguish these cases and qualify the claims accordingly.

revision: yes

Referee: [Power derivation and SGD extension] Power derivation for SeMI* (empirical mean) and its extension to (DP-)SGD: the isolation property is stated to depend only on pre- and post-insertion statistics, yet the practical attacks appear to require the attacker to designate the insertion index in advance. If the experiments rely on post-hoc selection of the insertion step, the reported power advantage is conditional and should be re-stated as such to avoid over-generalization.

Authors: The isolation property is derived under the assumption that the insertion step is known and designated in advance by the attacker, allowing access to the exact pre- and post-insertion model statistics. Our experiments follow this setup: the insertion step is chosen and known beforehand, not selected post-hoc. We will update the manuscript to explicitly state this assumption in the power derivation, the practical attack descriptions, and the experimental section to prevent any misinterpretation that the gains hold without knowledge of the insertion time.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; SeMI* power derived from explicit isolation math

The paper's central derivation computes the power of SeMI* directly from the difference in pre- and post-insertion statistics for the empirical mean estimator, then extends the same isolation logic to (DP-)SGD gradients and losses. This step is a first-principles calculation that does not rename a fitted quantity as a prediction, invoke a self-citation as the sole justification for uniqueness, or smuggle an ansatz through prior work. The isolation property is stated as a mathematical consequence of the attack construction rather than being presupposed, and the claim that sequence access is strictly stronger follows from subtracting the before/after terms, which is not equivalent to the input data by definition. Experiments are presented only as validation under the controlled-insertion assumption, not as the source of the power formula itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on domain assumptions about sequential model updates via (DP-)SGD and the ability to control insertion timing; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption: Models undergo sequential updates via SGD or DP-SGD.
Invoked when developing practical white-box and black-box attacks against (DP-)SGD trained models.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction

Relation between the paper passage and the cited Recognition theorem: unclear

We derive the optimal MI attack, SeMI*, that uses the sequence of model updates to identify the presence of a target inserted at a certain update step... isolation property -- its power depends on the statistics obtained right before and after insertion of the target.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.