Membership inference attacks against machine learning models

3 Pith papers cite this work.

fields

cs.LG 2 · cs.CL 1

representative citing papers

Score-based Membership Inference on Diffusion Models

cs.LG · 2025-09-29 · unverdicted · novelty 7.0

Presents SimA, a score-based single-query membership inference attack for diffusion models and LDMs that uses denoiser output norm to reveal training set proximity and outperforms multi-query baselines on eight datasets.

Detecting Pretraining Data from Large Language Models

cs.CL · 2023-10-25 · conditional · novelty 7.0

Min-K% Prob detects pretraining data in LLMs by flagging outlier low-probability words in text, achieving a 7.4% improvement over prior methods on the new WikiMIA benchmark.

TOFU: A Task of Fictitious Unlearning for LLMs

cs.LG · 2024-01-11 · conditional · novelty 6.0

TOFU is a new benchmark with synthetic profiles and metrics demonstrating that existing unlearning algorithms for LLMs fail to achieve effective forgetting of targeted information.

citing papers explorer

  • Score-based Membership Inference on Diffusion Models · cs.LG · 2025-09-29 · unverdicted · none · ref 38

  • Detecting Pretraining Data from Large Language Models · cs.CL · 2023-10-25 · conditional · none · ref 53

  • TOFU: A Task of Fictitious Unlearning for LLMs · cs.LG · 2024-01-11 · conditional · none · ref 33
