Noisy Neighbors: Efficient Membership Inference Attacks Against LLMs
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2025 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Hey, That's My Data! Token-Only Dataset Inference in Large Language Models
  CatShift detects training data membership in LLMs by comparing output shifts induced by fine-tuning on member versus non-member data, relying on catastrophic forgetting without requiring logit access.
- Data Compressibility Quantifies LLM Memorization
  Set-level data entropy estimators show linear correlation with LLM memorization scores, forming the Entropy-Memorization Linearity.