pith. sign in

Title resolution pending

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

fields

cs.CL 3

years

2025 2 2023 1

representative citing papers

Detecting Pretraining Data from Large Language Models

cs.CL · 2023-10-25 · conditional · novelty 7.0

Min-K% Prob detects pretraining data in LLMs by flagging outlier low-probability words in text, achieving 7.4% better performance than prior methods on the new WIKIMIA benchmark.

Towards the Anonymization of the Language Modeling

cs.CL · 2025-01-05 · unverdicted · novelty 4.0

Authors introduce MLM and CLM specialization methods that avoid memorizing identifiers in sensitive training data while aiming for a privacy-utility tradeoff on medical datasets.

citing papers explorer

Showing 3 of 3 citing papers.

  • Detecting Pretraining Data from Large Language Models cs.CL · 2023-10-25 · conditional · none · ref 100

    Min-K% Prob detects pretraining data in LLMs by flagging outlier low-probability words in text, achieving 7.4% better performance than prior methods on the new WIKIMIA benchmark.

  • Data Compressibility Quantifies LLM Memorization cs.CL · 2025-07-08 · unverdicted · none · ref 53

    Set-level data entropy estimators show linear correlation with LLM memorization scores, forming the Entropy-Memorization Linearity.

  • Towards the Anonymization of the Language Modeling cs.CL · 2025-01-05 · unverdicted · none · ref 29

    Authors introduce MLM and CLM specialization methods that avoid memorizing identifiers in sensitive training data while aiming for a privacy-utility tradeoff on medical datasets.