Quantifying memorization across neural language models

Nicholas Carlini, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramer, Chiyuan Zhang · 2023

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Randomized Antipodal Search Done Right for Data Pareto Improvement of LLM Unlearning

cs.LG · 2026-04-17 · unverdicted · novelty 7.0

RASLIK uses randomized antipodal search on linearized influence kernels to achieve data Pareto improvement in LLM unlearning, outperforming baselines at sublinear complexity with gains in both quality and efficiency.

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

cs.CL · 2024-06-25 · unverdicted · novelty 6.0

FineWeb is a curated 15T-token web dataset that produces stronger LLMs than prior open collections, while its educational subset sharply improves performance on MMLU and ARC benchmarks.

DataComp-LM: In search of the next generation of training sets for language models

cs.LG · 2024-06-17 · unverdicted · novelty 6.0

The DCLM-Baseline dataset lets a 7B model reach 64% 5-shot MMLU accuracy after 2.6T tokens, beating prior open-data models by 6.6 points on MMLU with 40% less compute.

citing papers explorer

Randomized Antipodal Search Done Right for Data Pareto Improvement of LLM Unlearning · cs.LG · 2026-04-17 · unverdicted · none · ref 6
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale · cs.CL · 2024-06-25 · unverdicted · none · ref 54
DataComp-LM: In search of the next generation of training sets for language models · cs.LG · 2024-06-17 · unverdicted · none · ref 37