LLM Benchmark Datasets Should Be Contamination-Resistant

Ali Al-Lawati; Dongwon Lee; Jason Lucas; Suhang Wang

arXiv: 2605.19999 · v1 · pith:BGRULHLZnew · submitted 2026-05-19 · 💻 cs.LG · cs.AI · cs.CR

Pith reviewed 2026-05-20 07:16 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CR

keywords LLM benchmarks · contamination · data leakage · Transformer · evaluation · generalization · unlearnable data

The pith

LLM benchmark datasets should be made contamination-resistant so they remain unlearnable during training yet usable for inference.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper points out that many benchmark datasets for large language models have been included in pretraining corpora, making them contaminated and less effective for measuring true generalization. It argues that benchmarks should instead be contamination-resistant, which means they are unlearnable by the model during training but still allow for effective inference and evaluation. This is accomplished by using the asymmetry in how the Transformer architecture handles training versus inference pipelines. The authors also discuss the properties such datasets should have and mathematical ways to make them compatible with different models. They call for the research community to develop these new kinds of benchmarks and integrate them into evaluation practices to restore reliability in LLM testing.

Core claim

The paper claims that benchmark datasets should be contamination-resistant, i.e., unlearnable, but support inference, achieved by leveraging the asymmetry between inference and training pipelines in the Transformer architecture to prevent contamination while maintaining utility.

What carries the argument

Asymmetry between the inference and training pipelines in the Transformer architecture that enables designing datasets resistant to being learned during pretraining.
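One way to read this asymmetry, offered here as an assumption rather than as the authors' stated construction, is that autoregressive inference can continue from cached key/value tensors alone, whereas the training loss needs the original token IDs as targets. The toy below uses a single attention head in NumPy (all weights and names are hypothetical) and checks that a decode step fed only the cached keys and values reproduces the result of full causal attention over the prompt. It does not show that the cached tensors hide the prompt, which is a separate question.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def causal_attention(X, Wq, Wk, Wv):
    """Full causal self-attention over embeddings X of shape (T, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)  # block attention to later positions
    return softmax(scores) @ V

def decode_step_from_cache(x_new, K_cache, V_cache, Wq, Wk, Wv):
    """One inference step that sees the prompt only through its cached K/V."""
    q = x_new @ Wq
    K = np.vstack([K_cache, x_new @ Wk])
    V = np.vstack([V_cache, x_new @ Wv])
    weights = softmax(q @ K.T / np.sqrt(K.shape[1]))
    return weights @ V

rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))  # hypothetical weights
prompt = rng.normal(size=(5, d))  # stands in for an embedded benchmark question
x_new = rng.normal(size=(d,))     # embedding of the next token to process

# Curation side (assumed): keep only the projected tensors, not the prompt.
K_cache, V_cache = prompt @ Wk, prompt @ Wv

from_cache = decode_step_from_cache(x_new, K_cache, V_cache, Wq, Wk, Wv)
full = causal_attention(np.vstack([prompt, x_new]), Wq, Wk, Wv)[-1]
cache_matches_full = np.allclose(from_cache, full)
```

In a real multi-layer model the cache holds one key/value pair per layer and head, so this single-layer check only illustrates the principle.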

If this is right

Contaminated datasets will lose their ability to discriminate model performance reliably.
New methodologies for creating unlearnable datasets that still support inference will be required.
Mathematical advancements will allow these datasets to work across various LLM architectures.
Adoption of contamination-resistant benchmarks will improve the reproducibility and reliability of LLM evaluations.
Supporting platforms and methods will need to be developed to facilitate their use.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This could encourage more rigorous curation of pretraining data to avoid resistant benchmarks.
Similar principles might apply to other machine learning tasks beyond language models.
It may prompt the creation of standardized tools for generating contamination-resistant evaluation sets.
Researchers could test the limits of this asymmetry in newer model architectures.

Load-bearing premise

The asymmetry between the inference and training pipelines in the Transformer architecture can be leveraged to support contamination-resistance without breaking inference utility.

What would settle it

A demonstration that no practical construction yields datasets that models cannot learn from during training yet can still answer accurately at inference, without loss of utility.

Figures

Figures reproduced from arXiv: 2605.19999 by Ali Al-Lawati, Dongwon Lee, Jason Lucas, Suhang Wang.

**Figure 2.** A contamination-resistant benchmark evaluation framework involves generating multiple projections at curation, and translation to the target model to be evaluated at discovery.

… textual data to achieve this property, it has to be mapped into a latent form to be contamination-resistant. In addition, we define three properties that CRDs must satisfy to achieve contamination resistance with respect to LLM benc…

**Figure 3.** CRDs rely on the asymmetry between the training and inference pipelines in the Transformer architecture.

The resulting outputs are then evaluated against the ground truth provided in the benchmark dataset. Since this ground truth is maintained in plaintext, it allows for seamless integration with standard scoring heuristics (e.g., Exact Match or semantic similarity). 3. Transformer Training/Inference Asymm…
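Because the ground truth stays in plaintext, scoring can be ordinary string comparison. The following is a minimal sketch of an Exact Match scorer; the normalization rules (lowercasing, stripping punctuation, collapsing whitespace) and the function names are assumptions chosen for illustration, not the paper's scoring code.

```python
import re
import string

def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return re.sub(r"\s+", " ", text).strip()

def exact_match(prediction: str, reference: str) -> bool:
    """True when the two strings are identical after normalization."""
    return normalize(prediction) == normalize(reference)

def exact_match_rate(predictions, references) -> float:
    """Fraction of predictions that exactly match their plaintext reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return hits / len(references)

# Hypothetical model outputs and plaintext ground truth.
preds = ["Paris.", " 42", "blue whale", "cat"]
golds = ["paris", "42", "Blue  Whale!", "dog"]
score = exact_match_rate(preds, golds)  # three of four match after normalization
```

A semantic-similarity scorer would replace `exact_match` with an embedding-based comparison, which needs a model and is omitted here.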

**Figure 4.** Approximate storage requirements for CRD projection of various existing benchmarks, calculated from the number of test questions and the average question token size using Llama2-7B and PyramidKV compression.

Limitations. While we provide a general mechanism, its practical applicability may depend on architecture-specific factors. CRDs hinge on LLMs implementing the Transformer architecture, and as a result they …
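The kind of estimate the Figure 4 caption describes can be sketched as simple arithmetic, assuming the stored projection is a key/value cache. The Llama-2-7B shape used below (32 layers, 32 key/value heads, head dimension 128, 16-bit values) is the published model configuration. The benchmark size, the average token count, and the single `retained_fraction` that stands in for PyramidKV compression are hypothetical placeholders, not figures from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KVConfig:
    num_layers: int
    num_kv_heads: int
    head_dim: int
    bytes_per_value: int

# Llama-2-7B attention shape with fp16 storage.
LLAMA2_7B_FP16 = KVConfig(num_layers=32, num_kv_heads=32, head_dim=128, bytes_per_value=2)

def kv_bytes_per_token(cfg: KVConfig) -> int:
    """Bytes for one token's keys and values across all layers and heads."""
    return 2 * cfg.num_layers * cfg.num_kv_heads * cfg.head_dim * cfg.bytes_per_value

def projection_bytes(cfg: KVConfig, num_questions: int, avg_tokens: float,
                     retained_fraction: float = 1.0) -> float:
    """Approximate storage for one benchmark's cached projections.

    retained_fraction is a uniform stand-in for cache compression (assumption);
    real schemes may keep different amounts per layer.
    """
    if not 0.0 < retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be in (0, 1]")
    return num_questions * avg_tokens * kv_bytes_per_token(cfg) * retained_fraction

per_token = kv_bytes_per_token(LLAMA2_7B_FP16)  # 524288 bytes, i.e. 0.5 MiB

# Hypothetical benchmark: 1,000 questions of 100 tokens, keeping 25% of the cache.
example_gib = projection_bytes(LLAMA2_7B_FP16, 1000, 100, 0.25) / 2**30
```

Under these placeholder numbers the projection takes roughly 12.2 GiB for a single target model, which suggests why compression matters for this approach.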

Original abstract

Benchmark datasets are critical for reproducible, reliable, and discriminative evaluation of LLMs. However, recent studies reveal that many benchmark datasets are included in pretraining corpora, i.e., $\textit{contaminated}$, which diminishes their value as reliable measures of model generalization. In this paper, we argue that benchmark datasets should be $\textit{contamination-resistant}$, i.e., $\textit{unlearnable}$, but support $\textit{inference}$. To accomplish this, we first highlight the wide prevalence of benchmark dataset contamination and outline the properties of contamination-resistant datasets. Second, we highlight how the asymmetry between the inference and training pipelines in the Transformer architecture can be leveraged to support contamination-resistance. Third, we outline mathematical advancements to make these datasets interoperable across various LLM architectures. Based on the above, we call on the community to ensure the reliability of LLM benchmarking by: (i) advancing novel contamination-resistant methodologies, (ii) developing supporting methods and platforms, and (iii) adopting contamination-resistant benchmarks into existing evaluation pipelines.
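The abstract does not say which mathematics it has in mind for interoperability. One classical tool for mapping representations from one model's space into another's is orthogonal Procrustes alignment, shown below as a hedged illustration. The sketch assumes both models share the same hidden dimension and that paired anchor representations are available; the data here is synthetic, and nothing in it should be read as the paper's construction.

```python
import numpy as np

def orthogonal_procrustes(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Return the orthogonal matrix R that minimizes ||A @ R - B||_F."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

rng = np.random.default_rng(0)
n, d = 200, 16

# Synthetic anchor representations from a hypothetical source model.
A = rng.normal(size=(n, d))

# Hypothetical target model: a rotated copy of the source space plus small noise.
true_R, _ = np.linalg.qr(rng.normal(size=(d, d)))
B = A @ true_R + 0.01 * rng.normal(size=(n, d))

R = orthogonal_procrustes(A, B)
relative_error = np.linalg.norm(A @ R - B) / np.linalg.norm(B)
is_orthogonal = np.allclose(R.T @ R, np.eye(d), atol=1e-8)
```

Real model pairs often differ in width and are not related by a pure rotation, so an orthogonal map is at best a first approximation.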

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a short position paper flagging benchmark contamination and calling for resistant designs that exploit Transformer training-inference asymmetry, but it supplies no new methods, math, or tests.

The core message is straightforward: many LLM benchmarks are already in pretraining data, so we should build new ones that models cannot learn from during training yet still support normal inference. The authors point to the known asymmetry in how Transformers process sequences at train time versus test time as a possible lever, and they list some high-level properties such a dataset would need. They close with a community call to develop the actual techniques and adopt them in evaluations. That framing is clear and the contamination problem they cite is real and already discussed in earlier papers. The piece does a decent job of naming the desired resistance properties in one place and tying them to an architectural observation that others have noted before.

Beyond that, there is little new technical content. The abstract mentions outlining mathematical advancements for interoperability, yet the text stays at the level of suggestions and future-work items rather than delivering any concrete construction, proof sketch, or small-scale experiment. No dataset is proposed, no loss modification is written down, and no empirical check is shown. The central assumption, that the asymmetry can be used without damaging inference utility, remains unexamined here. A reader looking for a worked example or even a toy implementation will come away empty.

This kind of note is mainly useful to people who already work on LLM evaluation pipelines and want a concise reminder of the contamination issue plus a pointer toward one possible direction. It is not a methods paper and does not contain results that would change day-to-day benchmarking practice. I would not bring it to a reading group focused on new techniques, and I would not cite it for any technical claim. If the venue accepts short position or vision pieces, a light review might be reasonable to check whether the asymmetry idea can be sharpened; otherwise it is closer to a desk-level note than a full referee process.

Referee Report

1 major / 2 minor

Summary. The manuscript argues that LLM benchmark datasets should be made contamination-resistant—unlearnable during pretraining but still useful for inference—by exploiting the asymmetry between the training and inference pipelines in Transformer models. It reviews the prevalence of contamination, describes desired properties of such datasets, sketches how architectural asymmetry might enable this, outlines needs for mathematical advancements to ensure interoperability, and calls for community action to develop and adopt these benchmarks.

Significance. The problem of benchmark contamination is real and undermines the validity of LLM evaluations. If concrete methods for creating contamination-resistant datasets can be developed as suggested, this would represent a major advance in ensuring reliable and reproducible assessment of model capabilities. The paper's normative stance and high-level roadmap are timely and could help direct research efforts toward solving this issue.

major comments (1)

The central proposal relies on leveraging the asymmetry between inference (autoregressive, causal) and training (bidirectional or full attention) pipelines to make data unlearnable yet inferable. However, no specific mechanism, such as a modified loss, data encoding, or architectural constraint, is provided to realize this, leaving the feasibility of the approach unaddressed.

minor comments (2)

The manuscript would benefit from including references to specific studies on contamination to strengthen the prevalence claim.
Clarify the exact definition of 'unlearnable' in mathematical terms in the properties section.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for recognizing the significance of benchmark contamination and for the constructive feedback. We address the major comment below and will revise the manuscript to strengthen the discussion of feasibility.

Referee: The central proposal relies on leveraging the asymmetry between inference (autoregressive, causal) and training (bidirectional or full attention) pipelines to make data unlearnable yet inferable. However, no specific mechanism, such as a modified loss, data encoding, or architectural constraint, is provided to realize this, leaving the feasibility of the approach unaddressed.

Authors: We agree that the manuscript offers only a high-level sketch of how Transformer training-inference asymmetry could support contamination resistance, without detailing a concrete mechanism such as a modified loss function or specific data encoding. The paper is positioned as a call to action and research roadmap rather than a complete technical solution. To address this, we will revise the relevant sections to include preliminary examples of potential mechanisms (e.g., attention-pattern constraints or encodings that are hard to optimize under full attention but support autoregressive decoding) and explicitly discuss open feasibility questions and required mathematical advances for cross-architecture use. revision: yes

Circularity Check

0 steps flagged

No significant circularity; position paper without derivations or fitted results

The paper is an advocacy piece that identifies benchmark contamination as a problem and calls for contamination-resistant designs leveraging Transformer inference-training asymmetry. It contains no equations, proofs, fitted parameters, or closed-form derivations that could reduce to their own inputs by construction. The central claims are normative (benchmarks should be made unlearnable yet inference-supporting) rather than technical results whose correctness depends on self-citation chains or self-definitional steps. All outlined properties and mathematical advancements are presented as future work directions, not as completed constructions internal to the paper. The derivation chain is therefore empty and self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a position paper. No free parameters, axioms, or invented entities are specified in the abstract; the argument rests on the unproven feasibility of creating unlearnable-yet-inferable datasets.

pith-pipeline@v0.9.0 · 5708 in / 1051 out tokens · 55246 ms · 2026-05-20T07:16:04.472567+00:00 · methodology

Reference graph

Works this paper leans on

113 extracted references · 113 canonical work pages · 4 internal anchors

[1]

Position: The Most Expensive Part of an

Kandpal, Nikhil and Raffel, Colin , booktitle =. Position: The Most Expensive Part of an. 2025 , publisher =

work page 2025
[2]

Proceedings of the 42nd International Conference on Machine Learning , series =

Position: Human Baselines in Model Evaluations Need Rigor and Transparency (With Recommendations & Reporting Checklist) , author =. Proceedings of the 42nd International Conference on Machine Learning , series =. 2025 , publisher =

work page 2025
[3]

Position:

Hazra, Sanchaita and Majumder, Bodhisattwa Prasad and Chakrabarty, Tuhin , booktitle =. Position:. 2025 , publisher =

work page 2025
[4]

Toward Generalizable Evaluation in the

Cao, Yixin and Hong, Shibo and Li, Xinze and Ying, Jiahao and Ma, Yubo and Liang, Haiyuan and Liu, Yantao and Yao, Zijun and Wang, Xiaozhi and Huang, Dan and others , journal =. Toward Generalizable Evaluation in the. 2025 , url =

work page 2025
[5]

2025 , publisher =

Chen, Simin and Pusarla, Pranav and Ray, Baishakhi , booktitle =. 2025 , publisher =

work page 2025
[6]

arXiv preprint arXiv:2502.17521 , year =

Recent Advances in Large Language Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation , author =. arXiv preprint arXiv:2502.17521 , year =

work page arXiv
[7]

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers) , pages =

Investigating Data Contamination in Modern Benchmarks for Large Language Models , author =. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers) , pages =. 2024 , url =

work page 2024
[8]

Findings of the Association for Computational Linguistics: ACL 2024 , pages =

Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models , author =. Findings of the Association for Computational Linguistics: ACL 2024 , pages =. 2024 , url =

work page 2024
[9]

arXiv preprint arXiv:2501.06164 , year =

Model Alignment Search , author =. arXiv preprint arXiv:2501.06164 , year =

work page arXiv
[10]

Proceedings of the 36th International Conference on Machine Learning , volume =

Similarity of Neural Network Representations Revisited , author =. Proceedings of the 36th International Conference on Machine Learning , volume =. 2019 , publisher =

work page 2019
[11]

2025 , url =

Data obfuscation through latent space projection for privacy-preserving AI governance: Case studies in medical diagnosis and finance fraud detection , author=. 2025 , url =

work page 2025
[12]

Findings of the Association for Computational Linguistics: EMNLP 2024 , pages =

An Open-Source Data Contamination Report for Large Language Models , author =. Findings of the Association for Computational Linguistics: EMNLP 2024 , pages =. 2024 , url =

work page 2024
[13]

First Conference on Language Modeling , year=

Mamba: Linear-Time Sequence Modeling with Selective State Spaces , author=. First Conference on Language Modeling , year=

work page
[14]

Proceedings of the 42nd International Conference on Machine Learning , series =

Position: Machine Learning Models Have a Supply Chain Problem , author =. Proceedings of the 42nd International Conference on Machine Learning , series =. 2025 , publisher =

work page 2025
[15]

arXiv preprint arXiv:2505.08389 , year =

Towards Contamination Resistant Benchmarks , author =. arXiv preprint arXiv:2505.08389 , year =

work page arXiv
[16]

Proceedings of the AAAI Conference on Artificial Intelligence , volume=

Training on the benchmark is not all you need , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=. 2025 , url =

work page 2025
[17]

arXiv preprint arXiv:2510.05962 , year =

O'Brien, Dayy. arXiv preprint arXiv:2510.05962 , year =

work page arXiv
[18]

2024 , url =

Rajore, Tanmay and Chandran, Nishanth and Sitaram, Sunayana and Gupta, Divya and Sharma, Rahul and Mittal, Kashish and Swaminathan, Manohar , journal =. 2024 , url =

work page 2024
[19]

Proceedings of the 42nd International Conference on Machine Learning , series =

Position: Theory of Mind Benchmarks Are Broken for Large Language Models , author =. Proceedings of the 42nd International Conference on Machine Learning , series =. 2025 , publisher =

work page 2025
[20]

Proceedings of the 31st International Conference on Computational Linguistics , pages =

Towards Data Contamination Detection for Modern Large Language Models: Limitations, Inconsistencies, and Oracle Challenges , author =. Proceedings of the 31st International Conference on Computational Linguistics , pages =. 2025 , url =

work page 2025
[21]

and Kocyigit, Muhammed Yusuf and Poulton, Andrew and Esiobu, David and Lomeli, Maria and Szilvasy, Gergely and Hupkes, Dieuwke , journal =

Singh, Aaditya K. and Kocyigit, Muhammed Yusuf and Poulton, Andrew and Esiobu, David and Lomeli, Maria and Szilvasy, Gergely and Hupkes, Dieuwke , journal =. Evaluation Data Contamination in. 2024 , url =

work page 2024
[22]

The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for

Sun, Yifan and Wang, Han and Li, Dongbai and Wang, Gang and Zhang, Huan , booktitle =. The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for

work page
[23]

arXiv preprint arXiv:2507.16514 , year =

The Ever-Evolving Science Exam , author =. arXiv preprint arXiv:2507.16514 , year =

work page arXiv
[24]

2025 , url =

Wu, Xiaobao and Pan, Liangming and Xie, Yuxi and Zhou, Ruiwen and Zhao, Shuai and Ma, Yubo and Du, Mingzhe and Mao, Rui and Tuan, Luu Anh and Wang, William Yang , booktitle =. 2025 , url =

work page 2025
[25]

2025 , url =

Wu, Changti and Lian, Shijie and Liu, Zihao and Zhang, Lei and Yang, Laurence Tianruo and Chen, Kai , journal =. 2025 , url =

work page 2025
[26]

Xia, Feifan and Liao, Mingyang and Fang, Yuyang and Li, Defang and Xie, Yantong and Li, Weikang and Li, Yang and Xia, Deguo and Huang, Jizhou , journal =. Cross-. 2025 , url =

work page 2025
[27]

arXiv preprint arXiv:2406.04244 , year =

Benchmark Data Contamination of Large Language Models: A Survey , author =. arXiv preprint arXiv:2406.04244 , year =

work page arXiv
[28]

Advances in Neural Information Processing Systems 37 (NeurIPS 2024) , year =

Automating Dataset Updates Towards Reliable and Timely Evaluation of Large Language Models , author =. Advances in Neural Information Processing Systems 37 (NeurIPS 2024) , year =

work page 2024
[29]

Proceedings of the 42nd International Conference on Machine Learning , series =

Position: Editing Large Language Models Poses Serious Safety Risks , author =. Proceedings of the 42nd International Conference on Machine Learning , series =. 2025 , publisher =

work page 2025
[30]

Forty-second International Conference on Machine Learning Position Paper Track , year=

Position: Language model developers should report train-test overlap , author=. Forty-second International Conference on Machine Learning Position Paper Track , year=

work page
[31]

2025 , url =

Zhao, Jingqian and Wang, Bingbing and Tu, Geng and Zhang, Yice and Wang, Qianlong and Liang, Bin and Li, Jing and Xu, Ruifeng , booktitle =. 2025 , url =

work page 2025
[32]

arXiv preprint arXiv:2509.24771 , year=

Latentevolve: Self-evolving test-time scaling in latent space , author=. arXiv preprint arXiv:2509.24771 , year=

work page arXiv
[33]

International Conference on Learning Representations , volume=

Latent space chain-of-embedding enables output-free llm self-evaluation , author=. International Conference on Learning Representations , volume=. 2025 , url=

work page 2025
[34]

Does Data Contamination Detection Work (Well) for

Fu, Yujuan and Uzuner, Ozlem and Yetisgen, Meliha and Xia, Fei , booktitle =. Does Data Contamination Detection Work (Well) for. 2025 , url =

work page 2025
[35]

2025 , url =

Jain, Naman and Han, King and Gu, Alex and Li, Wen-Ding and Yan, Fanjia and Zhang, Tianjun and Wang, Sida and Solar-Lezama, Armando and Sen, Koushik and Stoica, Ion , booktitle =. 2025 , url =

work page 2025
[36]

Proceedings of the 41st International Conference on Machine Learning , series =

Position: The Platonic Representation Hypothesis , author =. Proceedings of the 41st International Conference on Machine Learning , series =. 2024 , publisher =

work page 2024
[37]

Advances in Neural Information Processing Systems , volume =

Revisiting Model Stitching to Compare Neural Representations , author =. Advances in Neural Information Processing Systems , volume =

work page
[38]

Psychometrika , volume =

A Generalized Solution of the Orthogonal Procrustes Problem , author =. Psychometrika , volume =. 1966 , publisher =

work page 1966
[39]

Wang, Runqian and Ghosh, Soumya and Cox, David and Antognini, Diego and Oliva, Aude and Feris, Rogerio and Karlinsky, Leonid , booktitle =. Trans-. 2024 , pages=

work page 2024
[40]

2025 , url =

Farhadzadeh, Farzad and Das, Debasmit and Borse, Shubhankar and Porikli, Fatih , booktitle =. 2025 , url =

work page 2025
[41]

and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu , booktitle =

Hu, Edward J. and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu , booktitle =. 2022 , url =

work page 2022
[42]

International Conference on Learning Representations (ICLR) , year =

Relative Representations Enable Zero-Shot Latent Space Communication , author =. International Conference on Learning Representations (ICLR) , year =

work page
[43]

Advances in Neural Information Processing Systems (NeurIPS) , year =

Latent Space Translation via Semantic Alignment , author =. Advances in Neural Information Processing Systems (NeurIPS) , year =

work page
[44]

Advances in Neural Information Processing Systems , volume =

Norelli, Antonio and Fumero, Marco and Maiorca, Valentino and Moschella, Luca and Rodol. Advances in Neural Information Processing Systems , volume =. 2023 , publisher =

work page 2023
[45]

Manifold Alignment Using

Wang, Chang and Mahadevan, Sridhar , booktitle =. Manifold Alignment Using. 2008 , organization =

work page 2008
[46]

Advances in Neural Information Processing Systems , author =

Hyperbolic Procrustes Analysis Using. Advances in Neural Information Processing Systems , author =. 2021 , volume =

work page 2021
[47]

International Conference on Learning Representations (ICLR) , year =

Offline Bilingual Word Vectors, Orthogonal Transformations and the Inverted Softmax , author =. International Conference on Learning Representations (ICLR) , year =

work page
[48]

Proceedings of the 31st International Conference on Neural Information Processing Systems , volume =

Raghu, Maithra and Gilmer, Justin and Yosinski, Jason and Sohl-Dickstein, Jascha , title =. Proceedings of the 31st International Conference on Neural Information Processing Systems , volume =. 2017 , isbn =

work page 2017
[49]

Psychometrika , volume =

The Approximation of One Matrix by Another of Lower Rank , author =. Psychometrika , volume =. 1936 , publisher =

work page 1936
[50]

The Quarterly Journal of Mathematics , volume =

Symmetric Gauge Functions and Unitarily Invariant Norms , author =. The Quarterly Journal of Mathematics , volume =. 1960 , publisher =

work page 1960
[51]

Psychometrika , volume =

Generalized Procrustes Analysis , author =. Psychometrika , volume =. 1975 , publisher =

work page 1975
[52]

Findings of the Association for Computational Linguistics: NAACL 2024 , pages =

Large Language Models Sensitivity to the Order of Options in Multiple-Choice Questions , author =. Findings of the Association for Computational Linguistics: NAACL 2024 , pages =. 2024 , url =

work page 2024
[53]

International Conference on Machine Learning , pages=

Lever: Learning to verify language-to-code generation with execution , author=. International Conference on Machine Learning , pages=. 2023 , organization=

work page 2023
[54]

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) , pages =

Data Contamination: From Memorization to Exploitation , author =. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) , pages =. 2022 , address =

work page 2022
[55]

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing , pages =

Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus , author =. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing , pages =. 2021 , url =

work page 2021
[56]

Advances in Neural Information Processing Systems , volume =

Language Models Are Few-Shot Learners , author =. Advances in Neural Information Processing Systems , volume =

work page
[57]

Llama 2: Open Foundation and Fine-Tuned Chat Models

Llama 2: Open Foundation and Fine-Tuned Chat Models , author =. arXiv preprint arXiv:2307.09288 , year =

work page internal anchor Pith review Pith/arXiv arXiv
[58]

Advances in Neural Information Processing Systems , volume =

Attention Is All You Need , author =. Advances in Neural Information Processing Systems , volume =

work page
[59]

SIAM Journal on Computing , volume =

The Knowledge Complexity of Interactive Proof Systems , author =. SIAM Journal on Computing , volume =. 1989 , organization =

work page 1989
[60]

Annual Cryptology Conference , pages =

Non-Interactive Verifiable Computing: Outsourcing Computation to Untrusted Workers , author =. Annual Cryptology Conference , pages =. 2010 , organization =

work page 2010
[61]

arXiv preprint arXiv:2503.23536 , year =

A Survey on Unlearnable Data , author =. arXiv preprint arXiv:2503.23536 , year =

work page arXiv
[62]

Proceedings of the Interna- tional Conference on Learning Representations (ICLR) , year =

Unlearnable Examples: Making Personal Data Unexploitable , author =. Proceedings of the Interna- tional Conference on Learning Representations (ICLR) , year =

work page
[63]

Advances in Neural Information Processing Systems , volume =

Autoregressive Perturbations for Data Poisoning , author =. Advances in Neural Information Processing Systems , volume =

work page
[64]

Advances in Neural Information Processing Systems , volume =

Adversarial Examples Make Strong Poisons , author =. Advances in Neural Information Processing Systems , volume =

work page
[65]

International Conference on Learning Representations (ICLR) , year =

Language Model Inversion , author =. International Conference on Learning Representations (ICLR) , year =

work page
[66]

Safeguarding

Jin, Shuaifan and Pang, Xiaoyi and Wang, Zhibo and Wang, He and Du, Jiacheng and Hu, Jiahui and Ren, Kui , journal =. Safeguarding. 2025 , url =

work page 2025
[67]

Network and Distributed System Security Symposium , year=

Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference , author=. Network and Distributed System Security Symposium , year=

work page
[68]

Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security , pages =

Layer-Wise Noise Injection for Privacy-Preserving Large Language Models , author =. Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security , pages =. 2024 , organization =

work page 2024
[69]

2022 , booktitle =

Differentially Private Fine-Tuning of Language Models , author =. 2022 , booktitle =

work page 2022
[70]

Findings of the Association for Computational Linguistics: ACL 2023 , pages =

Membership Inference Attacks against Language Models via Neighbourhood Comparison , author =. Findings of the Association for Computational Linguistics: ACL 2023 , pages =. 2023 , url =

work page 2023
[71]

30th USENIX Security Symposium (USENIX Security 21) , pages =

Extracting Training Data from Large Language Models , author =. 30th USENIX Security Symposium (USENIX Security 21) , pages =. 2021 , url =

work page 2021
[72]

Examining the Robustness of

Siska, Charlotte and Marazopoulou, Katerina and Ailem, Melissa and Bono, James , booktitle =. Examining the Robustness of. 2024 , url =

work page 2024
[73]

Proceedings of the 41st International Conference on Machine Learning , series =

Stealing Part of a Production Language Model , author =. Proceedings of the 41st International Conference on Machine Learning , series =. 2024 , publisher =

work page 2024
[74]

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume , pages =

Adversarial Stylometry in the Wild: Transferable Lexical Substitution Attacks on Author Profiling , author =. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume , pages =. 2021 , url =

work page 2021
[75]

Findings of the association for computational linguistics: ACL 2024 , pages=

A comprehensive evaluation of quantization strategies for large language models , author=. Findings of the association for computational linguistics: ACL 2024 , pages=. 2024 , url =

work page 2024
[76]

International Conference on Learning Representations (ICLR) , year =

A Benchmark for Learning to Translate a New Language from One Grammar Book , author =. International Conference on Learning Representations (ICLR) , year =

work page
[77]

arXiv preprint arXiv:2410.16186 , year =

Contamination Report for Multilingual Benchmarks , author =. arXiv preprint arXiv:2410.16186 , year =

work page arXiv
[78]

Advances in Neural Information Processing Systems , volume=

A careful examination of large language model performance on grade school arithmetic , author=. Advances in Neural Information Processing Systems , volume=

work page
[79]

Mistral 7B

Mistral 7B , author =. arXiv preprint arXiv:2310.06825 , year =

work page internal anchor Pith review Pith/arXiv arXiv
[80]

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing , pages=

Calibrating language models with adaptive temperature scaling , author=. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing , pages=. 2024 , url=

work page 2024
