Lossless Anti-Distillation Sampling

Di He; Jingchu Gai; Xinyue Ai; Zhang Zhang; Zhenyu He; Zibo Diao

arxiv: 2605.18829 · v1 · pith:K6XP26PGnew · submitted 2026-05-12 · 💻 cs.LG · cs.CR


Pith reviewed 2026-05-20 21:35 UTC · model grok-4.3


keywords: anti-distillation · sampling method · generative models · model distillation defense · uniform convergence · generalization gap · semantic bucketing


The pith

Tying random seeds to query semantics correlates distillation data and slows student convergence while leaving single users unaffected.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Lossless Anti-Distillation Sampling to defend against distillation attacks on generative models. It works by making the random seed for each generation depend on the semantic content of the query and the user's query count. Benign users still get fresh independent samples each time. Distillers who spread queries across accounts end up with correlated samples in the same semantic groups. Theory based on uniform convergence shows this correlation slows the rate at which the distilled model's generalization gap shrinks. Experiments on images, math, and code back this up.

Core claim

Lossless Anti-Distillation Sampling derives the randomness for each generation from a private seed based on query semantics and query frequency. This ensures independent sampling for individual users but induces correlation across accounts that harvest similar queries, thereby degrading the sample diversity and generalization of the distilled model as proven via uniform convergence bounds in both unconditional and conditional settings.

What carries the argument

A private seed, determined by the query's semantic content and the user's query count, that controls the randomness of each generation.
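A minimal sketch of how such a seed could be derived, assuming a keyed hash (HMAC-SHA256) over the bucket identifier and the user's query count. The key, the function names, the string bucket identifiers, and the use of `random.Random` as a stand-in generator are all illustrative assumptions; the summary above does not specify the paper's actual construction, nor whether the count is kept per bucket or per user overall.

```python
import hashlib
import hmac
import random

# Hypothetical provider-side secret; not taken from the paper.
SECRET_KEY = b"provider-private-key"


def derive_seed(bucket_id: str, query_count: int, key: bytes = SECRET_KEY) -> int:
    """Map (semantic bucket, user's query count) to a 64-bit seed via HMAC-SHA256."""
    message = f"{bucket_id}|{query_count}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def sample(bucket_id: str, query_count: int) -> float:
    """Stand-in for the generator: draw one value using the derived seed."""
    rng = random.Random(derive_seed(bucket_id, query_count))
    return rng.random()


# One user visiting the same bucket three times: the count changes, so the seed changes.
single_user = [sample("bucket-17", t) for t in range(3)]

# Three accounts, each on its first visit to the same bucket: the seed is shared.
multi_account = [sample("bucket-17", 0) for _account in range(3)]
```

Note that the account identity never enters the hash input in this sketch. That omission is what makes draws repeat across accounts while a single user's successive draws stay distinct.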

If this is right

LADS maintains exact statistical fidelity for benign single-account users.
The harvested dataset for distillers becomes less diverse due to repeated seeds in semantic buckets.
Uniform convergence theory predicts a slower convergence rate for the distiller's generalization gap compared to i.i.d. sampling.
Empirical results show degraded performance in distilled models for image generation, mathematical reasoning, and code generation.
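The diversity loss in the second point can be illustrated with a toy count. The sketch below assumes a worst case for the distiller: every account walks the same prompt list in the same order, and the seed key is the pair (bucket, query count) with no account identity. The function name and parameters are hypothetical.

```python
from typing import Tuple


def count_distinct_draws(
    num_accounts: int, queries_per_account: int, num_buckets: int
) -> Tuple[int, int]:
    """Return (total responses harvested, number of distinct seed keys used)."""
    total = 0
    seed_keys = set()
    for _account in range(num_accounts):
        for t in range(queries_per_account):
            bucket = t % num_buckets    # assumption: all accounts share one prompt list
            seed_keys.add((bucket, t))  # key ignores which account is asking
            total += 1
    return total, len(seed_keys)


total, distinct = count_distinct_draws(
    num_accounts=50, queries_per_account=20, num_buckets=5
)
print(total, distinct)  # prints: 1000 20
```

Under these assumptions, 50 accounts harvest 1000 responses but only 20 distinct seed keys, whereas i.i.d. sampling would give 1000 independent draws.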

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If semantic bucketing can be made robust, similar seed-sharing could apply to other data-harvesting threats like model extraction.
Distillers might try to evade by paraphrasing queries heavily, but this could reduce the effectiveness of their own training data.
Extending to conditional generation suggests applications in text or multimodal models where semantics are clearer.

Load-bearing premise

Queries can be reliably grouped into semantic buckets so that similar queries from different accounts share the same seed and this correlation measurably harms the distilled model's learning.

What would settle it

A distiller using many accounts to query semantically similar prompts and training a student model that achieves generalization performance comparable to one trained on standard i.i.d. samples from the original model.

Figures

Figures reproduced from arXiv: 2605.18829 by Di He, Jingchu Gai, Xinyue Ai, Zhang Zhang, Zhenyu He, Zibo Diao.

Original abstract

Frontier commercial generative models face a growing threat from distillation, whereby a distiller harvests generated responses and trains a competing model of its own at drastically lower cost. Existing defenses either rely on modifying the model's outputs, thereby sacrificing response quality for benign users, or on behavioral detection methods, which can be readily circumvented by distributing queries across multiple accounts. In this work, we propose Lossless Anti-Distillation Sampling (LADS), a novel sampling scheme specifically designed to counter multi-account distillation while maintaining a lossless experience for benign users. Concretely, LADS derives the randomness underlying each generation from a private seed determined by the semantic content of the query and the number of times the user has queried the model. By construction, every benign user receives a response independently sampled from the original model at each visit, and thus experiences no distortion. In contrast, for a distiller, different accounts share latent randomness whenever their queries fall in the same semantic bucket. As a result, the harvested data becomes correlated, potentially reducing sample diversity and degrading generalization. Using uniform convergence theory, we show that LADS provably degrades the convergence rate of the distiller's generalization gap relative to standard i.i.d. sampling in both unconditional and conditional generation settings. Experiments on image generation, mathematical reasoning, and code generation confirm that LADS substantially degrades the performance of distilled students while preserving exact statistical fidelity for individual users.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

LADS ties sampling randomness to semantic buckets and per-user query counts to correlate outputs for multi-account distillers while leaving single users untouched, but the uniform convergence claim needs explicit dependence handling to support a slower rate.


There are two things to know about this paper: it proposes a sampling method that degrades distillation by sharing randomness across similar queries from different accounts, and it claims that this provably slows the distiller's convergence under uniform convergence theory. What is new is the concrete mechanism: queries are mapped to semantic buckets, and the bucket is combined with the per-user query count to set the latent seed for sampling. Individual users therefore always get independent samples from the model, which preserves exact statistical behavior for them. For a distiller using multiple accounts, queries in the same bucket reuse the same randomness, leading to correlated outputs and less diverse training data.

The paper does well in framing a real commercial issue and offering a defense that avoids the usual quality trade-off. The experiments across image generation, mathematical reasoning, and code generation provide evidence that the distilled models suffer performance drops, which supports the practical value.

The soft spots are mainly in the theoretical argument. The claim is that uniform convergence shows a degraded convergence rate for the generalization gap under this correlated sampling. However, standard uniform convergence bounds rely on independent and identically distributed samples. Here, the shared seeds introduce positive dependence, so the usual O(1/√n) bound does not directly yield a slower rate unless the dependence is quantified, for example through an effective sample size or β-mixing conditions. If the full paper includes such an analysis, the argument would be solid; judging from the abstract alone, it appears to apply the bound directly, which could be a gap. The experimental confirmation of degradation is helpful, but more detail on bucket definition, statistical tests, and controls for other factors would help rule out post-hoc effects.

This work is for researchers and practitioners in machine learning security and deployment, especially those concerned with protecting proprietary models from distillation attacks. A reader interested in novel sampling strategies or defenses against model extraction would get value from the idea and the empirical results. It deserves a serious referee because the core idea is novel enough and the problem is timely, even if revisions are needed on the theory. I recommend sending it for peer review, with reviewers asked to focus on the handling of dependent samples in the proofs.
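The effective-sample-size concern raised in the letter can be made concrete with a small simulation. The sketch below assumes the extreme case where every response in a cluster is an exact copy of one shared draw (intra-cluster correlation 1), which is a stand-in for shared seeds and not the paper's actual dependence structure. Under the standard design-effect formula n_eff = n / (1 + (m − 1)r), clusters of size m = 10 with r = 1 should inflate the variance of the sample mean roughly tenfold. Function names and parameters are illustrative.

```python
import random
import statistics


def mean_of_clustered_sample(rng: random.Random, num_clusters: int, cluster_size: int) -> float:
    """Average n = num_clusters * cluster_size values; each cluster repeats one Gaussian draw."""
    values = []
    for _ in range(num_clusters):
        shared = rng.gauss(0.0, 1.0)
        values.extend([shared] * cluster_size)
    return statistics.fmean(values)


def variance_of_mean(num_clusters: int, cluster_size: int, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the variance of the sample mean."""
    rng = random.Random(seed)
    means = [
        mean_of_clustered_sample(rng, num_clusters, cluster_size)
        for _ in range(trials)
    ]
    return statistics.pvariance(means)


n = 200
var_iid = variance_of_mean(num_clusters=n, cluster_size=1)               # theory: 1/200
var_clustered = variance_of_mean(num_clusters=n // 10, cluster_size=10)  # theory: 1/20
print(f"i.i.d.: {var_iid:.4f}  clustered: {var_clustered:.4f}")
```

Both runs use n = 200 values, yet the clustered run behaves like a sample of about 20. Whether this amounts to a slower rate or only a worse constant depends on how the cluster size scales with n, which is the question the letter asks reviewers to probe.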

Referee Report

2 major / 2 minor

Summary. The manuscript proposes Lossless Anti-Distillation Sampling (LADS), which derives generation randomness from a private seed based on query semantics and per-user query count. Benign users receive independent samples with no quality loss, while distillers using multiple accounts obtain positively correlated samples within semantic buckets. The central theoretical claim is that uniform convergence theory establishes a provably slower convergence rate for the distiller's generalization gap relative to i.i.d. sampling in both unconditional and conditional settings. Experiments across image generation, mathematical reasoning, and code generation are reported to confirm substantial degradation in distilled student performance.

Significance. If the dependence-adjusted uniform convergence argument holds, the work supplies a practical, zero-cost defense against multi-account distillation that preserves exact statistical fidelity for legitimate users. The three-domain experiments supply direct empirical support for the degradation effect. A notable strength is the explicit separation between user experience (lossless) and attacker utility (degraded), together with the attempt to ground the defense in a classical learning-theoretic tool.

major comments (2)

[§4] §4 (Uniform Convergence Analysis): the derivation applies standard i.i.d. uniform convergence bounds directly to the LADS-induced samples without deriving a quantitative dependence measure (e.g., β-mixing coefficients, covariance decay rate, or effective sample size n_eff < n). Consequently it is not shown that the generalization-gap rate is strictly slower than the classical O(1/√n) rather than merely carrying a worse constant.
[§4.2] §4.2 (Conditional Generation Setting): the extension claims degradation for conditional sampling but provides no explicit bound that accounts for the interaction between the conditioning variable and the semantic-bucket dependence; the argument therefore does not yet establish the claimed rate separation relative to i.i.d. conditional sampling.

minor comments (2)

[§3] The operational definition of semantic buckets (including how queries are mapped and how bucket collisions are detected across accounts) is described at a high level; a short pseudocode block or concrete example would improve clarity and reproducibility.
[§5] Table 2 and Figure 3 report performance gaps but omit standard errors or statistical significance tests; adding these would strengthen the empirical claim that degradation is robust rather than an artifact of particular seeds or splits.
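As an illustration of the kind of pseudocode the first minor comment asks for, the sketch below maps query embeddings to buckets with random-hyperplane locality-sensitive hashing. This technique is swapped in here as an assumption; the report does not say how the paper defines its buckets. Embeddings are passed in as plain lists of floats, so no particular embedding model or API is implied.

```python
import random
from typing import List, Sequence


def make_hyperplanes(dim: int, num_bits: int, seed: int = 0) -> List[List[float]]:
    """Draw num_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def bucket_id(embedding: Sequence[float], hyperplanes: Sequence[Sequence[float]]) -> str:
    """Bucket = bit string recording which side of each hyperplane the embedding falls on."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, embedding))
        bits.append("1" if dot >= 0.0 else "0")
    return "".join(bits)


planes = make_hyperplanes(dim=4, num_bits=8)

query_a = [0.9, 0.1, -0.3, 0.5]         # toy embedding, not from a real model
query_a_scaled = [2 * x for x in query_a]  # same direction, different norm
query_b = [-x for x in query_a]            # opposite direction

bucket_a = bucket_id(query_a, planes)
bucket_a_scaled = bucket_id(query_a_scaled, planes)
bucket_b = bucket_id(query_b, planes)
```

Embeddings pointing in the same direction always share a bucket in this scheme, and nearby directions share one with high probability but not with certainty. The number of bits trades bucket granularity against collision rate, which is the kind of parameter the comment asks the authors to document.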

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments. We agree that the uniform convergence sections require strengthening with explicit quantitative dependence measures, and we will revise the manuscript to address both points.


Referee: [§4] §4 (Uniform Convergence Analysis): the derivation applies standard i.i.d. uniform convergence bounds directly to the LADS-induced samples without deriving a quantitative dependence measure (e.g., β-mixing coefficients, covariance decay rate, or effective sample size n_eff < n). Consequently it is not shown that the generalization-gap rate is strictly slower than the classical O(1/√n) rather than merely carrying a worse constant.

Authors: We thank the referee for this observation. While the positive intra-bucket correlations induced by shared seeds intuitively reduce sample diversity and thereby enlarge the generalization gap, the current write-up does not quantify the dependence. In the revised manuscript we will add an explicit β-mixing analysis: we model the query sequence as a Markov chain over semantic buckets and derive a mixing coefficient β(k) that decays with the number of distinct buckets visited. From this we obtain an effective sample size n_eff = n / (1 + ρ) with ρ > 0 determined by the bucket collision probability, yielding a uniform convergence bound whose leading term is strictly larger than the classical i.i.d. O(1/√n) bound for any finite n. This establishes the claimed rate separation.

Revision: yes

Referee: [§4.2] §4.2 (Conditional Generation Setting): the extension claims degradation for conditional sampling but provides no explicit bound that accounts for the interaction between the conditioning variable and the semantic-bucket dependence; the argument therefore does not yet establish the claimed rate separation relative to i.i.d. conditional sampling.

Authors: We agree that the conditional case needs a more careful treatment. In the revision we will derive an explicit uniform convergence bound for the conditional setting by considering the joint measure over (query, response) pairs. The semantic-bucket dependence is now conditioned on the query embedding; we will show that the resulting mixing coefficient remains strictly positive and obtain a generalization-gap bound that is larger than the corresponding i.i.d. conditional bound by a factor that depends on the bucket collision probability under the conditional distribution. This establishes the rate separation for the conditional case as well.

Revision: yes
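The effective-sample-size substitution in the first response can be worked through explicitly. The derivation below is a sketch under one stated assumption, namely that the i.i.d. bound has the form C/√n with C independent of n. The symbol C and the growth exponent α are introduced here for illustration and do not come from the paper.

```latex
% Assumption: the i.i.d. bound is C / sqrt(n), with C depending on the
% hypothesis class but not on n. Substituting n_eff = n / (1 + rho):
\[
\text{gap}_{\text{LADS}}
\;\lesssim\; \frac{C}{\sqrt{n_{\text{eff}}}}
\;=\; C\sqrt{\frac{1+\rho}{n}}
\;=\; \sqrt{1+\rho}\,\cdot\frac{C}{\sqrt{n}} .
\]
% Fixed rho: the order is still O(1/sqrt(n)); only the constant grows,
% by a factor sqrt(1 + rho).
% Growing rho, e.g. rho_n proportional to n^alpha with 0 < alpha < 1:
\[
\sqrt{\frac{1+\rho_n}{n}} \;=\; \Theta\!\left(n^{-(1-\alpha)/2}\right),
\]
% which decays strictly more slowly than n^{-1/2}.
```

Read this way, a constant ρ gives the "worse constant" outcome the referee describes, while a change in the exponent would require the bucket collision probability, and hence ρ, to grow with n.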

Circularity Check

0 steps flagged

No significant circularity: derivation applies external uniform convergence theory


The paper's load-bearing theoretical step invokes standard uniform convergence theory to conclude that LADS-induced positive dependence (via semantic bucketing and shared seeds) degrades the distiller's generalization-gap convergence rate relative to i.i.d. sampling. No equations or text in the provided abstract or description reduce this claim to a quantity defined by the authors' own fits, self-citations, or ansatzes; the result is framed as a direct consequence of applying existing statistical learning bounds to the described sampling process. The derivation is therefore self-contained against external benchmarks and does not exhibit any of the enumerated circular patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the existence of a reliable semantic bucketing function and on the applicability of uniform convergence to the specific correlation structure induced by the sampling scheme.

axioms (1)

domain assumption: Uniform convergence theory can be applied to bound the generalization gap of a student model trained on correlated samples produced by LADS.
Invoked to prove the degraded convergence rate relative to i.i.d. sampling.


Reference graph

Works this paper leans on

121 extracted references · 121 canonical work pages · 13 internal anchors

[5]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Defending against model stealing attacks with adaptive misinformation , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

work page
[6]

Advances in Neural Information Processing Systems , volume=

Watermarking makes language models radioactive , author=. Advances in Neural Information Processing Systems , volume=

work page
[7]

International Conference on Machine Learning , pages=

Protecting language generation models via invisible watermarking , author=. International Conference on Machine Learning , pages=. 2023 , organization=

work page 2023
[8]

2019 IEEE European Symposium on Security and Privacy (EuroS&P) , pages=

PRADA: protecting against DNN model stealing attacks , author=. 2019 IEEE European Symposium on Security and Privacy (EuroS&P) , pages=. 2019 , organization=

work page 2019
[9]

Proceedings of the AAAI Conference on Artificial Intelligence , volume=

Defense against model stealing based on account-aware distribution discrepancy , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

work page
[10]

IEEE Transactions on Information Forensics and Security , year=

Queen: Query unlearning against model extraction , author=. IEEE Transactions on Information Forensics and Security , year=

work page
[11]

The Annals of Probability , pages=

Rates of convergence for empirical processes of stationary mixing sequences , author=. The Annals of Probability , pages=. 1994 , publisher=

work page 1994
[12]

Advances in neural information processing systems , volume=

Rademacher complexity bounds for non-iid processes , author=. Advances in neural information processing systems , volume=

work page
[13]

2013 , publisher=

The nature of statistical learning theory , author=. 2013 , publisher=

work page 2013
[14]

Advances in neural information processing systems , volume=

Denoising diffusion probabilistic models , author=. Advances in neural information processing systems , volume=

work page
[15]

2021 , eprint=

Score-Based Generative Modeling through Stochastic Differential Equations , author=. 2021 , eprint=

work page 2021
[16]

2022 , eprint=

Elucidating the Design Space of Diffusion-Based Generative Models , author=. 2022 , eprint=

work page 2022
[17]

1954 , publisher=

Statistical theory of extreme values and some practical applications: a series of lectures , author=. 1954 , publisher=

work page 1954
[18]

Advances in neural information processing systems , volume=

A* sampling , author=. Advances in neural information processing systems , volume=

work page
[19]

Advances in neural information processing systems , volume=

Generative modeling by estimating gradients of the data distribution , author=. Advances in neural information processing systems , volume=

work page
[20]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Analyzing and improving the training dynamics of diffusion models , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

work page
[21]

1959 , publisher=

Individual choice behavior , author=. 1959 , publisher=

work page 1959
[22]

Journal of Mathematical Psychology , volume=

The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution , author=. Journal of Mathematical Psychology , volume=. 1977 , publisher=

work page 1977
[23]

2011 international conference on computer vision , pages=

Perturb-and-map random fields: Using discrete optimization to learn and sample from energy models , author=. 2011 international conference on computer vision , pages=. 2011 , organization=

work page 2011
[25]

A vector-contraction inequality for

Maurer, Andreas , booktitle=. A vector-contraction inequality for. 2016 , organization=

work page 2016
[26]

Advances in Neural Information Processing Systems (NeurIPS) , year=

On the complexity of linear prediction: Risk bounds, margin bounds, and regularization , author=. Advances in Neural Information Processing Systems (NeurIPS) , year=

work page
[27]

Foundations of Machine Learning , author=

work page
[28]

and Mendelson, Shahar , journal=

Bartlett, Peter L. and Mendelson, Shahar , journal=. Rademacher and

work page
[29]

Koltchinskii, Vladimir , journal=. Local. 2006 , publisher=

work page 2006
[30]

High-Dimensional Statistics: A Non-Asymptotic Viewpoint , author=

work page
[31]

Vldb , volume=

Similarity search in high dimensions via hashing , author=. Vldb , volume=

work page
[32]

Advances in neural information processing systems , volume=

Gans trained by a two time-scale update rule converge to a local nash equilibrium , author=. Advances in neural information processing systems , volume=

work page
[33]

Advances in Neural Information Processing Systems , volume=

Dart-math: Difficulty-aware rejection tuning for mathematical problem-solving , author=. Advances in Neural Information Processing Systems , volume=

work page
[34]

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

Mathfusion: Enhancing mathematical problem-solving of llm through instruction fusion , author=. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

work page
[37]

Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, Y. K. and Wu, Y. and Guo, Daya , title =. CoRR , volume =. 2024 , url =

work page 2024
[39]

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models , author=. arXiv preprint arXiv:2506.05176 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[40]

Advances in neural information processing systems , volume=

Generative adversarial nets , author=. Advances in neural information processing systems , volume=

work page
[41]

Auto-Encoding Variational Bayes

Auto-encoding variational bayes , author=. arXiv preprint arXiv:1312.6114 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[42]

2018 , publisher=

Improving language understanding by generative pre-training , author=. 2018 , publisher=

work page 2018
[43]

NeurIPS Deep Learning and Representation Learning Workshop , year=

Distilling the Knowledge in a Neural Network , author=. NeurIPS Deep Learning and Representation Learning Workshop , year=

work page
[44]

DistilBERT, a distilled version of

Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas , booktitle=. DistilBERT, a distilled version of

work page
[45]

Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP) , pages=

Sequence-Level Knowledge Distillation , author=. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP) , pages=

work page 2016
[46]

Stealing Machine Learning Models via Prediction

Tram. Stealing Machine Learning Models via Prediction. 25th USENIX Security Symposium (USENIX Security 16) , pages=

work page
[47]

and Papernot, Nicolas and Iyyer, Mohit , booktitle=

Krishna, Kalpesh and Tomar, Gaurav Singh and Parikh, Ankur P. and Papernot, Nicolas and Iyyer, Mohit , booktitle=. Thieves on Sesame Street!

work page
[48]

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) , pages=

Imitation Attacks and Defenses for Black-box Machine Translation Systems , author=. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) , pages=

work page 2020
[49]

Romero, Adriana and Ballas, Nicolas and Kahou, Samira Ebrahimi and Chassang, Antoine and Gatta, Carlo and Bengio, Yoshua , booktitle=

work page
[50]

Proceedings of the 35th International Conference on Machine Learning (ICML) , year=

Born-Again Neural Networks , author=. Proceedings of the 35th International Conference on Machine Learning (ICML) , year=

work page
[51]

International Conference on Learning Representations (ICLR) , year=

Contrastive Representation Distillation , author=. International Conference on Learning Representations (ICLR) , year=

work page
[52]

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL) , year=

Teaching Small Language Models to Reason , author=. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL) , year=

work page
[53]

Distilling Step-by-Step!

Hsieh, Cheng-Yu and Li, Chun-Liang and Yeh, Chih-Kuan and Nakhost, Hootan and Fujii, Yasuhisa and Ratner, Alexander and Krishna, Ranjay and Lee, Chen-Yu and Pfister, Tomas , booktitle=. Distilling Step-by-Step!

work page
[54]

Gu, Yuxian and Dong, Li and Wei, Furu and Huang, Minlie , booktitle=

work page
[55]

Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security (ASIACCS) , pages=

Practical Black-Box Attacks Against Machine Learning , author=. Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security (ASIACCS) , pages=

work page 2017
[56]

29th USENIX Security Symposium (USENIX Security 20) , pages=

High Accuracy and High Fidelity Extraction of Neural Networks , author=. 29th USENIX Security Symposium (USENIX Security 20) , pages=

work page
[57]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , pages=

Data-Free Model Extraction , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , pages=

work page
[58]

Pal, Soham and Gupta, Yash and Shukla, Aditya and Kanade, Aditya and Shevade, Shirish and Ganapathy, Vinod , booktitle=

work page
[59]

Proceedings of the 41st International Conference on Machine Learning (ICML) , year=

Stealing Part of a Production Language Model , author=. Proceedings of the 41st International Conference on Machine Learning (ICML) , year=

work page
[60]

27th USENIX Security Symposium (USENIX Security 18) , pages=

Turning Your Weakness into a Strength: Watermarking Deep Neural Networks by Backdooring , author=. 27th USENIX Security Symposium (USENIX Security 18) , pages=

work page
[61]

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval (ICMR) , pages=

Embedding Watermarks into Deep Neural Networks , author=. Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval (ICMR) , pages=

work page 2017
[62]

30th USENIX Security Symposium (USENIX Security 21) , pages=

Entangled Watermarks as a Defense against Model Extraction , author=. 30th USENIX Security Symposium (USENIX Security 21) , pages=

work page
[63]

2009 IEEE conference on computer vision and pattern recognition , pages=

Imagenet: A large-scale hierarchical image database , author=. 2009 IEEE conference on computer vision and pattern recognition , pages=. 2009 , organization=

work page 2009
[64]

International conference on machine learning , pages=

Deep unsupervised learning using nonequilibrium thermodynamics , author=. International conference on machine learning , pages=. 2015 , organization=

work page 2015
[67]

2025 , eprint=

Qwen2.5 Technical Report , author=. 2025 , eprint=

work page 2025
[68]

Qwen2.5-Coder Technical Report

Qwen2. 5-coder technical report , author=. arXiv preprint arXiv:2409.12186 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[71]

Turning your weakness into a strength: Watermarking deep neural networks by backdooring

Yossi Adi, Carsten Baum, Moustapha Ciss \'e , Benny Pinkas, and Joseph Keshet. Turning your weakness into a strength: Watermarking deep neural networks by backdooring. In 27th USENIX Security Symposium (USENIX Security 18), pages 1615--1631, 2018

work page 2018
[72]

Bartlett and Shahar Mendelson

Peter L. Bartlett and Shahar Mendelson. Rademacher and G aussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3: 0 463--482, 2002

work page 2002
[73]

Stealing part of a production language model

Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, et al. Stealing part of a production language model. Proceedings of the 41st International Conference on Machine Learning (ICML), 2024

work page 2024
[74]

Queen: Query unlearning against model extraction

Huajie Chen, Tianqing Zhu, Lefeng Zhang, Bo Liu, Derui Wang, Wanlei Zhou, and Minhui Xue. Queen: Query unlearning against model extraction. IEEE Transactions on Information Forensics and Security, 2025

work page 2025
[75]

Evaluating Large Language Models Trained on Code

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde De Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, et al. Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021
[76]

Training Verifiers to Solve Math Word Problems

Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, et al. Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021
[77]

Imagenet: A large-scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pages 248--255. Ieee, 2009

work page 2009
[78]

Born-again neural networks

Tommaso Furlanello, Zachary Lipton, Michael Tschannen, Laurent Itti, and Anima Anandkumar. Born-again neural networks. In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018

work page 2018
[79]

Generative adversarial nets

Ian J Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. Advances in neural information processing systems, 27, 2014

work page 2014
[80]

The Llama 3 Herd of Models

Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al. The llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[81]

M ini LLM : Knowledge distillation of large language models

Yuxian Gu, Li Dong, Furu Wei, and Minlie Huang. M ini LLM : Knowledge distillation of large language models. In International Conference on Learning Representations (ICLR), 2024

work page 2024
[82]

Statistical theory of extreme values and some practical applications: a series of lectures, volume 33

Emil Julius Gumbel. Statistical theory of extreme values and some practical applications: a series of lectures, volume 33. US Government Printing Office, 1954

work page 1954
[83]

On the Partition Function and Random Maximum A-Posteriori Perturbations

Tamir Hazan and Tommi Jaakkola. On the partition function and random maximum a-posteriori perturbations. arXiv preprint arXiv:1206.6410, 2012

work page internal anchor Pith review Pith/arXiv arXiv 2012
[84]

Measuring Mathematical Problem Solving With the MATH Dataset

Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, and Jacob Steinhardt. Measuring mathematical problem solving with the math dataset. arXiv preprint arXiv:2103.03874, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021
[85]

Gans trained by a two time-scale update rule converge to a local nash equilibrium

Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in neural information processing systems, 30, 2017

work page 2017
[86]

Distilling the knowledge in a neural network

Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. In NeurIPS Deep Learning and Representation Learning Workshop, 2015

[87]

Denoising diffusion probabilistic models

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in neural information processing systems, 33:6840--6851, 2020

[88]

Distilling step-by-step! Outperforming larger language models with less training data and smaller model sizes

Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, and Tomas Pfister. Distilling step-by-step! Outperforming larger language models with less training data and smaller model sizes. In Findings of the Association for Computational Linguistics (ACL Findings), 2023

[89]

High accuracy and high fidelity extraction of neural networks

Matthew Jagielski, Nicholas Carlini, David Berthelot, Alex Kurakin, and Nicolas Papernot. High accuracy and high fidelity extraction of neural networks. In 29th USENIX Security Symposium (USENIX Security 20), pages 1345--1362, 2020

[90]

Entangled watermarks as a defense against model extraction

Hengrui Jia, Christopher A. Choquette-Choo, Varun Chandrasekaran, and Nicolas Papernot. Entangled watermarks as a defense against model extraction. In 30th USENIX Security Symposium (USENIX Security 21), pages 1937--1954, 2021

[91]

PRADA: protecting against DNN model stealing attacks

Mika Juuti, Sebastian Szyller, Samuel Marchal, and N Asokan. PRADA: protecting against DNN model stealing attacks. In 2019 IEEE European Symposium on Security and Privacy (EuroS&P), pages 512--527. IEEE, 2019

[92]

Defending against model stealing attacks with adaptive misinformation

Sanjay Kariyappa and Moinuddin K Qureshi. Defending against model stealing attacks with adaptive misinformation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 770--778, 2020


[1] [5]

Defending against model stealing attacks with adaptive misinformation

Defending against model stealing attacks with adaptive misinformation. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

[2] [6]

Watermarking makes language models radioactive

Watermarking makes language models radioactive. Advances in Neural Information Processing Systems

[3] [7]

Protecting language generation models via invisible watermarking

Protecting language generation models via invisible watermarking. International Conference on Machine Learning, 2023

[4] [8]

PRADA: protecting against DNN model stealing attacks

PRADA: protecting against DNN model stealing attacks. 2019 IEEE European Symposium on Security and Privacy (EuroS&P), 2019

[5] [9]

Defense against model stealing based on account-aware distribution discrepancy

Defense against model stealing based on account-aware distribution discrepancy. Proceedings of the AAAI Conference on Artificial Intelligence

[6] [10]

Queen: Query unlearning against model extraction

Queen: Query unlearning against model extraction. IEEE Transactions on Information Forensics and Security

[7] [11]

Rates of convergence for empirical processes of stationary mixing sequences

Rates of convergence for empirical processes of stationary mixing sequences. The Annals of Probability, 1994

[8] [12]

Rademacher complexity bounds for non-iid processes

Rademacher complexity bounds for non-iid processes. Advances in neural information processing systems

[9] [13]

The nature of statistical learning theory

The nature of statistical learning theory. 2013

[10] [14]

Denoising diffusion probabilistic models

Denoising diffusion probabilistic models. Advances in neural information processing systems

[11] [15]

Score-Based Generative Modeling through Stochastic Differential Equations

Score-Based Generative Modeling through Stochastic Differential Equations. 2021

[12] [16]

Elucidating the Design Space of Diffusion-Based Generative Models

Elucidating the Design Space of Diffusion-Based Generative Models. 2022

[13] [17]

Statistical theory of extreme values and some practical applications: a series of lectures

Statistical theory of extreme values and some practical applications: a series of lectures. 1954

[14] [18]

A* sampling

A* sampling. Advances in neural information processing systems

[15] [19]

Generative modeling by estimating gradients of the data distribution

Generative modeling by estimating gradients of the data distribution. Advances in neural information processing systems

[16] [20]

Analyzing and improving the training dynamics of diffusion models

Analyzing and improving the training dynamics of diffusion models. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

[17] [21]

Individual choice behavior

Individual choice behavior. 1959

[18] [22]

The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution

The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution. Journal of Mathematical Psychology, 1977

[19] [23]

Perturb-and-map random fields: Using discrete optimization to learn and sample from energy models

Perturb-and-map random fields: Using discrete optimization to learn and sample from energy models. 2011 international conference on computer vision, 2011

[20] [25]

A vector-contraction inequality for

Andreas Maurer. A vector-contraction inequality for. 2016

[21] [26]

On the complexity of linear prediction: Risk bounds, margin bounds, and regularization

On the complexity of linear prediction: Risk bounds, margin bounds, and regularization. Advances in Neural Information Processing Systems (NeurIPS)

[22] [27]

Foundations of Machine Learning

[23] [28]

Rademacher and Gaussian complexities: Risk bounds and structural results

Peter L. Bartlett and Shahar Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results

[24] [29]

Vladimir Koltchinskii. Local. 2006

[25] [30]

High-Dimensional Statistics: A Non-Asymptotic Viewpoint

[26] [31]

Similarity search in high dimensions via hashing

Similarity search in high dimensions via hashing. VLDB

[27] [32]

GANs trained by a two time-scale update rule converge to a local Nash equilibrium

GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in neural information processing systems

[28] [33]

DART-Math: Difficulty-aware rejection tuning for mathematical problem-solving

DART-Math: Difficulty-aware rejection tuning for mathematical problem-solving. Advances in Neural Information Processing Systems

[29] [34]

MathFusion: Enhancing mathematical problem-solving of LLM through instruction fusion

MathFusion: Enhancing mathematical problem-solving of LLM through instruction fusion. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

[30] [37]

Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. CoRR, 2024

[31] [39]

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models. arXiv preprint arXiv:2506.05176

[32] [40]

Generative adversarial nets

Generative adversarial nets. Advances in neural information processing systems

[33] [41]

Auto-Encoding Variational Bayes

Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114

[34] [42]

Improving language understanding by generative pre-training

Improving language understanding by generative pre-training. 2018

[35] [43]

Distilling the Knowledge in a Neural Network

Distilling the Knowledge in a Neural Network. NeurIPS Deep Learning and Representation Learning Workshop

[36] [44]

DistilBERT, a distilled version of

Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. DistilBERT, a distilled version of

[37] [45]

Sequence-Level Knowledge Distillation

Sequence-Level Knowledge Distillation. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP)

[38] [46]

Stealing Machine Learning Models via Prediction

Tram. Stealing Machine Learning Models via Prediction. 25th USENIX Security Symposium (USENIX Security 16)

[39] [47]

Thieves on Sesame Street!

Kalpesh Krishna, Gaurav Singh Tomar, Ankur P. Parikh, Nicolas Papernot, and Mohit Iyyer. Thieves on Sesame Street!

[40] [48]

Imitation Attacks and Defenses for Black-box Machine Translation Systems

Imitation Attacks and Defenses for Black-box Machine Translation Systems. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

[41] [49]

Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, and Yoshua Bengio

[42] [50]

Born-Again Neural Networks

Born-Again Neural Networks. Proceedings of the 35th International Conference on Machine Learning (ICML)

[43] [51]

Contrastive Representation Distillation

Contrastive Representation Distillation. International Conference on Learning Representations (ICLR)

[44] [52]

Teaching Small Language Models to Reason

Teaching Small Language Models to Reason. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)

[45] [53]

Distilling Step-by-Step! Outperforming larger language models with less training data and smaller model sizes

Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, and Tomas Pfister. Distilling Step-by-Step! Outperforming larger language models with less training data and smaller model sizes

[46] [54]

Yuxian Gu, Li Dong, Furu Wei, and Minlie Huang

[47] [55]

Practical Black-Box Attacks Against Machine Learning

Practical Black-Box Attacks Against Machine Learning. Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security (ASIACCS)

[48] [56]

High Accuracy and High Fidelity Extraction of Neural Networks

High Accuracy and High Fidelity Extraction of Neural Networks. 29th USENIX Security Symposium (USENIX Security 20)

[49] [57]

Data-Free Model Extraction

Data-Free Model Extraction. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

[50] [58]

Soham Pal, Yash Gupta, Aditya Shukla, Aditya Kanade, Shirish Shevade, and Vinod Ganapathy

[51] [59]

Stealing Part of a Production Language Model

Stealing Part of a Production Language Model. Proceedings of the 41st International Conference on Machine Learning (ICML)

[52] [60]

Turning Your Weakness into a Strength: Watermarking Deep Neural Networks by Backdooring

Turning Your Weakness into a Strength: Watermarking Deep Neural Networks by Backdooring. 27th USENIX Security Symposium (USENIX Security 18)

[53] [61]

Embedding Watermarks into Deep Neural Networks

Embedding Watermarks into Deep Neural Networks. Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval (ICMR)

[54] [62]

Entangled Watermarks as a Defense against Model Extraction

Entangled Watermarks as a Defense against Model Extraction. 30th USENIX Security Symposium (USENIX Security 21)

[55] [63]

ImageNet: A large-scale hierarchical image database

ImageNet: A large-scale hierarchical image database. 2009 IEEE conference on computer vision and pattern recognition, 2009

[56] [64]

Deep unsupervised learning using nonequilibrium thermodynamics

Deep unsupervised learning using nonequilibrium thermodynamics. International conference on machine learning, 2015

[57] [67]

Qwen2.5 Technical Report

Qwen2.5 Technical Report. 2025

[58] [68]

Qwen2.5-Coder Technical Report

Qwen2.5-Coder technical report. arXiv preprint arXiv:2409.12186

[59] [71]

Turning your weakness into a strength: Watermarking deep neural networks by backdooring

Yossi Adi, Carsten Baum, Moustapha Cissé, Benny Pinkas, and Joseph Keshet. Turning your weakness into a strength: Watermarking deep neural networks by backdooring. In 27th USENIX Security Symposium (USENIX Security 18), pages 1615--1631, 2018

[60] [72]

Rademacher and Gaussian complexities: Risk bounds and structural results

Peter L. Bartlett and Shahar Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463--482, 2002

[61] [73]

Stealing part of a production language model

Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, et al. Stealing part of a production language model. Proceedings of the 41st International Conference on Machine Learning (ICML), 2024

[62] [74]

Queen: Query unlearning against model extraction

Huajie Chen, Tianqing Zhu, Lefeng Zhang, Bo Liu, Derui Wang, Wanlei Zhou, and Minhui Xue. Queen: Query unlearning against model extraction. IEEE Transactions on Information Forensics and Security, 2025

[63] [75]

Evaluating Large Language Models Trained on Code

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde De Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, et al. Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374, 2021

[64] [76]

Training Verifiers to Solve Math Word Problems

Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, et al. Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168, 2021

[65] [77]

ImageNet: A large-scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pages 248--255. IEEE, 2009

[66] [78]

Born-again neural networks

Tommaso Furlanello, Zachary Lipton, Michael Tschannen, Laurent Itti, and Anima Anandkumar. Born-again neural networks. In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018

[67] [79]

Generative adversarial nets

Ian J Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. Advances in neural information processing systems, 27, 2014

[68] [80]

The Llama 3 Herd of Models

Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024

[69] [81]

MiniLLM: Knowledge distillation of large language models

Yuxian Gu, Li Dong, Furu Wei, and Minlie Huang. MiniLLM: Knowledge distillation of large language models. In International Conference on Learning Representations (ICLR), 2024
