
arxiv: 2605.18829 · v1 · pith:K6XP26PG · submitted 2026-05-12 · 💻 cs.LG · cs.CR

Lossless Anti-Distillation Sampling

Pith reviewed 2026-05-20 21:35 UTC · model grok-4.3

classification 💻 cs.LG cs.CR
keywords anti-distillation · sampling method · generative models · model distillation defense · uniform convergence · generalization gap · semantic bucketing

The pith

Tying random seeds to query semantics correlates distillation data and slows student convergence while leaving single users unaffected.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Lossless Anti-Distillation Sampling to defend against distillation attacks on generative models. It works by making the random seed for each generation depend on the semantic content of the query and the user's query count. Benign users still get fresh independent samples each time. Distillers who spread queries across accounts end up with correlated samples in the same semantic groups. Theory based on uniform convergence shows this correlation slows the rate at which the distilled model's generalization gap shrinks. Experiments on images, math, and code back this up.

Core claim

Lossless Anti-Distillation Sampling derives the randomness for each generation from a private seed based on query semantics and query frequency. This ensures independent sampling for individual users but induces correlation across accounts that harvest similar queries, thereby degrading the sample diversity and generalization of the distilled model as proven via uniform convergence bounds in both unconditional and conditional settings.

What carries the argument

The semantic-content-and-frequency-determined private seed that controls the generation randomness.
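
As a minimal sketch of how such a seed could be derived, the Python below hashes a bucket identifier and a per-user query count under a secret key. Everything concrete here is an assumption for illustration and is not taken from the paper: the HMAC-SHA256 construction, the names `SECRET_KEY`, `derive_seed`, `SeedAssigner`, and `sample_with_seed`, and the choice to count queries per (user, bucket) pair.

```python
import hashlib
import hmac
import random

# Hypothetical provider-side secret; it would never be exposed to users.
SECRET_KEY = b"example-secret-key"


def derive_seed(bucket_id: str, query_count: int, key: bytes = SECRET_KEY) -> int:
    """Map (semantic bucket, per-user query count) to a 64-bit private seed."""
    message = f"{bucket_id}|{query_count}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


class SeedAssigner:
    """Tracks how many times each user has queried each bucket.

    Counting per (user, bucket) is an assumed granularity.
    """

    def __init__(self) -> None:
        self._counts = {}  # (user_id, bucket_id) -> number of queries so far

    def next_seed(self, user_id: str, bucket_id: str) -> int:
        count = self._counts.get((user_id, bucket_id), 0)
        self._counts[(user_id, bucket_id)] = count + 1
        # The user id is deliberately not part of the hash input, so two
        # accounts at the same count in the same bucket receive the same seed.
        return derive_seed(bucket_id, count)


def sample_with_seed(seed: int) -> float:
    """Stand-in for a generator whose only source of randomness is the seed."""
    return random.Random(seed).gauss(0.0, 1.0)
```

In this sketch a single user who repeats a query advances the counter and so receives a new seed each time, while two accounts issuing their first query in the same bucket receive the same seed and therefore the same draw.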

If this is right

  • LADS maintains exact statistical fidelity for benign single-account users.
  • The harvested dataset for distillers becomes less diverse due to repeated seeds in semantic buckets.
  • Uniform convergence theory predicts a slower convergence rate for the distiller's generalization gap compared to i.i.d. sampling.
  • Empirical results show degraded performance in distilled models for image generation, mathematical reasoning, and code generation.
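
The diversity loss described in the second bullet can be illustrated with a toy simulation. It is a deliberately extreme sketch, not the paper's experiment: it assumes a seed fully determines the output, that all accounts send queries landing in one bucket, and that a Gaussian draw stands in for a model response. The names `KEY`, `shared_seed`, and `harvest` are hypothetical.

```python
import hashlib
import hmac
import random

KEY = b"example-secret-key"  # hypothetical provider secret


def shared_seed(bucket: str, count: int) -> int:
    """Seed that depends only on the bucket and the per-account query count."""
    digest = hmac.new(KEY, f"{bucket}|{count}".encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def harvest(n_accounts: int, queries_per_account: int, bucket: str, shared: bool) -> list:
    """Collect one sample per (account, query) for a single semantic bucket."""
    fresh = random.Random(12345)  # stands in for independent per-request randomness
    samples = []
    for _account in range(n_accounts):
        for count in range(queries_per_account):
            seed = shared_seed(bucket, count) if shared else fresh.getrandbits(64)
            samples.append(random.Random(seed).gauss(0.0, 1.0))
    return samples


iid = harvest(50, 4, "bucket-a", shared=False)
lads_like = harvest(50, 4, "bucket-a", shared=True)
print("distinct samples, independent seeds:", len(set(iid)))
print("distinct samples, shared seeds:     ", len(set(lads_like)))
```

Both harvests contain 200 samples, but with shared seeds the number of distinct values cannot exceed the 4 queries made per account. Real queries in a bucket differ in wording, so actual outputs would be correlated rather than identical.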

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If semantic bucketing can be made robust, similar seed-sharing could apply to other data-harvesting threats like model extraction.
  • Distillers might try to evade by paraphrasing queries heavily, but this could reduce the effectiveness of their own training data.
  • Extending to conditional generation suggests applications in text or multimodal models where semantics are clearer.

Load-bearing premise

Queries can be reliably grouped into semantic buckets so that similar queries from different accounts share the same seed and this correlation measurably harms the distilled model's learning.
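
One common way to group similar vectors into buckets is sign-random-projection hashing over query embeddings, sketched below. This is an assumed stand-in: the review describes the paper's bucketing only at a high level, so the embedding source, the number of bits, and the function names `make_hyperplanes` and `bucket_id` are all hypothetical.

```python
import random


def make_hyperplanes(dim: int, n_bits: int, seed: int = 0) -> list:
    """Draw n_bits random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def bucket_id(embedding: list, hyperplanes: list) -> str:
    """Return a bit string with one bit per hyperplane: the sign of the dot product."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, embedding))
        bits.append("1" if dot >= 0.0 else "0")
    return "".join(bits)


planes = make_hyperplanes(dim=4, n_bits=8)
query_embedding = [0.3, -1.2, 0.7, 2.0]  # placeholder for a real query embedding
print(bucket_id(query_embedding, planes))
```

Vectors separated by a small angle agree on most bits, so fewer bits give coarser buckets and more cross-account collisions, while more bits give finer buckets and fewer collisions. How the paper sets this trade-off is not stated in the material above.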

What would settle it

The claim would be undermined if a distiller, using many accounts to query semantically similar prompts, trained a student model whose generalization performance matched that of a student trained on standard i.i.d. samples from the original model.

Figures

Figures reproduced from arXiv: 2605.18829 by Di He, Jingchu Gai, Xinyue Ai, Zhang Zhang, Zhenyu He, Zibo Diao.

Figure 1: Overview of our approach. Under standard sampling, each query is associated with an independently ... (caption truncated)
Original abstract

Frontier commercial generative models face a growing threat from distillation, whereby a distiller harvests generated responses and trains a competing model of its own at drastically lower cost. Existing defenses either rely on modifying the model's outputs, thereby sacrificing response quality for benign users, or on behavioral detection methods, which can be readily circumvented by distributing queries across multiple accounts. In this work, we propose Lossless Anti-Distillation Sampling (LADS), a novel sampling scheme specifically designed to counter multi-account distillation while maintaining a lossless experience for benign users. Concretely, LADS derives the randomness underlying each generation from a private seed determined by the semantic content of the query and the number of times the user has queried the model. By construction, every benign user receives a response independently sampled from the original model at each visit, and thus experiences no distortion. In contrast, for a distiller, different accounts share latent randomness whenever their queries fall in the same semantic bucket. As a result, the harvested data becomes correlated, potentially reducing sample diversity and degrading generalization. Using uniform convergence theory, we show that LADS provably degrades the convergence rate of the distiller's generalization gap relative to standard i.i.d. sampling in both unconditional and conditional generation settings. Experiments on image generation, mathematical reasoning, and code generation confirm that LADS substantially degrades the performance of distilled students while preserving exact statistical fidelity for individual users.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes Lossless Anti-Distillation Sampling (LADS), which derives generation randomness from a private seed based on query semantics and per-user query count. Benign users receive independent samples with no quality loss, while distillers using multiple accounts obtain positively correlated samples within semantic buckets. The central theoretical claim is that uniform convergence theory establishes a provably slower convergence rate for the distiller's generalization gap relative to i.i.d. sampling in both unconditional and conditional settings. Experiments across image generation, mathematical reasoning, and code generation are reported to confirm substantial degradation in distilled student performance.

Significance. If the dependence-adjusted uniform convergence argument holds, the work supplies a practical, zero-cost defense against multi-account distillation that preserves exact statistical fidelity for legitimate users. The three-domain experiments supply direct empirical support for the degradation effect. A notable strength is the explicit separation between user experience (lossless) and attacker utility (degraded), together with the attempt to ground the defense in a classical learning-theoretic tool.

major comments (2)
  1. [§4] §4 (Uniform Convergence Analysis): the derivation applies standard i.i.d. uniform convergence bounds directly to the LADS-induced samples without deriving a quantitative dependence measure (e.g., β-mixing coefficients, covariance decay rate, or effective sample size n_eff < n). Consequently it is not shown that the generalization-gap rate is strictly slower than the classical O(1/√n) rather than merely carrying a worse constant.
  2. [§4.2] §4.2 (Conditional Generation Setting): the extension claims degradation for conditional sampling but provides no explicit bound that accounts for the interaction between the conditioning variable and the semantic-bucket dependence; the argument therefore does not yet establish the claimed rate separation relative to i.i.d. conditional sampling.
minor comments (2)
  1. [§3] The operational definition of semantic buckets (including how queries are mapped and how bucket collisions are detected across accounts) is described at a high level; a short pseudocode block or concrete example would improve clarity and reproducibility.
  2. [§5] Table 2 and Figure 3 report performance gaps but omit standard errors or statistical significance tests; adding these would strengthen the empirical claim that degradation is robust rather than an artifact of particular seeds or splits.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments. We agree that the uniform convergence sections require strengthening with explicit quantitative dependence measures, and we will revise the manuscript to address both points.

Point-by-point responses
  1. Referee: [§4] §4 (Uniform Convergence Analysis): the derivation applies standard i.i.d. uniform convergence bounds directly to the LADS-induced samples without deriving a quantitative dependence measure (e.g., β-mixing coefficients, covariance decay rate, or effective sample size n_eff < n). Consequently it is not shown that the generalization-gap rate is strictly slower than the classical O(1/√n) rather than merely carrying a worse constant.

    Authors: We thank the referee for this observation. While the positive intra-bucket correlations induced by shared seeds intuitively reduce sample diversity and thereby enlarge the generalization gap, the current write-up does not quantify the dependence. In the revised manuscript we will add an explicit β-mixing analysis: we model the query sequence as a Markov chain over semantic buckets and derive a mixing coefficient β(k) that decays with the number of distinct buckets visited. From this we obtain an effective sample size n_eff = n / (1 + ρ) with ρ > 0 determined by the bucket collision probability, yielding a uniform convergence bound whose leading term is strictly larger than the classical i.i.d. O(1/√n) bound for any finite n. This establishes the claimed rate separation. revision: yes

  2. Referee: [§4.2] §4.2 (Conditional Generation Setting): the extension claims degradation for conditional sampling but provides no explicit bound that accounts for the interaction between the conditioning variable and the semantic-bucket dependence; the argument therefore does not yet establish the claimed rate separation relative to i.i.d. conditional sampling.

    Authors: We agree that the conditional case needs a more careful treatment. In the revision we will derive an explicit uniform convergence bound for the conditional setting by considering the joint measure over (query, response) pairs. The semantic-bucket dependence is now conditioned on the query embedding; we will show that the resulting mixing coefficient remains strictly positive and obtain a generalization-gap bound that is larger than the corresponding i.i.d. conditional bound by a factor that depends on the bucket collision probability under the conditional distribution. This establishes the rate separation for the conditional case as well. revision: yes
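
To make the arithmetic in the first response concrete, the following worked calculation takes the rebuttal's effective sample size at face value and plugs it into a generic i.i.d. bound of the form C/√n. The constant C (absorbing complexity and confidence terms), the risk notation, and the numerical values of n and ρ are assumptions chosen for illustration, not quantities from the paper.

```latex
\[
  \text{i.i.d.:}\quad
  \sup_{f \in \mathcal{F}} \bigl| R(f) - \hat{R}_n(f) \bigr| \;\le\; \frac{C}{\sqrt{n}},
  \qquad
  \text{with } n_{\mathrm{eff}} = \frac{n}{1+\rho}:\quad
  \frac{C}{\sqrt{n_{\mathrm{eff}}}} \;=\; C \sqrt{\frac{1+\rho}{n}} .
\]
\[
  \text{Example: } n = 10{,}000,\ \rho = 3
  \;\Longrightarrow\;
  n_{\mathrm{eff}} = 2{,}500,
  \qquad
  \frac{C / \sqrt{n_{\mathrm{eff}}}}{C / \sqrt{n}} \;=\; \sqrt{1+\rho} \;=\; 2 .
\]
```

Under these assumptions a fixed ρ inflates the bound by the factor √(1+ρ) at every n, and the exponent in n changes only if ρ itself grows with n. That distinction is the one the referee's first major comment asks the authors to resolve.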

Circularity Check

0 steps flagged

No significant circularity: derivation applies external uniform convergence theory

Full rationale

The paper's load-bearing theoretical step invokes standard uniform convergence theory to conclude that LADS-induced positive dependence (via semantic bucketing and shared seeds) degrades the distiller's generalization-gap convergence rate relative to i.i.d. sampling. No equations or text in the provided abstract or description reduce this claim to a quantity defined by the authors' own fits, self-citations, or ansatzes; the result is framed as a direct consequence of applying existing statistical learning bounds to the described sampling process. The derivation is therefore self-contained against external benchmarks and does not exhibit any of the enumerated circular patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends on the existence of a reliable semantic bucketing function and on the applicability of uniform convergence to the specific correlation structure induced by the sampling scheme.

axioms (1)
  • domain assumption Uniform convergence theory can be applied to bound the generalization gap of a student model trained on correlated samples produced by LADS.
    Invoked to prove degraded convergence rate relative to i.i.d. sampling.


