Lossless Anti-Distillation Sampling
Pith reviewed 2026-05-20 21:35 UTC · model grok-4.3
The pith
Tying random seeds to query semantics correlates distillation data and slows student convergence while leaving single users unaffected.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Lossless Anti-Distillation Sampling derives the randomness for each generation from a private seed based on query semantics and query frequency. This ensures independent sampling for individual users but induces correlation across accounts that harvest similar queries, thereby degrading the sample diversity and generalization of the distilled model as proven via uniform convergence bounds in both unconditional and conditional settings.
What carries the argument
The semantic-content-and-frequency-determined private seed that controls the generation randomness.
If this is right
- LADS maintains exact statistical fidelity for benign single-account users.
- The harvested dataset for distillers becomes less diverse due to repeated seeds in semantic buckets.
- Uniform convergence theory predicts a slower convergence rate for the distiller's generalization gap compared to i.i.d. sampling.
- Empirical results show degraded performance in distilled models for image generation, mathematical reasoning, and code generation.
Where Pith is reading between the lines
- If semantic bucketing can be made robust, similar seed-sharing could apply to other data-harvesting threats like model extraction.
- Distillers might try to evade by paraphrasing queries heavily, but this could reduce the effectiveness of their own training data.
- Extending to conditional generation suggests applications in text or multimodal models where semantics are clearer.
Load-bearing premise
Queries can be reliably grouped into semantic buckets so that similar queries from different accounts share the same seed and this correlation measurably harms the distilled model's learning.
What would settle it
A distiller using many accounts to query semantically similar prompts and training a student model that achieves generalization performance comparable to one trained on standard i.i.d. samples from the original model.
Figures
read the original abstract
Frontier commercial generative models face a growing threat from distillation, whereby a distiller harvests generated responses and trains a competing model of its own at drastically lower cost. Existing defenses either rely on modifying the models outputs, thereby sacrificing response quality for benign users, or on behavioral detection methods, which can be readily circumvented by distributing queries across multiple accounts. In this work, we propose Lossless Anti-Distillation Sampling (LADS), a novel sampling scheme specifically designed to counter multi-account distillation while maintaining a lossless experience for benign users. Concretely, LADS derives the randomness underlying each generation from a private seed determined by the semantic content of the query and the number of times the user has queried the model. By construction, every benign user receives a response independently sampled from the original model at each visit, and thus experiences no distortion. In contrast, for a distiller, different accounts share latent randomness whenever their queries fall in the same semantic bucket. As a result, the harvested data becomes correlated, potentially reducing sample diversity and degrading generalization. Using uniform convergence theory, we show that LADS provably degrades the convergence rate of the distillers generalization gap relative to standard i.i.d. sampling in both unconditional and conditional generation settings. Experiments on image generation, mathematical reasoning, and code generation confirm that LADS substantially degrades the performance of distilled students while preserving exact statistical fidelity for individual users.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Lossless Anti-Distillation Sampling (LADS), which derives generation randomness from a private seed based on query semantics and per-user query count. Benign users receive independent samples with no quality loss, while distillers using multiple accounts obtain positively correlated samples within semantic buckets. The central theoretical claim is that uniform convergence theory establishes a provably slower convergence rate for the distiller's generalization gap relative to i.i.d. sampling in both unconditional and conditional settings. Experiments across image generation, mathematical reasoning, and code generation are reported to confirm substantial degradation in distilled student performance.
Significance. If the dependence-adjusted uniform convergence argument holds, the work supplies a practical, zero-cost defense against multi-account distillation that preserves exact statistical fidelity for legitimate users. The three-domain experiments supply direct empirical support for the degradation effect. A notable strength is the explicit separation between user experience (lossless) and attacker utility (degraded), together with the attempt to ground the defense in a classical learning-theoretic tool.
major comments (2)
- [§4] §4 (Uniform Convergence Analysis): the derivation applies standard i.i.d. uniform convergence bounds directly to the LADS-induced samples without deriving a quantitative dependence measure (e.g., β-mixing coefficients, covariance decay rate, or effective sample size n_eff < n). Consequently it is not shown that the generalization-gap rate is strictly slower than the classical O(1/√n) rather than merely carrying a worse constant.
- [§4.2] §4.2 (Conditional Generation Setting): the extension claims degradation for conditional sampling but provides no explicit bound that accounts for the interaction between the conditioning variable and the semantic-bucket dependence; the argument therefore does not yet establish the claimed rate separation relative to i.i.d. conditional sampling.
minor comments (2)
- [§3] The operational definition of semantic buckets (including how queries are mapped and how bucket collisions are detected across accounts) is described at a high level; a short pseudocode block or concrete example would improve clarity and reproducibility.
- [§5] Table 2 and Figure 3 report performance gaps but omit standard errors or statistical significance tests; adding these would strengthen the empirical claim that degradation is robust rather than an artifact of particular seeds or splits.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We agree that the uniform convergence sections require strengthening with explicit quantitative dependence measures, and we will revise the manuscript to address both points.
read point-by-point responses
-
Referee: [§4] §4 (Uniform Convergence Analysis): the derivation applies standard i.i.d. uniform convergence bounds directly to the LADS-induced samples without deriving a quantitative dependence measure (e.g., β-mixing coefficients, covariance decay rate, or effective sample size n_eff < n). Consequently it is not shown that the generalization-gap rate is strictly slower than the classical O(1/√n) rather than merely carrying a worse constant.
Authors: We thank the referee for this observation. While the positive intra-bucket correlations induced by shared seeds intuitively reduce sample diversity and thereby enlarge the generalization gap, the current write-up does not quantify the dependence. In the revised manuscript we will add an explicit β-mixing analysis: we model the query sequence as a Markov chain over semantic buckets and derive a mixing coefficient β(k) that decays with the number of distinct buckets visited. From this we obtain an effective sample size n_eff = n / (1 + ρ) with ρ > 0 determined by the bucket collision probability, yielding a uniform convergence bound whose leading term is strictly larger than the classical i.i.d. O(1/√n) bound for any finite n. This establishes the claimed rate separation. revision: yes
-
Referee: [§4.2] §4.2 (Conditional Generation Setting): the extension claims degradation for conditional sampling but provides no explicit bound that accounts for the interaction between the conditioning variable and the semantic-bucket dependence; the argument therefore does not yet establish the claimed rate separation relative to i.i.d. conditional sampling.
Authors: We agree that the conditional case needs a more careful treatment. In the revision we will derive an explicit uniform convergence bound for the conditional setting by considering the joint measure over (query, response) pairs. The semantic-bucket dependence is now conditioned on the query embedding; we will show that the resulting mixing coefficient remains strictly positive and obtain a generalization-gap bound that is larger than the corresponding i.i.d. conditional bound by a factor that depends on the bucket collision probability under the conditional distribution. This establishes the rate separation for the conditional case as well. revision: yes
Circularity Check
No significant circularity: derivation applies external uniform convergence theory
full rationale
The paper's load-bearing theoretical step invokes standard uniform convergence theory to conclude that LADS-induced positive dependence (via semantic bucketing and shared seeds) degrades the distiller's generalization-gap convergence rate relative to i.i.d. sampling. No equations or text in the provided abstract or description reduce this claim to a quantity defined by the authors' own fits, self-citations, or ansatzes; the result is framed as a direct consequence of applying existing statistical learning bounds to the described sampling process. The derivation is therefore self-contained against external benchmarks and does not exhibit any of the enumerated circular patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Uniform convergence theory can be applied to bound the generalization gap of a student model trained on correlated samples produced by LADS.
Reference graph
Works this paper leans on
-
[5]
Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
Defending against model stealing attacks with adaptive misinformation , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
-
[6]
Advances in Neural Information Processing Systems , volume=
Watermarking makes language models radioactive , author=. Advances in Neural Information Processing Systems , volume=
-
[7]
International Conference on Machine Learning , pages=
Protecting language generation models via invisible watermarking , author=. International Conference on Machine Learning , pages=. 2023 , organization=
work page 2023
-
[8]
2019 IEEE European Symposium on Security and Privacy (EuroS&P) , pages=
PRADA: protecting against DNN model stealing attacks , author=. 2019 IEEE European Symposium on Security and Privacy (EuroS&P) , pages=. 2019 , organization=
work page 2019
-
[9]
Proceedings of the AAAI Conference on Artificial Intelligence , volume=
Defense against model stealing based on account-aware distribution discrepancy , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=
-
[10]
IEEE Transactions on Information Forensics and Security , year=
Queen: Query unlearning against model extraction , author=. IEEE Transactions on Information Forensics and Security , year=
-
[11]
The Annals of Probability , pages=
Rates of convergence for empirical processes of stationary mixing sequences , author=. The Annals of Probability , pages=. 1994 , publisher=
work page 1994
-
[12]
Advances in neural information processing systems , volume=
Rademacher complexity bounds for non-iid processes , author=. Advances in neural information processing systems , volume=
-
[13]
The nature of statistical learning theory , author=. 2013 , publisher=
work page 2013
-
[14]
Advances in neural information processing systems , volume=
Denoising diffusion probabilistic models , author=. Advances in neural information processing systems , volume=
-
[15]
Score-Based Generative Modeling through Stochastic Differential Equations , author=. 2021 , eprint=
work page 2021
-
[16]
Elucidating the Design Space of Diffusion-Based Generative Models , author=. 2022 , eprint=
work page 2022
-
[17]
Statistical theory of extreme values and some practical applications: a series of lectures , author=. 1954 , publisher=
work page 1954
-
[18]
Advances in neural information processing systems , volume=
A* sampling , author=. Advances in neural information processing systems , volume=
-
[19]
Advances in neural information processing systems , volume=
Generative modeling by estimating gradients of the data distribution , author=. Advances in neural information processing systems , volume=
-
[20]
Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
Analyzing and improving the training dynamics of diffusion models , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=
- [21]
-
[22]
Journal of Mathematical Psychology , volume=
The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution , author=. Journal of Mathematical Psychology , volume=. 1977 , publisher=
work page 1977
-
[23]
2011 international conference on computer vision , pages=
Perturb-and-map random fields: Using discrete optimization to learn and sample from energy models , author=. 2011 international conference on computer vision , pages=. 2011 , organization=
work page 2011
-
[25]
A vector-contraction inequality for
Maurer, Andreas , booktitle=. A vector-contraction inequality for. 2016 , organization=
work page 2016
-
[26]
Advances in Neural Information Processing Systems (NeurIPS) , year=
On the complexity of linear prediction: Risk bounds, margin bounds, and regularization , author=. Advances in Neural Information Processing Systems (NeurIPS) , year=
-
[27]
Foundations of Machine Learning , author=
-
[28]
and Mendelson, Shahar , journal=
Bartlett, Peter L. and Mendelson, Shahar , journal=. Rademacher and
-
[29]
Koltchinskii, Vladimir , journal=. Local. 2006 , publisher=
work page 2006
-
[30]
High-Dimensional Statistics: A Non-Asymptotic Viewpoint , author=
- [31]
-
[32]
Advances in neural information processing systems , volume=
Gans trained by a two time-scale update rule converge to a local nash equilibrium , author=. Advances in neural information processing systems , volume=
-
[33]
Advances in Neural Information Processing Systems , volume=
Dart-math: Difficulty-aware rejection tuning for mathematical problem-solving , author=. Advances in Neural Information Processing Systems , volume=
-
[34]
Mathfusion: Enhancing mathematical problem-solving of llm through instruction fusion , author=. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=
-
[37]
Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Zhang, Mingchuan and Li, Y. K. and Wu, Y. and Guo, Daya , title =. CoRR , volume =. 2024 , url =
work page 2024
-
[39]
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models , author=. arXiv preprint arXiv:2506.05176 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[40]
Advances in neural information processing systems , volume=
Generative adversarial nets , author=. Advances in neural information processing systems , volume=
-
[41]
Auto-Encoding Variational Bayes
Auto-encoding variational bayes , author=. arXiv preprint arXiv:1312.6114 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[42]
Improving language understanding by generative pre-training , author=. 2018 , publisher=
work page 2018
-
[43]
NeurIPS Deep Learning and Representation Learning Workshop , year=
Distilling the Knowledge in a Neural Network , author=. NeurIPS Deep Learning and Representation Learning Workshop , year=
-
[44]
DistilBERT, a distilled version of
Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas , booktitle=. DistilBERT, a distilled version of
-
[45]
Sequence-Level Knowledge Distillation , author=. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP) , pages=
work page 2016
-
[46]
Stealing Machine Learning Models via Prediction
Tram. Stealing Machine Learning Models via Prediction. 25th USENIX Security Symposium (USENIX Security 16) , pages=
-
[47]
and Papernot, Nicolas and Iyyer, Mohit , booktitle=
Krishna, Kalpesh and Tomar, Gaurav Singh and Parikh, Ankur P. and Papernot, Nicolas and Iyyer, Mohit , booktitle=. Thieves on Sesame Street!
-
[48]
Imitation Attacks and Defenses for Black-box Machine Translation Systems , author=. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) , pages=
work page 2020
-
[49]
Romero, Adriana and Ballas, Nicolas and Kahou, Samira Ebrahimi and Chassang, Antoine and Gatta, Carlo and Bengio, Yoshua , booktitle=
-
[50]
Proceedings of the 35th International Conference on Machine Learning (ICML) , year=
Born-Again Neural Networks , author=. Proceedings of the 35th International Conference on Machine Learning (ICML) , year=
-
[51]
International Conference on Learning Representations (ICLR) , year=
Contrastive Representation Distillation , author=. International Conference on Learning Representations (ICLR) , year=
-
[52]
Teaching Small Language Models to Reason , author=. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL) , year=
-
[53]
Hsieh, Cheng-Yu and Li, Chun-Liang and Yeh, Chih-Kuan and Nakhost, Hootan and Fujii, Yasuhisa and Ratner, Alexander and Krishna, Ranjay and Lee, Chen-Yu and Pfister, Tomas , booktitle=. Distilling Step-by-Step!
-
[54]
Gu, Yuxian and Dong, Li and Wei, Furu and Huang, Minlie , booktitle=
-
[55]
Practical Black-Box Attacks Against Machine Learning , author=. Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security (ASIACCS) , pages=
work page 2017
-
[56]
29th USENIX Security Symposium (USENIX Security 20) , pages=
High Accuracy and High Fidelity Extraction of Neural Networks , author=. 29th USENIX Security Symposium (USENIX Security 20) , pages=
-
[57]
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , pages=
Data-Free Model Extraction , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , pages=
-
[58]
Pal, Soham and Gupta, Yash and Shukla, Aditya and Kanade, Aditya and Shevade, Shirish and Ganapathy, Vinod , booktitle=
-
[59]
Proceedings of the 41st International Conference on Machine Learning (ICML) , year=
Stealing Part of a Production Language Model , author=. Proceedings of the 41st International Conference on Machine Learning (ICML) , year=
-
[60]
27th USENIX Security Symposium (USENIX Security 18) , pages=
Turning Your Weakness into a Strength: Watermarking Deep Neural Networks by Backdooring , author=. 27th USENIX Security Symposium (USENIX Security 18) , pages=
-
[61]
Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval (ICMR) , pages=
Embedding Watermarks into Deep Neural Networks , author=. Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval (ICMR) , pages=
work page 2017
-
[62]
30th USENIX Security Symposium (USENIX Security 21) , pages=
Entangled Watermarks as a Defense against Model Extraction , author=. 30th USENIX Security Symposium (USENIX Security 21) , pages=
-
[63]
2009 IEEE conference on computer vision and pattern recognition , pages=
Imagenet: A large-scale hierarchical image database , author=. 2009 IEEE conference on computer vision and pattern recognition , pages=. 2009 , organization=
work page 2009
-
[64]
International conference on machine learning , pages=
Deep unsupervised learning using nonequilibrium thermodynamics , author=. International conference on machine learning , pages=. 2015 , organization=
work page 2015
- [67]
-
[68]
Qwen2.5-Coder Technical Report
Qwen2. 5-coder technical report , author=. arXiv preprint arXiv:2409.12186 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[71]
Turning your weakness into a strength: Watermarking deep neural networks by backdooring
Yossi Adi, Carsten Baum, Moustapha Ciss \'e , Benny Pinkas, and Joseph Keshet. Turning your weakness into a strength: Watermarking deep neural networks by backdooring. In 27th USENIX Security Symposium (USENIX Security 18), pages 1615--1631, 2018
work page 2018
-
[72]
Peter L. Bartlett and Shahar Mendelson. Rademacher and G aussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3: 0 463--482, 2002
work page 2002
-
[73]
Stealing part of a production language model
Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, et al. Stealing part of a production language model. Proceedings of the 41st International Conference on Machine Learning (ICML), 2024
work page 2024
-
[74]
Queen: Query unlearning against model extraction
Huajie Chen, Tianqing Zhu, Lefeng Zhang, Bo Liu, Derui Wang, Wanlei Zhou, and Minhui Xue. Queen: Query unlearning against model extraction. IEEE Transactions on Information Forensics and Security, 2025
work page 2025
-
[75]
Evaluating Large Language Models Trained on Code
Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde De Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, et al. Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374, 2021
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[76]
Training Verifiers to Solve Math Word Problems
Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, et al. Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168, 2021
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[77]
Imagenet: A large-scale hierarchical image database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pages 248--255. Ieee, 2009
work page 2009
-
[78]
Tommaso Furlanello, Zachary Lipton, Michael Tschannen, Laurent Itti, and Anima Anandkumar. Born-again neural networks. In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018
work page 2018
-
[79]
Ian J Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. Advances in neural information processing systems, 27, 2014
work page 2014
-
[80]
Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al. The llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[81]
M ini LLM : Knowledge distillation of large language models
Yuxian Gu, Li Dong, Furu Wei, and Minlie Huang. M ini LLM : Knowledge distillation of large language models. In International Conference on Learning Representations (ICLR), 2024
work page 2024
-
[82]
Emil Julius Gumbel. Statistical theory of extreme values and some practical applications: a series of lectures, volume 33. US Government Printing Office, 1954
work page 1954
-
[83]
On the Partition Function and Random Maximum A-Posteriori Perturbations
Tamir Hazan and Tommi Jaakkola. On the partition function and random maximum a-posteriori perturbations. arXiv preprint arXiv:1206.6410, 2012
work page internal anchor Pith review Pith/arXiv arXiv 2012
-
[84]
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, and Jacob Steinhardt. Measuring mathematical problem solving with the math dataset. arXiv preprint arXiv:2103.03874, 2021
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[85]
Gans trained by a two time-scale update rule converge to a local nash equilibrium
Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in neural information processing systems, 30, 2017
work page 2017
-
[86]
Distilling the knowledge in a neural network
Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. In NeurIPS Deep Learning and Representation Learning Workshop, 2015
work page 2015
-
[87]
Denoising diffusion probabilistic models
Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in neural information processing systems, 33: 0 6840--6851, 2020
work page 2020
-
[88]
Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, and Tomas Pfister. Distilling step-by-step! O utperforming larger language models with less training data and smaller model sizes. In Findings of the Association for Computational Linguistics (ACL Findings), 2023
work page 2023
-
[89]
High accuracy and high fidelity extraction of neural networks
Matthew Jagielski, Nicholas Carlini, David Berthelot, Alex Kurakin, and Nicolas Papernot. High accuracy and high fidelity extraction of neural networks. In 29th USENIX Security Symposium (USENIX Security 20), pages 1345--1362, 2020
work page 2020
-
[90]
Choquette-Choo, Varun Chandrasekaran, and Nicolas Papernot
Hengrui Jia, Christopher A. Choquette-Choo, Varun Chandrasekaran, and Nicolas Papernot. Entangled watermarks as a defense against model extraction. In 30th USENIX Security Symposium (USENIX Security 21), pages 1937--1954, 2021
work page 1937
-
[91]
Prada: protecting against dnn model stealing attacks
Mika Juuti, Sebastian Szyller, Samuel Marchal, and N Asokan. Prada: protecting against dnn model stealing attacks. In 2019 IEEE European Symposium on Security and Privacy (EuroS&P), pages 512--527. IEEE, 2019
work page 2019
-
[92]
Defending against model stealing attacks with adaptive misinformation
Sanjay Kariyappa and Moinuddin K Qureshi. Defending against model stealing attacks with adaptive misinformation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 770--778, 2020
work page 2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.