Towards the Anonymization of the Language Modeling
Pith reviewed 2026-05-23 06:25 UTC · model grok-4.3
The pith
Modified masking and causal training schemes let language models specialize on private data without memorizing direct or indirect identifiers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that its masking language modeling methodology for BERT-like models and causal language modeling methodology for GPT-like models avoid memorizing both direct and indirect identifiers during specialization on a medical dataset, delivering a favorable privacy-utility tradeoff compared with ordinary fine-tuning baselines.
What carries the argument
Masking language modeling and causal language modeling schemes that deliberately exclude direct and indirect identifying information from the training signal during model specialization.
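One way to picture "excluding identifiers from the training signal" is label masking: positions flagged as identifiers get a sentinel label and are skipped when the loss is averaged. The sketch below is an illustration under that assumption, not the paper's actual scheme, which this page does not spell out. The names `build_labels`, `masked_cross_entropy`, and `identifier_positions` are hypothetical, and the sentinel value -100 is borrowed from the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
import math

# Hypothetical sentinel; mirrors the default ignore_index of PyTorch's CrossEntropyLoss.
IGNORE_INDEX = -100


def build_labels(token_ids, identifier_positions):
    """Copy token_ids, replacing positions flagged as identifiers with IGNORE_INDEX."""
    blocked = set(identifier_positions)
    return [IGNORE_INDEX if i in blocked else tok for i, tok in enumerate(token_ids)]


def masked_cross_entropy(probs, labels):
    """Mean negative log-likelihood over positions whose label is not ignored.

    probs[i] is a dict mapping token id -> predicted probability at position i.
    """
    losses = [
        -math.log(p[label])
        for p, label in zip(probs, labels)
        if label != IGNORE_INDEX
    ]
    return sum(losses) / len(losses) if losses else 0.0


# Toy example (made-up ids): position 1 is assumed to hold an identifier token.
token_ids = [11, 42, 7, 99]
labels = build_labels(token_ids, identifier_positions=[1])
```

Because the flagged position contributes nothing to the loss, no gradient pushes the model toward predicting that token. Whether this is enough to stop indirect identifiers from being inferred from context is exactly what the evaluation would need to show.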
If this is right
- Specialized models can be shared publicly while lowering the risk of leaking patient or user identifiers.
- The same schemes apply to both encoder-only and decoder-only architectures.
- Privacy gains come from changes to the training objective rather than post-hoc data filtering.
- The methods maintain downstream task performance close to unmodified fine-tuning.
Where Pith is reading between the lines
- Similar identifier-avoidance objectives could be tested on non-medical sensitive text such as legal or financial records.
- The schemes might reduce the amount of manual anonymization required before training.
- Combining the approach with existing differential privacy methods could further strengthen guarantees.
Load-bearing premise
The proposed masking and causal schemes, when run on a medical dataset and measured against baselines, will avoid identifier memorization without large drops in model utility.
What would settle it
A model trained with the proposed schemes that still produces direct or indirect identifiers from its training data or shows substantially lower accuracy on held-out medical tasks than standard fine-tuning.
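The first half of this test can be sketched as a leakage check: sample text from the specialized model, then count how many known training-set identifiers reappear. The code below is a minimal sketch assuming verbatim, case-insensitive matching. The function name `leak_rate` and the sample strings are hypothetical and are not taken from the paper.

```python
def leak_rate(generations, identifiers):
    """Fraction of known identifiers that appear (case-insensitively) in any generation."""
    if not identifiers:
        return 0.0
    lowered = [g.lower() for g in generations]
    leaked = [
        ident for ident in identifiers
        if any(ident.lower() in g for g in lowered)
    ]
    return len(leaked) / len(identifiers)


# Made-up model outputs and made-up identifiers, for illustration only.
generations = [
    "Patient Jane Roe was admitted on Monday.",
    "The patient reports mild pain.",
]
identifiers = ["Jane Roe", "555-0100"]
rate = leak_rate(generations, identifiers)
```

Verbatim matching only catches direct reproduction. An indirect identifier that is paraphrased or reconstructed from context would slip past this check, so a rate of zero here would not settle the question on its own.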
Original abstract
Rapid advances in Natural Language Processing (NLP) have revolutionized many fields, including healthcare. However, these advances raise significant privacy concerns, especially when pre-trained models fine-tuned and specialized on sensitive data can memorize and then expose and regurgitate personal information. This paper presents a privacy-preserving language modeling approach to address the problem of language models anonymization, and thus promote their sharing. Specifically, we propose both a Masking Language Modeling (MLM) methodology to specialize a BERT-like language model, and a Causal Language Modeling (CLM) methodology to specialize a GPT-like model that avoids the model from memorizing direct and indirect identifying information present in the training data. We have comprehensively evaluated our approaches using a medical dataset and compared them against different baselines. Our results indicate that by avoiding memorizing both direct and indirect identifiers during model specialization, our masking and causal language modeling schemes offer a good tradeoff for maintaining high privacy while retaining high utility.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a Masking Language Modeling (MLM) scheme to specialize BERT-like models and a Causal Language Modeling (CLM) scheme to specialize GPT-like models. Both are designed to avoid memorizing direct and indirect identifiers during specialization on sensitive data (e.g., medical records). The central claim is that these schemes, when evaluated on a medical dataset against baselines, achieve a favorable privacy-utility tradeoff.
Significance. If the empirical results demonstrate effective avoidance of identifier memorization while preserving utility, the work would be relevant for privacy-preserving specialization of language models in regulated domains such as healthcare. The approach is empirical rather than theoretical and does not introduce new formal bounds or proofs.
major comments (2)
- [Abstract] Abstract: The text asserts that the approaches were 'comprehensively evaluated' on a medical dataset 'against different baselines' and that results show a 'good tradeoff,' yet supplies no quantitative metrics (e.g., privacy leakage rates, utility scores such as perplexity or downstream task accuracy), no named baselines, no dataset statistics or split details, and no error analysis. This absence prevents verification of the central empirical claim.
- [Abstract] Abstract (and implied evaluation section): The claim that the schemes 'avoid memorizing both direct and indirect identifiers' is presented as the outcome of the masking and causal modeling choices, but without reported measurements of memorization (e.g., via membership inference, identifier extraction attacks, or differential privacy metrics) it is impossible to determine whether the privacy side of the tradeoff holds.
minor comments (2)
- [Title] Title: 'Towards the Anonymization of the Language Modeling' is grammatically awkward; consider rephrasing to 'Towards Anonymization of Language Models' or similar.
- [Abstract] Abstract: The phrase 'language models anonymization' should be revised to 'anonymization of language models' for clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting the need for greater specificity in the abstract. We agree that the abstract should better support the central claims and will revise it accordingly while preserving its concise nature. We address each major comment below.
- Referee: [Abstract] Abstract: The text asserts that the approaches were 'comprehensively evaluated' on a medical dataset 'against different baselines' and that results show a 'good tradeoff,' yet supplies no quantitative metrics (e.g., privacy leakage rates, utility scores such as perplexity or downstream task accuracy), no named baselines, no dataset statistics or split details, and no error analysis. This absence prevents verification of the central empirical claim.
  Authors: We agree that the abstract would benefit from additional quantitative detail to facilitate immediate assessment of the results. In the revised manuscript we will incorporate key metrics (privacy leakage rates, utility scores such as perplexity and downstream accuracy), name the baselines, and briefly note dataset statistics and splits. The full evaluation, including error analysis, is already reported in the Experiments section; the abstract revision will reference these results at a summary level.
  Revision: yes
- Referee: [Abstract] Abstract (and implied evaluation section): The claim that the schemes 'avoid memorizing both direct and indirect identifiers' is presented as the outcome of the masking and causal modeling choices, but without reported measurements of memorization (e.g., via membership inference, identifier extraction attacks, or differential privacy metrics) it is impossible to determine whether the privacy side of the tradeoff holds.
  Authors: The manuscript reports empirical measurements of memorization via identifier extraction attacks and membership inference attacks on the specialized models; these results are presented in the evaluation to quantify the privacy-utility tradeoff. We will revise the abstract to explicitly cite these measurements and the observed leakage rates so that the privacy claim is directly supported by the reported evidence.
  Revision: yes
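For readers unfamiliar with the membership inference measurement mentioned in this exchange, the simplest variant is a loss-threshold attack: an example is guessed to be a training member when the model's loss on it falls below a threshold. The sketch below assumes that variant. The paper's actual attack setup is not described on this page, and the function names and loss values are made up for illustration.

```python
def threshold_attack(losses, threshold):
    """Guess 'member' (True) for each example whose loss is below the threshold."""
    return [loss < threshold for loss in losses]


def attack_rates(member_losses, nonmember_losses, threshold):
    """Return (true positive rate, false positive rate) of the threshold attack."""
    true_positives = sum(threshold_attack(member_losses, threshold))
    false_positives = sum(threshold_attack(nonmember_losses, threshold))
    return (
        true_positives / len(member_losses),
        false_positives / len(nonmember_losses),
    )


# Made-up per-example losses: members tend to have lower loss than non-members.
member_losses = [0.2, 0.4, 1.5]
nonmember_losses = [1.1, 2.0, 0.3]
tpr, fpr = attack_rates(member_losses, nonmember_losses, threshold=0.5)
```

A defense that works should push the true positive rate toward the false positive rate, meaning the attacker does no better than guessing at that threshold.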
Circularity Check
No significant circularity identified
The paper proposes empirical MLM and CLM specialization schemes for privacy-preserving language modeling on sensitive data, evaluated via experiments on a medical dataset against baselines. No equations, derivations, or self-citation chains appear in the abstract or described approach. Claims rest on experimental demonstration of identifier avoidance and utility retention rather than any reduction to fitted inputs, self-definitions, or imported uniqueness results. There is no derivation chain to audit, and the claims are checked against external benchmarks rather than against quantities the paper defines itself.