Privacy Leakage via Output Label Space and Differentially Private Continual Learning

Antti Honkela; Arno Solin; Marcus Klasson; Marlon Tobaben; Mikko Heikkilä; Talal Alrawajfeh

arXiv: 2411.04680 · v5 · submitted 2024-11-07 · cs.LG · cs.CR

Marlon Tobaben, Talal Alrawajfeh, Marcus Klasson, Mikko Heikkilä, Arno Solin, Antti Honkela

Pith reviewed 2026-05-23 17:15 UTC · model grok-4.3

keywords: differential privacy · continual learning · privacy side-channel · output label space · membership inference · Split-CIFAR-100 · Split-ImageNet-R

The pith

The output label space of a classification model acts as a privacy side-channel that leaks information even when training uses differential privacy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies the set of output labels a model can predict as an observable side-channel that reveals which data the model has seen. In continual learning the label space grows over time, so an attacker who sees the labels can infer membership or other private details even if the model weights satisfy DP. The authors formalise differential privacy for continual learning to handle changing label spaces and propose two fixes: releasing labels through an optimal DP mechanism or expanding to a large public label set. They adapt pre-trained models and show higher accuracy than prior DP continual learning work on Split-CIFAR-100 and Split-ImageNet-R while using a stronger privacy definition.

Core claim

The output label space of a classification model is a privacy side-channel; a concrete attack exploits it in continual learning, and two mitigation methods (DP label release or large public label space) allow models to achieve higher accuracy under a formal DP guarantee for continual learning than previous approaches on Split-CIFAR-100 and Split-ImageNet-R.
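
To make the side-channel concrete, here is a minimal Python sketch of a naive label-space release and an adversary that exploits it. The function names (`naive_label_space`, `label_space_attack`) and the toy data are illustrative assumptions, not the paper's attack code.

```python
from typing import Hashable, Iterable, Set, Tuple

Example = Tuple[str, Hashable]  # (feature placeholder, label)


def naive_label_space(dataset: Iterable[Example]) -> Set[Hashable]:
    """Non-private release: the label space is exactly the set of labels in the data."""
    return {label for _, label in dataset}


def label_space_attack(released: Set[Hashable], target_label: Hashable) -> bool:
    """Adversary's guess: a record with `target_label` was used iff the label is released."""
    return target_label in released


# Two adjacent datasets that differ in one record carrying a rare label.
d_without = [("x1", "cat"), ("x2", "dog")]
d_with = d_without + [("x3", "rare_disease")]

guess_with = label_space_attack(naive_label_space(d_with), "rare_disease")
guess_without = label_space_attack(naive_label_space(d_without), "rare_disease")
print(guess_with, guess_without)  # True False
```

In this toy setting the adversary separates the two datasets with certainty, however much noise protects the weights, which is why the release of the label set itself has to be covered by the privacy guarantee.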

What carries the argument

Formalisation of differential privacy for continual learning that accounts for a time-varying output label space and treats the label space itself as observable to the adversary.
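
In standard DP notation, one way to read this is sketched below. The symbols $\mathcal{M}_t$, $\theta_t$ and $O_t$ are our shorthand for the task-$t$ mechanism, its released parameters and its released label space; the paper's exact definition may differ in its adjacency relation and quantifiers.

```latex
\mathcal{M}_t(D_t) = (\theta_t, O_t), \qquad
\Pr[\mathcal{M}_t(D_t) \in S] \le e^{\epsilon} \Pr[\mathcal{M}_t(D'_t) \in S] + \delta
\quad \text{for all adjacent } D_t \simeq D'_t \text{ and all measurable } S.
```

Under this reading the label space is part of the output that the inequality constrains, rather than side information released for free.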

If this is right

Models trained with the two proposed label-space mitigations satisfy the new DP definition for continual learning.
Accuracy under the stronger privacy model exceeds that of earlier DP continual learning methods on the two split benchmarks.
The side-channel exists whenever the label space is observable, even if weights are DP-protected.
Pre-trained models can be adapted while preserving the privacy guarantee after label-space handling.
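
The public-label-space mitigation can be illustrated with a short Python sketch. The list `PUBLIC_LABELS` and the helper `public_label_space` are hypothetical; the point is only that the released label set is fixed before the sensitive data is seen.

```python
from typing import Hashable, Iterable, List, Tuple

# Hypothetical public label set, chosen without looking at the sensitive data.
PUBLIC_LABELS: List[str] = ["cat", "dog", "rare_disease", "unused_dummy_label"]


def public_label_space(dataset: Iterable[Tuple[str, Hashable]]) -> List[str]:
    """Return the label space; the dataset argument is deliberately ignored."""
    return list(PUBLIC_LABELS)


d_without = [("x1", "cat"), ("x2", "dog")]
d_with = d_without + [("x3", "rare_disease")]

# Adjacent datasets yield identical label spaces, so the label set carries no signal.
same = public_label_space(d_with) == public_label_space(d_without)
print(same)  # True
```

The cost is a classifier head with outputs for labels that may never occur in the data, which is the utility trade-off the Figure 3 caption describes in terms of dummy labels.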

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the label space attack generalises beyond the two benchmarks, many deployed continual learners would need label-space protection regardless of weight-level DP.
Using a fixed large public label space may trade off task-specific accuracy for privacy in settings where new classes arrive frequently.
The formalisation could be extended to other side-channels such as model output confidence scores or decision thresholds.

Load-bearing premise

The formal definition of DP for continual learning correctly models the privacy risk when the set of possible output labels changes over time and remains visible to an attacker.

What would settle it

Run the described label-space attack on a DP-trained continual learner whose label space is released without the proposed mitigation and measure whether membership inference or data reconstruction succeeds at a rate above the DP bound.
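
A standard consequence of (ϵ, δ)-DP is that any membership test with false-positive rate FPR has a true-positive rate of at most min(e^ϵ · FPR + δ, 1 − e^(−ϵ) · (1 − δ − FPR)). The Python sketch below uses that bound to flag an attack result that is incompatible with a claimed guarantee. The function names and the example rates are assumptions for illustration, not measurements from the paper.

```python
import math


def max_tpr_under_dp(fpr: float, eps: float, delta: float) -> float:
    """Largest true-positive rate any attack can reach against an (eps, delta)-DP mechanism."""
    bound_a = math.exp(eps) * fpr + delta
    bound_b = 1.0 - math.exp(-eps) * (1.0 - delta - fpr)
    return max(0.0, min(1.0, bound_a, bound_b))


def violates_dp(tpr: float, fpr: float, eps: float, delta: float) -> bool:
    """True if the observed attack rates exceed what (eps, delta)-DP allows."""
    return tpr > max_tpr_under_dp(fpr, eps, delta)


# Hypothetical outcome of a label-space attack on an unprotected label release:
# the rare label is observed only when the target record is present.
print(violates_dp(tpr=1.0, fpr=0.0, eps=1.0, delta=1e-5))  # True
```

With measured rates, the comparison should use confidence intervals on TPR and FPR rather than point estimates, since a finite number of attack trials can overshoot the bound by chance.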

Figures

Figures reproduced from arXiv: 2411.04680 by Antti Honkela, Arno Solin, Marcus Klasson, Marlon Tobaben, Mikko Heikkilä, Talal Alrawajfeh.

**Figure 2.** Lower bound of the probability that a new label is not added to the output label space O_t. Even with ϵ = 1.0 and δ = 10⁻⁷, classes having fewer than 13 samples are discarded with at least 99% probability, and thus cannot be learned. Regarding the last point, to quantify how likely it is to drop a label, we compute a tight lower bound on this probability. Suppose y appears in k examples of D_t. Dropping y e…

**Figure 3.** Split-CIFAR-100 at ϵ = 1 (left) and Split-ImageNet-R at ϵ = 8 (right) at δ = 10⁻⁵: Both methods only decrease slightly in utility when greatly increasing the number of assumed labels through dummy labels. Bad label match affects Cosine Classifier. Similar observations and detailed results in App. F.1. 1. Match of output label space based on prior information O_t^prior and labels that occur in the data: In …

**Figure 4.** Blurry tasks on Split-CIFAR-100 (left: data distribution per task, right: final acc with …

**Figure 5.** Split-CIFAR-100 (left) and Split-ImageNet-R (right) with …
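
The threshold quoted in the Figure 2 caption can be sanity-checked with a standard group-privacy argument. Assuming add/remove adjacency and that a label absent from the data is never released, any (ϵ, δ)-DP release satisfies p_k ≤ e^ϵ · p_(k−1) + δ with p_0 = 0, so a label with k examples is released with probability at most δ(e^(kϵ) − 1)/(e^ϵ − 1). The Python sketch below evaluates that bound. It is our reconstruction under those assumptions and may not be the exact expression the paper derives.

```python
import math


def release_upper_bound(k: int, eps: float, delta: float) -> float:
    """Upper bound on P[label released | k examples] for any (eps, delta)-DP release,
    assuming add/remove adjacency and zero release probability at k = 0."""
    return min(1.0, delta * math.expm1(k * eps) / math.expm1(eps))


def drop_lower_bound(k: int, eps: float, delta: float) -> float:
    """Lower bound on the probability that a label with k examples is dropped."""
    return 1.0 - release_upper_bound(k, eps, delta)


eps, delta = 1.0, 1e-7
for k in (12, 13):
    print(k, round(drop_lower_bound(k, eps, delta), 4))
```

For ϵ = 1.0 and δ = 10⁻⁷ the bound stays above 0.99 at k = 12 (roughly 0.9905) and falls below it at k = 13 (roughly 0.974), which agrees with the caption's statement about classes with fewer than 13 samples.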

Original abstract

Differential privacy (DP) is a formal privacy framework that enables training machine learning (ML) models while protecting individuals' data. As pointed out by prior work, ML models are part of larger systems, which can lead to so-called privacy side-channels even if the model training itself is DP. We identify the output label space of a classification model as such a privacy side-channel and show a concrete privacy attack that exploits it. The side-channel becomes highly relevant in continual learning (CL), where the output label space changes over time. To reason about privacy guarantees in CL, we introduce a formalisation of DP for CL, which also clarifies how our approach differs from existing approaches. We propose and evaluate two methods for eliminating this side-channel: applying an optimal DP mechanism to release the labels in the sensitive data, and using a large public label space. We explore the trade-offs of these methods through adapting pre-trained models. We demonstrate empirically that our models consistently achieve higher accuracy under DP than previous work over both Split-CIFAR-100 and Split-ImageNet-R, with a stronger privacy model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper flags a label-space side-channel in DP continual learning and gives a new formalization plus mitigations, but it is unclear whether the definition actually bounds the observable label releases.

The main point is that output label spaces can act as a side-channel even when training is DP, and this matters more in continual learning because the label set grows over tasks. They formalize DP for CL to capture that setting and test two fixes: an optimal DP mechanism on the labels themselves, or expanding to a large public label space. They adapt pre-trained models and report higher accuracy than earlier DP-CL work on Split-CIFAR-100 and Split-ImageNet-R while claiming a stronger privacy model overall. That empirical comparison is the clearest new evidence they provide. The attack description is concrete enough to be useful for anyone thinking about deployed CL systems.

The formalization itself is the part that needs the most scrutiny. The abstract does not show how the definition composes across tasks or whether the released label sets are treated as part of the mechanism output that must be bounded by epsilon. If the definition only protects parameters and treats label-space release as free, then the side-channel they identify would sit outside the guarantee, which would weaken the stronger-privacy claim. The experiments use standard splits and report consistent gains, but without error bars or full attack details in the abstract it is hard to judge how robust the accuracy advantage is.

This work is aimed at researchers already working on privacy-preserving continual learning rather than a broad ML audience. A reader who cares about side-channels or DP composition in sequential settings would find the attack and the two mitigation approaches worth reading. The paper is coherent on its own terms and the experiments are reproducible in principle, so it should go to peer review rather than a desk reject.

Referee Report

3 major / 2 minor

Summary. The paper identifies the output label space of classification models as a privacy side-channel that becomes especially relevant in continual learning (CL) because the label space evolves over time. It introduces a new formalization of differential privacy (DP) tailored to CL, proposes two mitigation strategies (applying an optimal DP mechanism to release labels from sensitive data, or using a large public label space), adapts pre-trained models accordingly, and reports empirical results showing higher accuracy than prior work on Split-CIFAR-100 and Split-ImageNet-R under a stronger privacy model while demonstrating a concrete attack exploiting the side-channel.

Significance. If the new DP formalization for CL correctly incorporates observable label-space releases into the mechanism output and composes across tasks, and if the accuracy gains are robust, the work would be significant for closing a previously unaddressed side-channel in privacy-preserving CL. The explicit attack construction and the two mitigation approaches provide concrete, falsifiable contributions that could influence both theory and practice in dynamic learning settings.

major comments (3)

[DP formalization section] The DP formalization for CL (introduced to reason about privacy guarantees when label space changes): the definition must treat the sequence of released label sets as part of the mechanism output distribution; otherwise the side-channel attack identified in the abstract remains outside the stated privacy bound. The abstract and skeptic note give no indication that composition across evolving label spaces is proven or that neighboring task sequences are distinguished only up to the claimed epsilon when label support is observable.
[Experimental evaluation] Empirical claims of higher accuracy under DP (Split-CIFAR-100 and Split-ImageNet-R experiments): without reported error bars, exact data splits, attack implementation details, or baseline epsilon values, it is impossible to verify that the accuracy improvement is not an artifact of weaker effective privacy or different evaluation protocols. The central claim that the new methods achieve both stronger privacy and higher accuracy rests on these results.
[Threat model and formalization] Threat model and label-space observability: the weakest assumption noted (that the DP formalization correctly captures privacy when the output label space changes over time and is observable) is load-bearing; if the definition only protects model parameters while treating label-space release as free, the paper's own attack would violate the guarantee, undermining the “stronger privacy model” assertion.

minor comments (2)

[Abstract] Abstract: the phrase “stronger privacy model” should be accompanied by a concrete comparison (e.g., specific ε values or neighboring-relation definition) rather than left qualitative.
[Methods] Notation for the two proposed methods (DP label release vs. public label space) should be introduced with explicit symbols early so that later empirical tables are immediately interpretable.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive feedback. We address each major comment below, clarifying the DP formalization for continual learning and committing to improvements in the experimental reporting. Our responses aim to strengthen the manuscript while maintaining the integrity of our contributions.

Referee: [DP formalization section] The DP formalization for CL (introduced to reason about privacy guarantees when label space changes): the definition must treat the sequence of released label sets as part of the mechanism output distribution; otherwise the side-channel attack identified in the abstract remains outside the stated privacy bound. The abstract and skeptic note give no indication that composition across evolving label spaces is proven or that neighboring task sequences are distinguished only up to the claimed epsilon when label support is observable.

Authors: We agree that the formalization must include the sequence of released label sets in the mechanism output to properly bound the side-channel. Our DP definition for CL (Section 3) explicitly defines the mechanism output as the pair consisting of the model update and the released label set for each task. The composition theorem accounts for the evolving label spaces by considering the full output distribution across the task sequence. Neighboring task sequences are distinguished only up to the claimed epsilon under this output. We will revise the abstract and add an explicit remark in Section 3 to highlight this aspect.

revision: partial

Referee: [Experimental evaluation] Empirical claims of higher accuracy under DP (Split-CIFAR-100 and Split-ImageNet-R experiments): without reported error bars, exact data splits, attack implementation details, or baseline epsilon values, it is impossible to verify that the accuracy improvement is not an artifact of weaker effective privacy or different evaluation protocols. The central claim that the new methods achieve both stronger privacy and higher accuracy rests on these results.

Authors: We acknowledge that the current experimental section lacks sufficient details for full reproducibility and verification. In the revised manuscript we will report error bars over five independent runs, specify the exact data splits and preprocessing for both Split-CIFAR-100 and Split-ImageNet-R, provide implementation details (including pseudocode) for the concrete attack, and list the precise epsilon values used for all baselines and our methods. These additions will confirm that the reported accuracy gains hold under the stronger privacy model that incorporates label-space releases.

revision: yes

Referee: [Threat model and formalization] Threat model and label-space observability: the weakest assumption noted (that the DP formalization correctly captures privacy when the output label space changes over time and is observable) is load-bearing; if the definition only protects model parameters while treating label-space release as free, the paper's own attack would violate the guarantee, undermining the “stronger privacy model” assertion.

Authors: Our threat model assumes full observability of the label space at each task. The DP formalization for CL is constructed precisely so that the mechanism output includes both model parameters and label-set releases; the privacy guarantee therefore applies to the combined observable output. Consequently the concrete attack lies inside the stated epsilon bound, which is what enables the claim of a stronger privacy model relative to prior work that omits this side-channel.

revision: no

Circularity Check

0 steps flagged

No circularity: new DP formalization and empirical evaluation are independent of inputs

The paper introduces a new formalization of DP for continual learning to address label-space side channels and evaluates two mitigation methods empirically on Split-CIFAR-100 and Split-ImageNet-R. No equations, derivations, or predictions in the abstract or description reduce to fitted parameters, self-definitions, or self-citation chains. The central claims rest on the introduced definition and experimental comparisons, which do not collapse to the inputs by construction. This is the expected non-finding for a paper whose core is a new definition plus benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, axioms, or invented entities used in the formalization or experiments.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
Paper passage: "We introduce a formalisation of DP for CL... task-wise (ϵ, δ)-DP if for all t ∈ I, all adjacent datasets Dt ≃ D′t..."

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "Proposition 4.1... choosing the output label space based on the sensitive data is not DP"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

96 extracted references · 96 canonical work pages · 4 internal anchors

[1]

Catastrophic interference in connectionist networks: The sequential learning problem

Michael McCloskey and Neal J Cohen. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation, volume 24, pages 109–165. Elsevier, 1989. 1

work page 1989
[2]

A continual learning survey: Defying forgetting in classification tasks

Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Aleš Leonardis, Gregory Slabaugh, and Tinne Tuytelaars. A continual learning survey: Defying forgetting in classification tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(7): 3366–3385, 2021. 3

work page 2021
[3]

A comprehensive survey of continual learning: theory, method and application

Liyuan Wang, Xingxing Zhang, Hang Su, and Jun Zhu. A comprehensive survey of continual learning: theory, method and application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(8):5362–5383, 2024. 1

work page 2024
[4]

Catastrophic forgetting in connectionist networks

Robert M French. Catastrophic forgetting in connectionist networks. Trends in Cognitive Sciences, 3(4):128–135, 1999. 1

work page 1999
[5]

Gradient episodic memory for continual learning

David Lopez-Paz and Marc’Aurelio Ranzato. Gradient episodic memory for continual learning. Advances in Neural Information Processing Systems, 30, 2017. 1, 2

work page 2017
[6]

Calibrating noise to sensitivity in private data analysis

Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography, pages 265–284. Springer, 2006. 1, 2, 3

work page 2006
[7]

Membership inference attacks against machine learning models

Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy, SP 2017, pages 3–18. IEEE Computer Society, 2017. 2

work page 2017
[8]

Reconstructing training data with informed adversaries

Borja Balle, Giovanni Cherubin, and Jamie Hayes. Reconstructing training data with informed adversaries. In 43rd IEEE Symposium on Security and Privacy, SP 2022, pages 1138–1156. IEEE, 2022. 10

work page 2022
[9]

Reconstructing training data from trained neural networks

Niv Haim, Gal Vardi, Gilad Yehudai, Ohad Shamir, and Michal Irani. Reconstructing training data from trained neural networks. Advances in Neural Information Processing Systems, 35: 22911–22924, 2022. 2

work page 2022
[10]

A new analysis of differential privacy’s generalization guarantees (invited paper)

Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Moshe Shenfeld. A new analysis of differential privacy’s generalization guarantees (invited paper). In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing , STOC 2021, page 9. Association for Computing Machinery, 2021. 2

work page 2021
[11]

Revealing information while preserving privacy

Irit Dinur and Kobbi Nissim. Revealing information while preserving privacy. In Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 202–210. ACM, 2003. 2

work page 2003
[12]

Brendan McMahan, Sergei Vassilvitskii, Steve Chien, and Abhradeep Guha Thakurta

Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, and Abhradeep Guha Thakurta. How to dp-fy ML: A practical guide to machine learning with differential privacy. Journal of Artificial Intelligence Research, 77:1113–1201, 2023. 2

work page 2023
[13]

Toward training at imagenet scale with differential privacy

Alexey Kurakin, Steve Chien, Shuang Song, Roxana Geambasu, Andreas Terzis, and Abhradeep Thakurta. Toward training at imagenet scale with differential privacy. ArXiv preprint , abs/2201.12328, 2022. 2

work page arXiv 2022
[14]

Smith, and Borja Balle

Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, and Borja Balle. Unlock- ing high-accuracy differentially private image classification through scale. ArXiv preprint, abs/2204.13650, 2022. 33

work page arXiv 2022
[15]

Turner, and Antti Honkela

Marlon Tobaben, Aliaksandra Shysheya, John Bronskill, Andrew Paverd, Shruti Tople, Santi- ago Zanella Béguelin, Richard E. Turner, and Antti Honkela. On the efficacy of differentially private few-shot image classification. Transactions on Machine Learning Research, 2023. 2, 7, 32, 33

work page 2023
[16]

A simple baseline that questions the use of pretrained-models in continual learning

Paul Janson, Wenxuan Zhang, Rahaf Aljundi, and Mohamed Elhoseiny. A simple baseline that questions the use of pretrained-models in continual learning. ArXiv preprint, abs/2210.04428,

work page arXiv
[17]

Dualprompt: Complementary prompting for rehearsal-free continual learning

Zifeng Wang, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, Chen-Yu Lee, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, et al. Dualprompt: Complementary prompting for rehearsal-free continual learning. In European Conference on Computer Vision, pages 631–648. Springer, 2022. 2, 8, 32

work page 2022
[18]

Dy, and Tomas Pfister

Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer G. Dy, and Tomas Pfister. Learning to prompt for continual learning. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, pages 139–149. IEEE, 2022. 2

work page 2022
[19]

Private set generation with discriminative information

Dingfan Chen, Raouf Kerkouche, and Mario Fritz. Private set generation with discriminative information. In Advances in Neural Information Processing Systems 35, NeurIPS 2022, 2022. 2, 3

work page 2022
[20]

Continual learning with differential privacy

Pradnya Desai, Phung Lai, NhatHai Phan, and My T Thai. Continual learning with differential privacy. In Neural Information Processing: 28th International Conference, ICONIP 2021 , pages 334–343. Springer, 2021. 2, 3, 6, 19, 20

work page 2021
[21]

Differentially Private Continual Learning

Sebastian Farquhar and Yarin Gal. Differentially private continual learning. ArXiv preprint, abs/1902.06497, 2019. 3

work page internal anchor Pith review Pith/arXiv arXiv 1902
[22]

Differential privacy preservation in robust continual learning

Ahmad Hassanpour, Majid Moradikia, Bian Yang, Ahmed Abdelhadi, Christoph Busch, and Julian Fierrez. Differential privacy preservation in robust continual learning. IEEE Access, 10: 24273–24287, 2022. 2, 3, 6, 19, 20

work page 2022
[23]

Thai, and An M

Phung Lai, Han Hu, Hai Phan, Ruoming Jin, My T. Thai, and An M. Chen. Lifelong DP: consistently bounded differential privacy in lifelong machine learning. In Conference on Lifelong Learning Agents, CoLLAs 2022 , volume 199 of Proceedings of Machine Learning Research, pages 778–797. PMLR, 2022. 2, 3, 6, 19 11

work page 2022
[24]

A differentially private stochastic gradient descent algorithm for multiparty classification

Arun Rajkumar and Shivani Agarwal. A differentially private stochastic gradient descent algorithm for multiparty classification. In Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2012, volume 22 of JMLR Proceedings, pages 933–941. JMLR.org, 2012. 2, 3

work page 2012
[25]

Shuang Song, Kamalika Chaudhuri, and Anand D. Sarwate. Stochastic gradient descent with differentially private updates. InIEEE Global Conference on Signal and Information Processing, GlobalSIP 2013, pages 245–248. IEEE, 2013

work page 2013
[26]

Goodfellow, H

Martín Abadi, Andy Chu, Ian J. Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 308–318. ACM, 2016. 2, 3, 34

work page 2016
[27]

Towards large scale transfer learning for differentially private image classification

Harsh Mehta, Abhradeep Guha Thakurta, Alexey Kurakin, and Ashok Cutkosky. Towards large scale transfer learning for differentially private image classification. Transactions on Machine Learning Research, 2023, 2023. 2, 33

work page 2023
[28]

Choquette-Choo, Nicolas Papernot, and Abhradeep Thakurta

Yannis Cattan, Christopher A. Choquette-Choo, Nicolas Papernot, and Abhradeep Thakurta. Fine-tuning with differential privacy necessitates an additional hyperparameter search. ArXiv preprint, abs/2210.02156, 2022

work page arXiv 2022
[29]

Large language models can be strong differentially private learners

Xuechen Li, Florian Tramèr, Percy Liang, and Tatsunori Hashimoto. Large language models can be strong differentially private learners. In The Tenth International Conference on Learning Representations, ICLR 2022. OpenReview.net, 2022

work page 2022
[30]

Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, and Huishuai Zhang

Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, and Huishuai Zhang. Differentially private fine-tuning of language models. In The Tenth Interna- tional Conference on Learning Representations, ICLR 2022, 2022

work page 2022
[31]

Privacy-aware document visual question answering

Rubèn Tito, Khanh Nguyen, Marlon Tobaben, Raouf Kerkouche, Mohamed Ali Souibgui, Kangsoo Jung, Joonas Jälkö, Vincent Poulain D’Andecy, Aurélie Joseph, Lei Kang, Ernest Valveny, Antti Honkela, Mario Fritz, and Dimosthenis Karatzas. Privacy-aware document visual question answering. In Document Analysis and Recognition - ICDAR 2024 - 18th International Confe...

work page 2024
[32]

Beyond the mean: Differentially private prototypes for private transfer learning

Dariush Wahdany, Matthew Jagielski, Adam Dziedzic, and Franziska Boenisch. Beyond the mean: Differentially private prototypes for private transfer learning. arXiv preprint arXiv:2406.08039, 2024. 2

work page arXiv 2024
[33]

Position: Considerations for differen- tially private learning with large-scale public pretraining

Florian Tramèr, Gautam Kamath, and Nicholas Carlini. Position: Considerations for differen- tially private learning with large-scale public pretraining. InForty-first International Conference on Machine Learning, ICML 2024, 2024. 2

work page 2024
[34]

Identifying and eliminating csam in generative ml training data and models

David Thiel. Identifying and eliminating csam in generative ml training data and models. Technical report, Technical Report. Stanford University, Palo Alto, CA., 2023. 2

work page 2023
[35]

On Tiny Episodic Memories in Continual Learning

Arslan Chaudhry, Marcus Rohrbach, Mohamed Elhoseiny, Thalaiyasingam Ajanthan, Puneet K Dokania, Philip HS Torr, and Marc’Aurelio Ranzato. On tiny episodic memories in continual learning. arXiv preprint arXiv:1902.10486, 2019. 2

work page internal anchor Pith review Pith/arXiv arXiv 1902
[36]

Learning without forgetting

Zhizhong Li and Derek Hoiem. Learning without forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12):2935–2947, 2017. 2

work page 2017
[37]

Lifelong learning with dynamically expandable networks

Jaehong Yoon, Eunho Yang, Jeongtae Lee, and Sung Ju Hwang. Lifelong learning with dynamically expandable networks. In International Conference on Learning Representations,

work page
[38]

Continual learning with foundation models: An empirical study of latent replay

Oleksiy Ostapenko, Timothee Lesort, Pau Rodriguez, Md Rifat Arefin, Arthur Douillard, Irina Rish, and Laurent Charlin. Continual learning with foundation models: An empirical study of latent replay. In Conference on Lifelong Learning Agents, pages 60–91. PMLR, 2022. 2 12

work page 2022
[39]

RanPAC: Random projections and pre-trained models for continual learning

Mark McDonnell, Dong Gong, Amin Parvaneh, Ehsan Abbasnejad, and Anton van den Hengel. RanPAC: Random projections and pre-trained models for continual learning. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. 2, 6

work page 2023
[40]

Safe: Slow and fast parameter-efficient tuning for continual learning with pre-trained models

Linglan Zhao, Xuerui Zhang, Ke Yan, Shouhong Ding, and Weiran Huang. Safe: Slow and fast parameter-efficient tuning for continual learning with pre-trained models. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. 2, 6

work page 2024
[41]

Expandable subspace ensemble for pre-trained model-based class-incremental learning

Da-Wei Zhou, Hai-Long Sun, Han-Jia Ye, and De-Chuan Zhan. Expandable subspace ensemble for pre-trained model-based class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23554–23564, 2024

work page 2024
[42]

Revisiting class- incremental learning with pre-trained models: Generalizability and adaptivity are all you need

Da-Wei Zhou, Zi-Wen Cai, Han-Jia Ye, De-Chuan Zhan, and Ziwei Liu. Revisiting class- incremental learning with pre-trained models: Generalizability and adaptivity are all you need. International Journal of Computer Vision, pages 1–21, 08 2024. 2, 6

work page 2024
[43]

van de Ven, Tinne Tuytelaars, and Andreas S

Gido M. van de Ven, Tinne Tuytelaars, and Andreas S. Tolias. Three types of incremental learning. Nature Machine Intelligence, 4(12):1185–1197, 2022. 2, 8

work page 2022
[44]

Gradient based sample selection for online continual learning

Rahaf Aljundi, Min Lin, Baptiste Goujaud, and Yoshua Bengio. Gradient based sample selection for online continual learning. Advances in Neural Information Processing Systems, 32, 2019. 2

work page 2019
[45]

Online continual learning on class incremental blurry task configuration with anytime inference

Hyunseo Koh, Dahyun Kim, Jung-Woo Ha, and Jonghyun Choi. Online continual learning on class incremental blurry task configuration with anytime inference. In The Tenth International Conference on Learning Representations, ICLR 2022, 2022. 9, 35

[46] Jun-Yeong Moon, Keon-Hee Park, Jung Uk Kim, and Gyeong-Moon Park. Online class incremental learning on stochastic blurry task boundary via mask and visual prompt tuning. In IEEE/CVF International Conference on Computer Vision, ICCV 2023, pages 11697–11707. IEEE, 2023. 2, 9, 35

[47] Sahra Ghalebikesabi, Leonard Berrada, Sven Gowal, Ira Ktena, Robert Stanforth, Jamie Hayes, Soham De, Samuel L Smith, Olivia Wiles, and Borja Balle. Differentially private diffusion models generate useful synthetic images. arXiv preprint arXiv:2302.13861, 2023. 3

[48] Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, Harsha Nori, and Sergey Yekhanin. Differentially private synthetic data via foundation model apis 1: Images. In The Twelfth International Conference on Learning Representations, ICLR 2024. OpenReview.net, 2024. 3

[49] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology - EUROCRYPT 2006, 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques, volume 4004 of Lecture Notes in Computer Science, pages 486–503. Sprin...

[50] Frank McSherry. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. Communications of the ACM, 53(9):89–97, 2010. 3, 6

[51] Cynthia Dwork, Guy N. Rothblum, and Salil P. Vadhan. Boosting and differential privacy. In 51th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010, pages 51–60. IEEE Computer Society, 2010. 6, 31

[52] Justin Whitehouse, Aaditya Ramdas, Ryan Rogers, and Steven Wu. Fully-adaptive composition in differential privacy. In International Conference on Machine Learning, ICML 2023, volume 202 of Proceedings of Machine Learning Research, pages 36990–37007. PMLR, 2023. 3, 6, 31

[53] Antti Koskela, Joonas Jälkö, and Antti Honkela. Computing tight differential privacy guarantees using FFT. In The 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020, volume 108 of Proceedings of Machine Learning Research, pages 2560–2569. PMLR, 2020. 3

[54] Sivakanth Gopi, Yin Tat Lee, and Lukas Wutschitz. Numerical composition of differential privacy. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, pages 11631–11642, 2021. 3, 32

[55] Cynthia Dwork and Aaron Roth. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211–407, 2014. 6, 22, 29, 31

[56] Borja Balle and Yu-Xiang Wang. Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, volume 80 of Proceedings of Machine Learning Research, pages 403–412. PMLR, 2018. 6

[57] Sylvestre-Alvise Rebuffi, Alexander Kolesnikov, Georg Sperl, and Christoph H Lampert. icarl: Incremental classifier and representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2001–2010, 2017. 7

[58] Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, and Sylvain Gelly. Parameter-efficient transfer learning for NLP. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, volume 97 of Proceedings of Machine Learning Research, pages 2790–2799. PMLR, 2019. 7

[59] Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: low-rank adaptation of large language models. In The Tenth International Conference on Learning Representations, ICLR 2022, 2022. 7, 32

[60] Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, and Aaron C. Courville. Film: Visual reasoning with a general conditioning layer. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances i...

[61] Aliaksandra Shysheya, John Bronskill, Massimiliano Patacchiola, Sebastian Nowozin, and Richard E. Turner. FiT: parameter efficient few-shot transfer learning for personalized and federated image classification. In The Eleventh International Conference on Learning Representations, ICLR 2023, 2023. 7, 32

[62] Linglan Zhao, Xuerui Zhang, Ke Yan, Shouhong Ding, and Weiran Huang. SAFE: slow and fast parameter-efficient tuning for continual learning with pre-trained models. ArXiv preprint, abs/2411.02175, 2024. 7, 33

[63] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In 9th International Conference on Learning Representations, ICLR 2021,

[64] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015. 8, 32

[65] Alex Krizhevsky. Learning multiple layers of features from tiny images. Master’s thesis, University of Toronto, 2009. 8, 32, 39

[66] Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, et al. The many faces of robustness: A critical analysis of out-of-distribution generalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8340–8349, 2021. 8, 32, 39

[67] Sayna Ebrahimi, Franziska Meier, Roberto Calandra, Trevor Darrell, and Marcus Rohrbach. Adversarial continual learning. In Computer Vision - ECCV 2020 - 16th European Conference, volume 12356 of Lecture Notes in Computer Science, pages 386–402. Springer, 2020. 8

[68] Yann LeCun, Corinna Cortes, and CJ Burges. MNIST handwritten digit database. ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2, 2010. 8, 32, 39

[69] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011. 8, 32, 39

[70] Yaroslav Bulatov. notMNIST dataset, 2011. 8, 32, 39

[71] Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. ArXiv preprint, abs/1708.07747, 2017. 8, 32, 39

[72] Arslan Chaudhry, Puneet Kumar Dokania, Thalaiyasingam Ajanthan, and Philip H. S. Torr. Riemannian walk for incremental learning: Understanding forgetting and intransigence. In Computer Vision - ECCV 2018 - 15th European Conference, volume 11215 of Lecture Notes in Computer Science, pages 556–572. Springer, 2018. 8, 32

[73] Seyed Iman Mirzadeh, Mehrdad Farajtabar, Dilan Gorur, Razvan Pascanu, and Hassan Ghasemzadeh. Linear mode connectivity in multitask and continual learning. In International Conference on Learning Representations, 2021. 8

[74] Jaehong Yoon, Divyam Madaan, Eunho Yang, and Sung Ju Hwang. Online coreset selection for rehearsal-based continual learning. In International Conference on Learning Representations,

[75] Cynthia Dwork, Moni Naor, Toniann Pitassi, Guy N Rothblum, and Sergey Yekhanin. Pan-Private Streaming Algorithms. In Innovations in Computer Science (ICS), pages 66–80. Tsinghua University Press, 2010. 19

[76] Mathias Lecuyer, Riley Spahn, Kiran Vodrahalli, Roxana Geambasu, and Daniel Hsu. Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform, September 2019. arXiv:1909.01502. 19

[78] Christopher A. Choquette-Choo, Arun Ganesh, Saminul Haque, Thomas Steinke, and Abhradeep Thakurta. Near exact privacy amplification for matrix mechanisms. ArXiv preprint, abs/2410.06266, 2024. 19

[79] Jun Zhang, Zhenjie Zhang, Xiaokui Xiao, Yin Yang, and Marianne Winslett. Functional mechanism: Regression analysis under differential privacy. ArXiv preprint, abs/1208.0219,

[80] Pramod Kaushik Mudrakarta, Mark Sandler, Andrey Zhmoginov, and Andrew G. Howard. K for the price of 1: Parameter-efficient multi-task and transfer learning. In 7th International Conference on Learning Representations, ICLR 2019, 2019. 32


[39] [39]

RanPAC: Random projections and pre-trained models for continual learning

Mark McDonnell, Dong Gong, Amin Parvaneh, Ehsan Abbasnejad, and Anton van den Hengel. RanPAC: Random projections and pre-trained models for continual learning. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. 2, 6

work page 2023

[40] [40]

Safe: Slow and fast parameter-efficient tuning for continual learning with pre-trained models

Linglan Zhao, Xuerui Zhang, Ke Yan, Shouhong Ding, and Weiran Huang. Safe: Slow and fast parameter-efficient tuning for continual learning with pre-trained models. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. 2, 6

work page 2024

[41] [41]

Expandable subspace ensemble for pre-trained model-based class-incremental learning

Da-Wei Zhou, Hai-Long Sun, Han-Jia Ye, and De-Chuan Zhan. Expandable subspace ensemble for pre-trained model-based class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23554–23564, 2024

work page 2024

[42] [42]

Revisiting class- incremental learning with pre-trained models: Generalizability and adaptivity are all you need

Da-Wei Zhou, Zi-Wen Cai, Han-Jia Ye, De-Chuan Zhan, and Ziwei Liu. Revisiting class- incremental learning with pre-trained models: Generalizability and adaptivity are all you need. International Journal of Computer Vision, pages 1–21, 08 2024. 2, 6

work page 2024

[43] [43]

van de Ven, Tinne Tuytelaars, and Andreas S

Gido M. van de Ven, Tinne Tuytelaars, and Andreas S. Tolias. Three types of incremental learning. Nature Machine Intelligence, 4(12):1185–1197, 2022. 2, 8

work page 2022

[44] [44]

Gradient based sample selection for online continual learning

Rahaf Aljundi, Min Lin, Baptiste Goujaud, and Yoshua Bengio. Gradient based sample selection for online continual learning. Advances in Neural Information Processing Systems, 32, 2019. 2

work page 2019

[45] [45]

Online continual learning on class incremental blurry task configuration with anytime inference

Hyunseo Koh, Dahyun Kim, Jung-Woo Ha, and Jonghyun Choi. Online continual learning on class incremental blurry task configuration with anytime inference. In The Tenth International Conference on Learning Representations, ICLR 2022, 2022. 9, 35

work page 2022

[46] [46]

Online class incremental learning on stochastic blurry task boundary via mask and visual prompt tuning

Jun-Yeong Moon, Keon-Hee Park, Jung Uk Kim, and Gyeong-Moon Park. Online class incremental learning on stochastic blurry task boundary via mask and visual prompt tuning. In IEEE/CVF International Conference on Computer Vision, ICCV 2023 , pages 11697–11707. IEEE, 2023. 2, 9, 35

work page 2023

[47] [47]

Smith, Olivia Wiles, and Borja Balle

Sahra Ghalebikesabi, Leonard Berrada, Sven Gowal, Ira Ktena, Robert Stanforth, Jamie Hayes, Soham De, Samuel L Smith, Olivia Wiles, and Borja Balle. Differentially private diffusion models generate useful synthetic images. arXiv preprint arXiv:2302.13861, 2023. 3

work page arXiv 2023

[48] [48]

Differen- tially private synthetic data via foundation model apis 1: Images

Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, Harsha Nori, and Sergey Yekhanin. Differen- tially private synthetic data via foundation model apis 1: Images. In The Twelfth International Conference on Learning Representations, ICLR 2024. OpenReview.net, 2024. 3

work page 2024

[49] [49]

Our data, ourselves: Privacy via distributed noise generation

Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology - EUROCRYPT 2006, 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques, volume 4004 of Lecture Notes in Computer Science, pages 486–503. Sprin...

work page 2006

[50] [50]

Privacy integrated queries: an extensible platform for privacy-preserving data analysis

Frank McSherry. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. Communications of the ACM, 53(9):89–97, 2010. 3, 6

work page 2010

[51] [51]

Rothblum, and Salil P

Cynthia Dwork, Guy N. Rothblum, and Salil P. Vadhan. Boosting and differential privacy. In 51th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010, pages 51–60. IEEE Computer Society, 2010. 6, 31

work page 2010

[52] [52]

Fully-adaptive composition in differential privacy

Justin Whitehouse, Aaditya Ramdas, Ryan Rogers, and Steven Wu. Fully-adaptive composition in differential privacy. In International Conference on Machine Learning, ICML 2023, volume 202 of Proceedings of Machine Learning Research, pages 36990–37007. PMLR, 2023. 3, 6, 31

work page 2023

[53] [53]

Computing tight differential privacy guarantees using FFT

Antti Koskela, Joonas Jälkö, and Antti Honkela. Computing tight differential privacy guarantees using FFT. In The 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020, volume 108 of Proceedings of Machine Learning Research, pages 2560–2569. PMLR, 2020. 3

work page 2020

[54] [54]

Numerical composition of differential privacy

Sivakanth Gopi, Yin Tat Lee, and Lukas Wutschitz. Numerical composition of differential privacy. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, pages 11631–11642, 2021. 3, 32 13

work page 2021

[55] [55]

The Algorithmic Foundations of Differential Privacy

Cynthia Dwork and Aaron Roth. The Algorithmic Foundations of Differential Privacy. Founda- tions and Trends in Theoretical Computer Science, 9(3-4):211–407, 2014. 6, 22, 29, 31

work page 2014

[56] [56]

Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising

Borja Balle and Yu-Xiang Wang. Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising. In Proceedings of the 35th International Con- ference on Machine Learning, ICML 2018, volume 80 of Proceedings of Machine Learning Research, pages 403–412. PMLR, 2018. 6

work page 2018

[57] [57]

icarl: Incremental classifier and representation learning

Sylvestre-Alvise Rebuffi, Alexander Kolesnikov, Georg Sperl, and Christoph H Lampert. icarl: Incremental classifier and representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2001–2010, 2017. 7

work page 2001

[58] [58]

Parameter-efficient transfer learning for NLP

Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, and Sylvain Gelly. Parameter-efficient transfer learning for NLP. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, volume 97 of Proceedings of Machine Learning Research, pages 2790–2799. PMLR, 2019. 7

work page 2019

[59] [59]

Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen

Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: low-rank adaptation of large language models. In The Tenth International Conference on Learning Representations, ICLR 2022, 2022. 7, 32

work page 2022

[60] [60]

Courville

Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, and Aaron C. Courville. Film: Visual reasoning with a general conditioning layer. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances i...

work page 2018

[61] [61]

Aliaksandra Shysheya, John Bronskill, Massimiliano Patacchiola, Sebastian Nowozin, and Richard E. Turner. FiT: parameter efficient few-shot transfer learning for personalized and federated image classification. In The Eleventh International Conference on Learning Repre- sentations, ICLR 2023, 2023. 7, 32

work page 2023

[62] [62]

SAFE: slow and fast parameter-efficient tuning for continual learning with pre-trained models

Linglan Zhao, Xuerui Zhang, Ke Yan, Shouhong Ding, and Weiran Huang. SAFE: slow and fast parameter-efficient tuning for continual learning with pre-trained models. ArXiv preprint, abs/2411.02175, 2024. 7, 33

work page arXiv 2024

[63] [63]

An image is worth 16x16 words: Transformers for image recognition at scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In 9th International Conference on Learning Representations, ICLR 2021,

[64]

Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015. 8, 32

[65]

Alex Krizhevsky. Learning multiple layers of features from tiny images. Master’s thesis, University of Toronto, 2009. 8, 32, 39

[66]

Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, et al. The many faces of robustness: A critical analysis of out-of-distribution generalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8340–8349, 2021. 8, 32, 39

[67]

Sayna Ebrahimi, Franziska Meier, Roberto Calandra, Trevor Darrell, and Marcus Rohrbach. Adversarial continual learning. In Computer Vision - ECCV 2020 - 16th European Conference, volume 12356 of Lecture Notes in Computer Science, pages 386–402. Springer, 2020. 8

[68]

Yann LeCun, Corinna Cortes, and CJ Burges. MNIST handwritten digit database. ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2, 2010. 8, 32, 39

[69]

Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011. 8, 32, 39

[70]

Yaroslav Bulatov. notMNIST dataset, 2011. 8, 32, 39

[71]

Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. ArXiv preprint, abs/1708.07747, 2017. 8, 32, 39

[72]

Arslan Chaudhry, Puneet Kumar Dokania, Thalaiyasingam Ajanthan, and Philip H. S. Torr. Riemannian walk for incremental learning: Understanding forgetting and intransigence. In Computer Vision - ECCV 2018 - 15th European Conference, volume 11215 of Lecture Notes in Computer Science, pages 556–572. Springer, 2018. 8, 32

[73]

Seyed Iman Mirzadeh, Mehrdad Farajtabar, Dilan Gorur, Razvan Pascanu, and Hassan Ghasemzadeh. Linear mode connectivity in multitask and continual learning. In International Conference on Learning Representations, 2021. 8

[74]

Jaehong Yoon, Divyam Madaan, Eunho Yang, and Sung Ju Hwang. Online coreset selection for rehearsal-based continual learning. In International Conference on Learning Representations,

[75]

Cynthia Dwork, Moni Naor, Toniann Pitassi, Guy N Rothblum, and Sergey Yekhanin. Pan-Private Streaming Algorithms. In Innovations in Computer Science (ICS), pages 66–80. Tsinghua University Press, 2010. 19

[76]

Mathias Lecuyer, Riley Spahn, Kiran Vodrahalli, Roxana Geambasu, and Daniel Hsu. Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform, September

[77]

arXiv:1909.01502. 19

[78]

Christopher A. Choquette-Choo, Arun Ganesh, Saminul Haque, Thomas Steinke, and Abhradeep Thakurta. Near exact privacy amplification for matrix mechanisms. ArXiv preprint, abs/2410.06266, 2024. 19

[79]

Jun Zhang, Zhenjie Zhang, Xiaokui Xiao, Yin Yang, and Marianne Winslett. Functional mechanism: Regression analysis under differential privacy. ArXiv preprint, abs/1208.0219,

[80]

Pramod Kaushik Mudrakarta, Mark Sandler, Andrey Zhmoginov, and Andrew G. Howard. K for the price of 1: Parameter-efficient multi-task and transfer learning. In 7th International Conference on Learning Representations, ICLR 2019, 2019. 32
