arxiv: 2411.04680 · v5 · submitted 2024-11-07 · 💻 cs.LG · cs.CR

Privacy Leakage via Output Label Space and Differentially Private Continual Learning

Pith reviewed 2026-05-23 17:15 UTC · model grok-4.3

classification 💻 cs.LG cs.CR
keywords differential privacy · continual learning · privacy side-channel · output label space · membership inference · Split-CIFAR-100 · Split-ImageNet-R

The pith

The output label space of a classification model acts as a privacy side-channel that leaks information even when training uses differential privacy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies the set of output labels a model can predict as an observable side-channel that reveals which data the model has seen. In continual learning the label space grows over time, so an attacker who sees the labels can infer membership or other private details even if the model weights satisfy DP. The authors formalise differential privacy for continual learning to handle changing label spaces and propose two fixes: releasing labels through an optimal DP mechanism or expanding to a large public label set. They adapt pre-trained models and show higher accuracy than prior DP continual learning work on Split-CIFAR-100 and Split-ImageNet-R while using a stronger privacy definition.
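To make the side-channel concrete, here is a minimal sketch. It assumes the released label space is simply the set of labels present in the training data; the function names and toy data are hypothetical and not taken from the paper. Two neighbouring datasets that differ in one example expose different label spaces, so an observer can tell them apart without inspecting the weights.

```python
def naive_label_space(dataset):
    """Return the set of labels present in the data (no privacy protection)."""
    return {label for _, label in dataset}


def attacker_guess(observed_label_space, target_label):
    """Guess that the target example was used iff its label is exposed."""
    return target_label in observed_label_space


# Toy neighbouring datasets: d1 is d0 plus one example with an unseen label.
d0 = [("x1", "cat"), ("x2", "dog")]
d1 = d0 + [("x3", "rare_disease")]

guess_without = attacker_guess(naive_label_space(d0), "rare_disease")
guess_with = attacker_guess(naive_label_space(d1), "rare_disease")
print(guess_without, guess_with)  # prints: False True
```

In this toy setting the attacker is always right, regardless of how much noise was added to the model parameters during training.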

Core claim

The output label space of a classification model is a privacy side-channel; a concrete attack exploits it in continual learning, and two mitigation methods (DP label release or large public label space) allow models to achieve higher accuracy under a formal DP guarantee for continual learning than previous approaches on Split-CIFAR-100 and Split-ImageNet-R.

What carries the argument

Formalisation of differential privacy for continual learning that accounts for a time-varying output label space and treats the label space itself as observable to the adversary.

If this is right

  • Models trained with the two proposed label-space mitigations satisfy the new DP definition for continual learning.
  • Accuracy under the stronger privacy model exceeds that of earlier DP continual learning methods on the two split benchmarks.
  • The side-channel exists whenever the label space is observable, even if weights are DP-protected.
  • Pre-trained models can be adapted while preserving the privacy guarantee after label-space handling.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the label space attack generalises beyond the two benchmarks, many deployed continual learners would need label-space protection regardless of weight-level DP.
  • Using a fixed large public label space may trade off task-specific accuracy for privacy in settings where new classes arrive frequently.
  • The formalisation could be extended to other side-channels such as model output confidence scores or decision thresholds.

Load-bearing premise

The formal definition of DP for continual learning correctly models the privacy risk when the set of possible output labels changes over time and remains visible to an attacker.

What would settle it

Run the described label-space attack on a DP-trained continual learner whose label space is released without the proposed mitigation and measure whether membership inference or data reconstruction succeeds at a rate above the DP bound.
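One way such a test could be scored is sketched below. It relies on the standard fact that any attack against an (ε, δ)-DP release satisfies TPR ≤ e^ε · FPR + δ, together with the same inequality applied to the complementary event. The function names are hypothetical, and the sketch ignores the statistical uncertainty of empirically estimated rates, which a real audit would need to handle with confidence intervals.

```python
import math


def max_tpr_under_dp(fpr, epsilon, delta):
    """Largest true-positive rate any attack can reach at a given
    false-positive rate against an (epsilon, delta)-DP release."""
    direct = math.exp(epsilon) * fpr + delta
    # Same DP inequality applied to the event "attacker says non-member".
    complement = 1.0 - math.exp(-epsilon) * (1.0 - fpr - delta)
    return max(0.0, min(1.0, direct, complement))


def violates_dp_bound(tpr, fpr, epsilon, delta):
    """True if the observed attack rates are impossible under (epsilon, delta)-DP."""
    return tpr > max_tpr_under_dp(fpr, epsilon, delta)


# An attack that is always right (TPR = 1, FPR = 0) cannot be consistent
# with any finite epsilon when delta is small.
print(violates_dp_bound(tpr=1.0, fpr=0.0, epsilon=1.0, delta=1e-5))  # prints: True
```

An attack that only guesses at chance level, for example TPR = FPR = 0.5, stays inside the bound and therefore gives no evidence of a violation.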

Figures

Figures reproduced from arXiv: 2411.04680 by Antti Honkela, Arno Solin, Marcus Klasson, Marlon Tobaben, Mikko Heikkilä, Talal Alrawajfeh.

Figure 1. Attack on the output label space: The challenger uses either the dataset …
Figure 2. Lower bound of the probability that a new label is not added to the output label space O_t. Even with ϵ = 1.0 and δ = 10⁻⁷, classes having fewer than 13 samples are discarded with at least 99% probability, and thus cannot be learned. (Adjacent text from the paper: Regarding the last point, to quantify how likely it is to drop a label, we compute a tight lower bound on this probability. Suppose y appears in k examples of D_t. Dropping y e…)
Figure 3. Split-CIFAR-100 at ϵ = 1 (left) and Split-ImageNet-R at ϵ = 8 (right) at δ = 10⁻⁵: Both methods only decrease slightly in utility when greatly increasing the number of assumed labels through dummy labels. Bad label match affects Cosine Classifier. Similar observations and detailed results in App. F.1. (Adjacent text from the paper: 1. Match of output label space based on prior information O_t^prior and labels that occur in the data: In …)
Figure 4. Blurry tasks on Split-CIFAR-100 (left: data distribution per task, right: final acc with …
Figure 5. Split-CIFAR-100 (left) and Split-ImageNet-R (right) with …
Original abstract

Differential privacy (DP) is a formal privacy framework that enables training machine learning (ML) models while protecting individuals' data. As pointed out by prior work, ML models are part of larger systems, which can lead to so-called privacy side-channels even if the model training itself is DP. We identify the output label space of a classification model as such a privacy side-channel and show a concrete privacy attack that exploits it. The side-channel becomes highly relevant in continual learning (CL), where the output label space changes over time. To reason about privacy guarantees in CL, we introduce a formalisation of DP for CL, which also clarifies how our approach differs from existing approaches. We propose and evaluate two methods for eliminating this side-channel: applying an optimal DP mechanism to release the labels in the sensitive data, and using a large public label space. We explore the trade-offs of these methods through adapting pre-trained models. We demonstrate empirically that our models consistently achieve higher accuracy under DP than previous work over both Split-CIFAR-100 and Split-ImageNet-R, with a stronger privacy model.
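The first mitigation, releasing the labels through a DP mechanism, can be illustrated with a deliberately simple stand-in. The sketch below is not the optimal mechanism the abstract refers to; it uses the textbook approach of adding Laplace noise to per-label counts and releasing only labels whose noisy count exceeds a threshold. It assumes add/remove adjacency and one label per example, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def release_threshold(epsilon, delta):
    """Threshold chosen so a label held by a single example leaks with probability delta."""
    return 1.0 + math.log(1.0 / (2.0 * delta)) / epsilon


def dp_label_release(labels, epsilon, delta, rng=None):
    """Release labels whose Laplace-noised count exceeds the threshold."""
    rng = rng or random.Random()
    tau = release_threshold(epsilon, delta)
    released = set()
    for label, count in Counter(labels).items():
        # Difference of two Exp(epsilon) draws is Laplace noise with scale 1/epsilon.
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        if count + noise > tau:
            released.add(label)
    return released


def drop_probability(k, epsilon, delta):
    """Probability that a label with k examples is NOT released (Laplace CDF)."""
    t = release_threshold(epsilon, delta) - k
    if t >= 0:
        return 1.0 - 0.5 * math.exp(-epsilon * t)
    return 0.5 * math.exp(epsilon * t)


for k in (1, 10, 20, 40):
    print(k, drop_probability(k, epsilon=1.0, delta=1e-7))
```

Even this simple mechanism shows the trade-off that Figure 2 quantifies for the paper's mechanism: labels with few examples are dropped with high probability, so rare classes may never enter the label space.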

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper identifies the output label space of classification models as a privacy side-channel that becomes especially relevant in continual learning (CL) because the label space evolves over time. It introduces a new formalization of differential privacy (DP) tailored to CL, proposes two mitigation strategies (applying an optimal DP mechanism to release labels from sensitive data, or using a large public label space), adapts pre-trained models accordingly, and reports empirical results showing higher accuracy than prior work on Split-CIFAR-100 and Split-ImageNet-R under a stronger privacy model while demonstrating a concrete attack exploiting the side-channel.

Significance. If the new DP formalization for CL correctly incorporates observable label-space releases into the mechanism output and composes across tasks, and if the accuracy gains are robust, the work would be significant for closing a previously unaddressed side-channel in privacy-preserving CL. The explicit attack construction and the two mitigation approaches provide concrete, falsifiable contributions that could influence both theory and practice in dynamic learning settings.

major comments (3)
  1. [DP formalization section] The DP formalization for CL (introduced to reason about privacy guarantees when label space changes): the definition must treat the sequence of released label sets as part of the mechanism output distribution; otherwise the side-channel attack identified in the abstract remains outside the stated privacy bound. The abstract and skeptic note give no indication that composition across evolving label spaces is proven or that neighboring task sequences are distinguished only up to the claimed epsilon when label support is observable.
  2. [Experimental evaluation] Empirical claims of higher accuracy under DP (Split-CIFAR-100 and Split-ImageNet-R experiments): without reported error bars, exact data splits, attack implementation details, or baseline epsilon values, it is impossible to verify that the accuracy improvement is not an artifact of weaker effective privacy or different evaluation protocols. The central claim that the new methods achieve both stronger privacy and higher accuracy rests on these results.
  3. [Threat model and formalization] Threat model and label-space observability: the weakest assumption noted (that the DP formalization correctly captures privacy when the output label space changes over time and is observable) is load-bearing; if the definition only protects model parameters while treating label-space release as free, the paper's own attack would violate the guarantee, undermining the “stronger privacy model” assertion.
minor comments (2)
  1. [Abstract] Abstract: the phrase “stronger privacy model” should be accompanied by a concrete comparison (e.g., specific ε values or neighboring-relation definition) rather than left qualitative.
  2. [Methods] Notation for the two proposed methods (DP label release vs. public label space) should be introduced with explicit symbols early so that later empirical tables are immediately interpretable.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive feedback. We address each major comment below, clarifying the DP formalization for continual learning and committing to improvements in the experimental reporting. Our responses aim to strengthen the manuscript while maintaining the integrity of our contributions.

Point-by-point responses
  1. Referee: [DP formalization section] The DP formalization for CL (introduced to reason about privacy guarantees when label space changes): the definition must treat the sequence of released label sets as part of the mechanism output distribution; otherwise the side-channel attack identified in the abstract remains outside the stated privacy bound. The abstract and skeptic note give no indication that composition across evolving label spaces is proven or that neighboring task sequences are distinguished only up to the claimed epsilon when label support is observable.

    Authors: We agree that the formalization must include the sequence of released label sets in the mechanism output to properly bound the side-channel. Our DP definition for CL (Section 3) explicitly defines the mechanism output as the pair consisting of the model update and the released label set for each task. The composition theorem accounts for the evolving label spaces by considering the full output distribution across the task sequence. Neighboring task sequences are distinguished only up to the claimed epsilon under this output. We will revise the abstract and add an explicit remark in Section 3 to highlight this aspect. revision: partial

  2. Referee: [Experimental evaluation] Empirical claims of higher accuracy under DP (Split-CIFAR-100 and Split-ImageNet-R experiments): without reported error bars, exact data splits, attack implementation details, or baseline epsilon values, it is impossible to verify that the accuracy improvement is not an artifact of weaker effective privacy or different evaluation protocols. The central claim that the new methods achieve both stronger privacy and higher accuracy rests on these results.

    Authors: We acknowledge that the current experimental section lacks sufficient details for full reproducibility and verification. In the revised manuscript we will report error bars over five independent runs, specify the exact data splits and preprocessing for both Split-CIFAR-100 and Split-ImageNet-R, provide implementation details (including pseudocode) for the concrete attack, and list the precise epsilon values used for all baselines and our methods. These additions will confirm that the reported accuracy gains hold under the stronger privacy model that incorporates label-space releases. revision: yes

  3. Referee: [Threat model and formalization] Threat model and label-space observability: the weakest assumption noted (that the DP formalization correctly captures privacy when the output label space changes over time and is observable) is load-bearing; if the definition only protects model parameters while treating label-space release as free, the paper's own attack would violate the guarantee, undermining the “stronger privacy model” assertion.

    Authors: Our threat model assumes full observability of the label space at each task. The DP formalization for CL is constructed precisely so that the mechanism output includes both model parameters and label-set releases; the privacy guarantee therefore applies to the combined observable output. Consequently the concrete attack lies inside the stated epsilon bound, which is what enables the claim of a stronger privacy model relative to prior work that omits this side-channel. revision: no
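The requirement debated in these responses can be written out as a short sketch. It uses only the standard (ε, δ)-DP inequality; the symbols are assumed notation for illustration and may differ from the definition in the paper.

```latex
% Illustrative notation (assumed, not taken from the paper):
%   \mathcal{M}  continual-learning mechanism run over T tasks
%   \theta_t     model (or model update) released after task t
%   O_t          output label space released after task t
%   D \simeq D'  neighbouring task sequences
\[
  \mathcal{M}(D) = \bigl((\theta_1, O_1), \dots, (\theta_T, O_T)\bigr),
\]
\[
  \Pr\bigl[\mathcal{M}(D) \in S\bigr]
  \;\le\; e^{\epsilon} \, \Pr\bigl[\mathcal{M}(D') \in S\bigr] + \delta
  \quad \text{for all } D \simeq D' \text{ and all measurable } S .
\]
```

Under this reading, an attacker who observes every O_t is already covered by the guarantee, whereas a definition whose output contains only the θ_t would leave the label-space attack outside the bound.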

Circularity Check

0 steps flagged

No circularity: new DP formalization and empirical evaluation are independent of inputs

Full rationale

The paper introduces a new formalization of DP for continual learning to address label-space side channels and evaluates two mitigation methods empirically on Split-CIFAR-100 and Split-ImageNet-R. No equations, derivations, or predictions in the abstract or description reduce to fitted parameters, self-definitions, or self-citation chains. The central claims rest on the introduced definition and experimental comparisons, which do not collapse to the inputs by construction. This is the expected non-finding for a paper whose core is a new definition plus benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, axioms, or invented entities used in the formalization or experiments.

Reference graph

Works this paper leans on

96 extracted references · 96 canonical work pages · 4 internal anchors

  1. [1]

    Catastrophic interference in connectionist networks: The sequential learning problem

    Michael McCloskey and Neal J Cohen. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation, volume 24, pages 109–165. Elsevier, 1989. 1

  2. [2]

    A continual learning survey: Defying forgetting in classification tasks

    Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Aleš Leonardis, Gregory Slabaugh, and Tinne Tuytelaars. A continual learning survey: Defying forgetting in classification tasks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(7): 3366–3385, 2021. 3

  3. [3]

    A comprehensive survey of continual learning: theory, method and application

    Liyuan Wang, Xingxing Zhang, Hang Su, and Jun Zhu. A comprehensive survey of continual learning: theory, method and application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(8):5362–5383, 2024. 1

  4. [4]

    Catastrophic forgetting in connectionist networks

    Robert M French. Catastrophic forgetting in connectionist networks. Trends in Cognitive Sciences, 3(4):128–135, 1999. 1

  5. [5]

    Gradient episodic memory for continual learning

    David Lopez-Paz and Marc’Aurelio Ranzato. Gradient episodic memory for continual learning. Advances in Neural Information Processing Systems, 30, 2017. 1, 2

  6. [6]

    Calibrating noise to sensitivity in private data analysis

    Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography, pages 265–284. Springer, 2006. 1, 2, 3

  7. [7]

    Membership inference attacks against machine learning models

    Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy, SP 2017, pages 3–18. IEEE Computer Society, 2017. 2

  8. [8]

    Reconstructing training data with informed adversaries

    Borja Balle, Giovanni Cherubin, and Jamie Hayes. Reconstructing training data with informed adversaries. In 43rd IEEE Symposium on Security and Privacy, SP 2022, pages 1138–1156. IEEE, 2022. 10

  9. [9]

    Reconstructing training data from trained neural networks

    Niv Haim, Gal Vardi, Gilad Yehudai, Ohad Shamir, and Michal Irani. Reconstructing training data from trained neural networks. Advances in Neural Information Processing Systems, 35: 22911–22924, 2022. 2

  10. [10]

    A new analysis of differential privacy’s generalization guarantees (invited paper)

    Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Moshe Shenfeld. A new analysis of differential privacy’s generalization guarantees (invited paper). In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing , STOC 2021, page 9. Association for Computing Machinery, 2021. 2

  11. [11]

    Revealing information while preserving privacy

    Irit Dinur and Kobbi Nissim. Revealing information while preserving privacy. In Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 202–210. ACM, 2003. 2

  12. [12]

    Brendan McMahan, Sergei Vassilvitskii, Steve Chien, and Abhradeep Guha Thakurta

    Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, and Abhradeep Guha Thakurta. How to dp-fy ML: A practical guide to machine learning with differential privacy. Journal of Artificial Intelligence Research, 77:1113–1201, 2023. 2

  13. [13]

    Toward training at imagenet scale with differential privacy

    Alexey Kurakin, Steve Chien, Shuang Song, Roxana Geambasu, Andreas Terzis, and Abhradeep Thakurta. Toward training at imagenet scale with differential privacy. ArXiv preprint , abs/2201.12328, 2022. 2

  14. [14]

    Smith, and Borja Balle

    Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, and Borja Balle. Unlock- ing high-accuracy differentially private image classification through scale. ArXiv preprint, abs/2204.13650, 2022. 33

  15. [15]

    Turner, and Antti Honkela

    Marlon Tobaben, Aliaksandra Shysheya, John Bronskill, Andrew Paverd, Shruti Tople, Santi- ago Zanella Béguelin, Richard E. Turner, and Antti Honkela. On the efficacy of differentially private few-shot image classification. Transactions on Machine Learning Research, 2023. 2, 7, 32, 33

  16. [16]

    A simple baseline that questions the use of pretrained-models in continual learning

    Paul Janson, Wenxuan Zhang, Rahaf Aljundi, and Mohamed Elhoseiny. A simple baseline that questions the use of pretrained-models in continual learning. ArXiv preprint, abs/2210.04428,

  17. [17]

    Dualprompt: Complementary prompting for rehearsal-free continual learning

    Zifeng Wang, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, Chen-Yu Lee, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, et al. Dualprompt: Complementary prompting for rehearsal-free continual learning. In European Conference on Computer Vision, pages 631–648. Springer, 2022. 2, 8, 32

  18. [18]

    Dy, and Tomas Pfister

    Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer G. Dy, and Tomas Pfister. Learning to prompt for continual learning. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, pages 139–149. IEEE, 2022. 2

  19. [19]

    Private set generation with discriminative information

    Dingfan Chen, Raouf Kerkouche, and Mario Fritz. Private set generation with discriminative information. In Advances in Neural Information Processing Systems 35, NeurIPS 2022, 2022. 2, 3

  20. [20]

    Continual learning with differential privacy

    Pradnya Desai, Phung Lai, NhatHai Phan, and My T Thai. Continual learning with differential privacy. In Neural Information Processing: 28th International Conference, ICONIP 2021 , pages 334–343. Springer, 2021. 2, 3, 6, 19, 20

  21. [21]

    Differentially Private Continual Learning

    Sebastian Farquhar and Yarin Gal. Differentially private continual learning. ArXiv preprint, abs/1902.06497, 2019. 3

  22. [22]

    Differential privacy preservation in robust continual learning

    Ahmad Hassanpour, Majid Moradikia, Bian Yang, Ahmed Abdelhadi, Christoph Busch, and Julian Fierrez. Differential privacy preservation in robust continual learning. IEEE Access, 10: 24273–24287, 2022. 2, 3, 6, 19, 20

  23. [23]

    Thai, and An M

    Phung Lai, Han Hu, Hai Phan, Ruoming Jin, My T. Thai, and An M. Chen. Lifelong DP: consistently bounded differential privacy in lifelong machine learning. In Conference on Lifelong Learning Agents, CoLLAs 2022 , volume 199 of Proceedings of Machine Learning Research, pages 778–797. PMLR, 2022. 2, 3, 6, 19 11

  24. [24]

    A differentially private stochastic gradient descent algorithm for multiparty classification

    Arun Rajkumar and Shivani Agarwal. A differentially private stochastic gradient descent algorithm for multiparty classification. In Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2012, volume 22 of JMLR Proceedings, pages 933–941. JMLR.org, 2012. 2, 3

  25. [25]

    Shuang Song, Kamalika Chaudhuri, and Anand D. Sarwate. Stochastic gradient descent with differentially private updates. InIEEE Global Conference on Signal and Information Processing, GlobalSIP 2013, pages 245–248. IEEE, 2013

  26. [26]

    Goodfellow, H

    Martín Abadi, Andy Chu, Ian J. Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 308–318. ACM, 2016. 2, 3, 34

  27. [27]

    Towards large scale transfer learning for differentially private image classification

    Harsh Mehta, Abhradeep Guha Thakurta, Alexey Kurakin, and Ashok Cutkosky. Towards large scale transfer learning for differentially private image classification. Transactions on Machine Learning Research, 2023, 2023. 2, 33

  28. [28]

    Choquette-Choo, Nicolas Papernot, and Abhradeep Thakurta

    Yannis Cattan, Christopher A. Choquette-Choo, Nicolas Papernot, and Abhradeep Thakurta. Fine-tuning with differential privacy necessitates an additional hyperparameter search. ArXiv preprint, abs/2210.02156, 2022

  29. [29]

    Large language models can be strong differentially private learners

    Xuechen Li, Florian Tramèr, Percy Liang, and Tatsunori Hashimoto. Large language models can be strong differentially private learners. In The Tenth International Conference on Learning Representations, ICLR 2022. OpenReview.net, 2022

  30. [30]

    Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, and Huishuai Zhang

    Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, and Huishuai Zhang. Differentially private fine-tuning of language models. In The Tenth Interna- tional Conference on Learning Representations, ICLR 2022, 2022

  31. [31]

    Privacy-aware document visual question answering

    Rubèn Tito, Khanh Nguyen, Marlon Tobaben, Raouf Kerkouche, Mohamed Ali Souibgui, Kangsoo Jung, Joonas Jälkö, Vincent Poulain D’Andecy, Aurélie Joseph, Lei Kang, Ernest Valveny, Antti Honkela, Mario Fritz, and Dimosthenis Karatzas. Privacy-aware document visual question answering. In Document Analysis and Recognition - ICDAR 2024 - 18th International Confe...

  32. [32]

    Beyond the mean: Differentially private prototypes for private transfer learning

    Dariush Wahdany, Matthew Jagielski, Adam Dziedzic, and Franziska Boenisch. Beyond the mean: Differentially private prototypes for private transfer learning. arXiv preprint arXiv:2406.08039, 2024. 2

  33. [33]

    Position: Considerations for differen- tially private learning with large-scale public pretraining

    Florian Tramèr, Gautam Kamath, and Nicholas Carlini. Position: Considerations for differen- tially private learning with large-scale public pretraining. InForty-first International Conference on Machine Learning, ICML 2024, 2024. 2

  34. [34]

    Identifying and eliminating csam in generative ml training data and models

    David Thiel. Identifying and eliminating csam in generative ml training data and models. Technical report, Technical Report. Stanford University, Palo Alto, CA., 2023. 2

  35. [35]

    On Tiny Episodic Memories in Continual Learning

    Arslan Chaudhry, Marcus Rohrbach, Mohamed Elhoseiny, Thalaiyasingam Ajanthan, Puneet K Dokania, Philip HS Torr, and Marc’Aurelio Ranzato. On tiny episodic memories in continual learning. arXiv preprint arXiv:1902.10486, 2019. 2

  36. [36]

    Learning without forgetting

    Zhizhong Li and Derek Hoiem. Learning without forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12):2935–2947, 2017. 2

  37. [37]

    Lifelong learning with dynamically expandable networks

    Jaehong Yoon, Eunho Yang, Jeongtae Lee, and Sung Ju Hwang. Lifelong learning with dynamically expandable networks. In International Conference on Learning Representations,

  38. [38]

    Continual learning with foundation models: An empirical study of latent replay

    Oleksiy Ostapenko, Timothee Lesort, Pau Rodriguez, Md Rifat Arefin, Arthur Douillard, Irina Rish, and Laurent Charlin. Continual learning with foundation models: An empirical study of latent replay. In Conference on Lifelong Learning Agents, pages 60–91. PMLR, 2022. 2 12

  39. [39]

    RanPAC: Random projections and pre-trained models for continual learning

    Mark McDonnell, Dong Gong, Amin Parvaneh, Ehsan Abbasnejad, and Anton van den Hengel. RanPAC: Random projections and pre-trained models for continual learning. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. 2, 6

  40. [40]

    Safe: Slow and fast parameter-efficient tuning for continual learning with pre-trained models

    Linglan Zhao, Xuerui Zhang, Ke Yan, Shouhong Ding, and Weiran Huang. Safe: Slow and fast parameter-efficient tuning for continual learning with pre-trained models. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. 2, 6

  41. [41]

    Expandable subspace ensemble for pre-trained model-based class-incremental learning

    Da-Wei Zhou, Hai-Long Sun, Han-Jia Ye, and De-Chuan Zhan. Expandable subspace ensemble for pre-trained model-based class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23554–23564, 2024

  42. [42]

    Revisiting class- incremental learning with pre-trained models: Generalizability and adaptivity are all you need

    Da-Wei Zhou, Zi-Wen Cai, Han-Jia Ye, De-Chuan Zhan, and Ziwei Liu. Revisiting class- incremental learning with pre-trained models: Generalizability and adaptivity are all you need. International Journal of Computer Vision, pages 1–21, 08 2024. 2, 6

  43. [43]

    van de Ven, Tinne Tuytelaars, and Andreas S

    Gido M. van de Ven, Tinne Tuytelaars, and Andreas S. Tolias. Three types of incremental learning. Nature Machine Intelligence, 4(12):1185–1197, 2022. 2, 8

  44. [44]

    Gradient based sample selection for online continual learning

    Rahaf Aljundi, Min Lin, Baptiste Goujaud, and Yoshua Bengio. Gradient based sample selection for online continual learning. Advances in Neural Information Processing Systems, 32, 2019. 2

  45. [45]

    Online continual learning on class incremental blurry task configuration with anytime inference

    Hyunseo Koh, Dahyun Kim, Jung-Woo Ha, and Jonghyun Choi. Online continual learning on class incremental blurry task configuration with anytime inference. In The Tenth International Conference on Learning Representations, ICLR 2022, 2022. 9, 35

  46. [46]

    Online class incremental learning on stochastic blurry task boundary via mask and visual prompt tuning

    Jun-Yeong Moon, Keon-Hee Park, Jung Uk Kim, and Gyeong-Moon Park. Online class incremental learning on stochastic blurry task boundary via mask and visual prompt tuning. In IEEE/CVF International Conference on Computer Vision, ICCV 2023 , pages 11697–11707. IEEE, 2023. 2, 9, 35

  47. [47]

    Smith, Olivia Wiles, and Borja Balle

    Sahra Ghalebikesabi, Leonard Berrada, Sven Gowal, Ira Ktena, Robert Stanforth, Jamie Hayes, Soham De, Samuel L Smith, Olivia Wiles, and Borja Balle. Differentially private diffusion models generate useful synthetic images. arXiv preprint arXiv:2302.13861, 2023. 3

  48. [48]

    Differen- tially private synthetic data via foundation model apis 1: Images

    Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, Harsha Nori, and Sergey Yekhanin. Differen- tially private synthetic data via foundation model apis 1: Images. In The Twelfth International Conference on Learning Representations, ICLR 2024. OpenReview.net, 2024. 3

  49. [49]

    Our data, ourselves: Privacy via distributed noise generation

    Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology - EUROCRYPT 2006, 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques, volume 4004 of Lecture Notes in Computer Science, pages 486–503. Sprin...

  50. [50]

    Privacy integrated queries: an extensible platform for privacy-preserving data analysis

    Frank McSherry. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. Communications of the ACM, 53(9):89–97, 2010. 3, 6

  51. [51]

    Rothblum, and Salil P

    Cynthia Dwork, Guy N. Rothblum, and Salil P. Vadhan. Boosting and differential privacy. In 51th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010, pages 51–60. IEEE Computer Society, 2010. 6, 31

  52. [52]

    Fully-adaptive composition in differential privacy

    Justin Whitehouse, Aaditya Ramdas, Ryan Rogers, and Steven Wu. Fully-adaptive composition in differential privacy. In International Conference on Machine Learning, ICML 2023, volume 202 of Proceedings of Machine Learning Research, pages 36990–37007. PMLR, 2023. 3, 6, 31

  53. [53]

    Computing tight differential privacy guarantees using FFT

    Antti Koskela, Joonas Jälkö, and Antti Honkela. Computing tight differential privacy guarantees using FFT. In The 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020, volume 108 of Proceedings of Machine Learning Research, pages 2560–2569. PMLR, 2020. 3

  54. [54]

    Numerical composition of differential privacy

    Sivakanth Gopi, Yin Tat Lee, and Lukas Wutschitz. Numerical composition of differential privacy. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, pages 11631–11642, 2021. 3, 32

  55. [55]

    The Algorithmic Foundations of Differential Privacy

    Cynthia Dwork and Aaron Roth. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211–407, 2014. 6, 22, 29, 31

  56. [56]

    Improving the Gaussian mechanism for differential privacy: Analytical calibration and optimal denoising

    Borja Balle and Yu-Xiang Wang. Improving the Gaussian mechanism for differential privacy: Analytical calibration and optimal denoising. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, volume 80 of Proceedings of Machine Learning Research, pages 403–412. PMLR, 2018. 6

  57. [57]

    iCaRL: Incremental classifier and representation learning

    Sylvestre-Alvise Rebuffi, Alexander Kolesnikov, Georg Sperl, and Christoph H. Lampert. iCaRL: Incremental classifier and representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2001–2010, 2017. 7

  58. [58]

    Parameter-efficient transfer learning for NLP

    Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, and Sylvain Gelly. Parameter-efficient transfer learning for NLP. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, volume 97 of Proceedings of Machine Learning Research, pages 2790–2799. PMLR, 2019. 7

  59. [59]

    LoRA: low-rank adaptation of large language models

    Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: low-rank adaptation of large language models. In The Tenth International Conference on Learning Representations, ICLR 2022, 2022. 7, 32

  60. [60]

    FiLM: Visual reasoning with a general conditioning layer

    Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, and Aaron C. Courville. Film: Visual reasoning with a general conditioning layer. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances i...

  61. [61]

    Aliaksandra Shysheya, John Bronskill, Massimiliano Patacchiola, Sebastian Nowozin, and Richard E. Turner. FiT: parameter efficient few-shot transfer learning for personalized and federated image classification. In The Eleventh International Conference on Learning Representations, ICLR 2023, 2023. 7, 32

  62. [62]

    SAFE: slow and fast parameter-efficient tuning for continual learning with pre-trained models

    Linglan Zhao, Xuerui Zhang, Ke Yan, Shouhong Ding, and Weiran Huang. SAFE: slow and fast parameter-efficient tuning for continual learning with pre-trained models. ArXiv preprint, abs/2411.02175, 2024. 7, 33

  63. [63]

    An image is worth 16x16 words: Transformers for image recognition at scale

    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In 9th International Conference on Learning Representations, ICLR 2021,

  64. [64]

    ImageNet Large Scale Visual Recognition Challenge

    Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015. 8, 32

  65. [65]

    Learning multiple layers of features from tiny images

    Alex Krizhevsky. Learning multiple layers of features from tiny images. Master’s thesis, University of Toronto, 2009. 8, 32, 39

  66. [66]

    The many faces of robustness: A critical analysis of out-of-distribution generalization

    Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, et al. The many faces of robustness: A critical analysis of out-of-distribution generalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8340–8349, 2021. 8, 32, 39

  67. [67]

    Adversarial continual learning

    Sayna Ebrahimi, Franziska Meier, Roberto Calandra, Trevor Darrell, and Marcus Rohrbach. Adversarial continual learning. In Computer Vision - ECCV 2020 - 16th European Conference, volume 12356 of Lecture Notes in Computer Science, pages 386–402. Springer, 2020. 8

  68. [68]

    MNIST handwritten digit database

    Yann LeCun, Corinna Cortes, and CJ Burges. MNIST handwritten digit database. ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2, 2010. 8, 32, 39

  69. [69]

    Reading digits in natural images with unsupervised feature learning

    Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011. 8, 32, 39

  70. [70]

    notMNIST dataset, 2011

    Yaroslav Bulatov. notMNIST dataset, 2011. 8, 32, 39

  71. [71]

    Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms

    Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. ArXiv preprint, abs/1708.07747, 2017. 8, 32, 39

  72. [72]

    Arslan Chaudhry, Puneet Kumar Dokania, Thalaiyasingam Ajanthan, and Philip H. S. Torr. Riemannian walk for incremental learning: Understanding forgetting and intransigence. In Computer Vision - ECCV 2018 - 15th European Conference, volume 11215 of Lecture Notes in Computer Science, pages 556–572. Springer, 2018. 8, 32

  73. [73]

    Linear mode connectivity in multitask and continual learning

    Seyed Iman Mirzadeh, Mehrdad Farajtabar, Dilan Gorur, Razvan Pascanu, and Hassan Ghasemzadeh. Linear mode connectivity in multitask and continual learning. In International Conference on Learning Representations, 2021. 8

  74. [74]

    Online coreset selection for rehearsal-based continual learning

    Jaehong Yoon, Divyam Madaan, Eunho Yang, and Sung Ju Hwang. Online coreset selection for rehearsal-based continual learning. In International Conference on Learning Representations,

  75. [75]

    Pan-Private Streaming Algorithms

    Cynthia Dwork, Moni Naor, Toniann Pitassi, Guy N Rothblum, and Sergey Yekhanin. Pan-Private Streaming Algorithms. In Innovations in Computer Science (ICS), pages 66–80. Tsinghua University Press, 2010. 19

  76. [76]

    Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform

    Mathias Lecuyer, Riley Spahn, Kiran Vodrahalli, Roxana Geambasu, and Daniel Hsu. Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform, September

  77. [77]

    arXiv:1909.01502. 19

  78. [78]

    Near exact privacy amplification for matrix mechanisms

    Christopher A. Choquette-Choo, Arun Ganesh, Saminul Haque, Thomas Steinke, and Abhradeep Thakurta. Near exact privacy amplification for matrix mechanisms. ArXiv preprint, abs/2410.06266, 2024. 19

  79. [79]

    Functional Mechanism: Regression Analysis under Differential Privacy

    Jun Zhang, Zhenjie Zhang, Xiaokui Xiao, Yin Yang, and Marianne Winslett. Functional mechanism: Regression analysis under differential privacy. ArXiv preprint, abs/1208.0219,

  80. [80]

    Pramod Kaushik Mudrakarta, Mark Sandler, Andrey Zhmoginov, and Andrew G. Howard. K for the price of 1: Parameter-efficient multi-task and transfer learning. In 7th International Conference on Learning Representations, ICLR 2019, 2019. 32

Showing first 80 references.