pith. sign in

arxiv: 2606.00426 · v1 · pith:BSKTOATGnew · submitted 2026-05-29 · 💻 cs.LG

Canonicalized Stable-List Replay for Private Federated Continual Learning over Language-Model Embeddings

Pith reviewed 2026-06-28 22:52 UTC · model grok-4.3

classification 💻 cs.LG
keywords federated continual learningdifferential privacyreplaylanguage model embeddingsanchor signaturesidentifiabilityprivate aggregation
0
0 comments X

The pith

Public anchor sentences induce signatures that distinguish and align unordered private replay lists with high probability in federated continual learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

In federated continual learning under differential privacy, clients release only small unordered noisy lists of replay candidates drawn from language-model embeddings. The paper shows that signatures derived from a modest number of public anchor sentences can supply the identifiability needed to aggregate these lists at the server. If an observable margin exists between signatures, only O(log(N/η)/p) anchors suffice to distinguish N list elements with probability at least 1-η. Experiments on continual classification, NER, and dialogue tasks report 3.9-5.6 point gains in final average task metric over the strongest non-aligned DP baseline at ε=4. The formal privacy guarantee covers the replay-release step itself.

Core claim

Clients privately produce candidate replay distributions over a shared sentence-embedding space and the server aligns them using signatures induced by public anchor sentences. The anchors provide identifiability for aggregation rather than additional replay data. We prove that, under an observable anchor-signature margin, O(log(N/η)/p) anchors distinguish N candidate list elements with probability at least 1-η, and we give a scoped anchorless non-identifiability result for unordered-label oracle models. Across five seeds the method improves final average task metric by 3.9-5.6 points over the strongest non-CSLR DP baseline at ε=4 under the reported replay-release budget.

What carries the argument

Signatures induced by public anchor sentences on candidate replay lists, which supply the margin needed for canonicalized aggregation of unordered elements.

If this is right

  • A logarithmic number of public anchors is sufficient to achieve high-probability identifiability of list elements.
  • CSLR produces 3.9-5.6 point gains in final average task metric over non-CSLR DP baselines at ε=4.
  • The method outperforms both Hungarian and optimal-transport matchers on the reported benchmarks.
  • The privacy guarantee applies directly to the replay-release step.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Anchor-based signatures could serve as a general mechanism for list alignment in other private distributed settings where direct ordering is unavailable.
  • The anchorless non-identifiability result indicates that some form of external reference is necessary for stable aggregation of unordered private lists.
  • Empirical checks of the margin size on new language-model embeddings would test whether the logarithmic bound remains practical at larger N.

Load-bearing premise

An observable margin exists in the anchor signatures that is large enough to distinguish the unordered list elements with the claimed probability.

What would settle it

A concrete embedding space or task set in which the anchor-signature margin is smaller than required, causing the observed distinction success rate to fall below 1-η for the predicted number of anchors.

Figures

Figures reproduced from arXiv: 2606.00426 by Abu Sa-Adat Mohamed Moon-Im Al Ahsan, Anuj Sharma, Ibne Farabi Shihab.

Figure 1
Figure 1. Figure 1: CSLR pipeline. Clients release private list candidates in a shared embedding space. Public anchors convert [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Average accuracy versus anchor set size m on Split-AG-News (ε = 4). The dashed line marks the empirical saturation point m⋆ ≈ 50 for this encoder, anchor distribution, and task stream. et al., 2021), FedCIL+DP (Qi et al., 2023), and non￾private FedTA† (Yu et al., 2025) as an upper-bound reference. To isolate the role of anchors, we also evaluate CSLR-NoAnchor, CSLR-NearestMean, CSLR-Hungarian, and CSLR-Opt… view at source ↗
read the original abstract

Federated continual learning (FCL) lets distributed clients adapt language-model heads to evolving NLP tasks without sharing raw text. Under user-level differential privacy (DP), replay-based continual learning faces a structural obstacle: clients can release only small noisy lists of candidate replay summaries, and those lists are unordered across clients. We introduce Canonicalized Stable-List Replay (CSLR), where clients privately produce candidate replay distributions over a shared sentence-embedding space and the server aligns them using signatures induced by public anchor sentences. The anchors provide identifiability for aggregation rather than additional replay data. We prove that, under an observable anchor-signature margin, $O(\log(N/\eta)/p)$ anchors distinguish $N$ candidate list elements with probability at least $1-\eta$, and we give a scoped anchorless non-identifiability result for unordered-label oracle models. Across five seeds on continual classification, NER, and dialogue benchmarks, CSLR improves the final average task metric by 3.9--5.6 points over the strongest non-CSLR DP baseline at $\eps=4$ under the reported replay-release budget, while also outperforming Hungarian and optimal-transport matchers. The formal privacy guarantee covers replay release; end-to-end private training additionally requires composition with a private optimizer for task-head updates.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces Canonicalized Stable-List Replay (CSLR) for user-level DP federated continual learning on LM embeddings. Clients release small noisy unordered candidate replay lists over a shared embedding space; the server aligns them via signatures from public anchor sentences. Under an observable anchor-signature margin the authors prove that O(log(N/η)/p) anchors distinguish N list elements with probability ≥1-η, and they supply a complementary anchorless non-identifiability result. On continual classification, NER and dialogue benchmarks CSLR yields 3.9–5.6 point gains in final average task metric over the strongest non-CSLR DP baseline at ε=4 under the reported replay budget, while also beating Hungarian and OT matchers. The formal privacy guarantee is scoped to replay release.

Significance. If the anchor-margin assumption is satisfied and observable in the relevant embedding spaces, the method supplies a clean identifiability mechanism that avoids extra replay data and yields a concrete sample-complexity bound. The reported empirical improvements under realistic replay budgets would then constitute a practically relevant advance for private FCL.

major comments (2)
  1. [Abstract] Abstract (and the proof of the O(log(N/η)/p) bound): the identifiability result and the justification for server-side aggregation are conditioned on the existence of an observable positive anchor-signature margin, yet the manuscript reports no measurement of signature distances, margin size, or verification that the margin is large enough to achieve the claimed probability on the sentence-embedding spaces of the classification/NER/dialogue tasks. Without this verification the link between the theoretical bound and the 3.9–5.6 point gains is unsupported.
  2. [Experimental section] The experimental section (five seeds, reported replay-release budget): the paper states that end-to-end privacy requires separate composition with a private optimizer, but provides no access to exact data splits, replay budgets per client, or the full derivation of the anchor margin used in the bound; this weakens the claim that the observed gains are attributable to the canonicalization step rather than other implementation choices.
minor comments (2)
  1. Notation for the anchor margin p and the probability η should be introduced with a dedicated definition before the bound statement.
  2. [Abstract] The abstract claims “outperforming Hungarian and optimal-transport matchers” but does not specify whether these baselines also operate under the same DP replay-release budget; a clarifying sentence would help.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address the major comments below, providing clarifications and committing to revisions where appropriate to strengthen the connection between theory and experiments.

read point-by-point responses
  1. Referee: [Abstract] Abstract (and the proof of the O(log(N/η)/p) bound): the identifiability result and the justification for server-side aggregation are conditioned on the existence of an observable positive anchor-signature margin, yet the manuscript reports no measurement of signature distances, margin size, or verification that the margin is large enough to achieve the claimed probability on the sentence-embedding spaces of the classification/NER/dialogue tasks. Without this verification the link between the theoretical bound and the 3.9–5.6 point gains is unsupported.

    Authors: We agree that explicit empirical verification of the anchor-signature margin would strengthen the manuscript. The theoretical result is conditioned on an observable positive margin, and while the empirical gains are consistent with the assumption holding in practice for the used embeddings, the manuscript does not report measurements of signature distances or margin sizes. In the revision, we will add a new subsection or appendix with these measurements on the relevant embedding spaces, including average inter-signature distances and estimated margins, to verify that the margin supports the claimed probability bounds. revision: yes

  2. Referee: [Experimental section] The experimental section (five seeds, reported replay-release budget): the paper states that end-to-end privacy requires separate composition with a private optimizer, but provides no access to exact data splits, replay budgets per client, or the full derivation of the anchor margin used in the bound; this weakens the claim that the observed gains are attributable to the canonicalization step rather than other implementation choices.

    Authors: We note that the paper uses publicly available benchmark datasets with standard train/test splits (details referenced in the experimental setup), and reports the overall replay-release budget. However, to address the concern about reproducibility and attribution of gains, we will expand the experimental section and add an appendix with per-client replay budgets, exact data split information, and a more detailed derivation of the specific anchor margin parameter p used in the bound application for the experiments. This will help isolate the contribution of the canonicalization step. revision: partial

Circularity Check

0 steps flagged

No significant circularity; bound is conditional and results are measured

full rationale

The paper states the O(log(N/η)/p) identifiability bound explicitly under the 'observable anchor-signature margin' assumption and supplies a complementary anchorless non-identifiability result. No equations or steps reduce the claimed result to its own inputs by construction, no parameters are fitted on a subset and then relabeled as predictions, and no self-citation chain is invoked as load-bearing justification. Empirical gains (3.9-5.6 points) are presented as measured experimental outcomes across seeds, not quantities defined by the proof parameters. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption of an observable anchor-signature margin that supplies identifiability; no explicit free parameters or new invented entities are stated in the abstract.

axioms (1)
  • domain assumption There exists an observable anchor-signature margin that enables identifiability of candidate replay elements
    Invoked to establish the O(log(N/η)/p) anchor bound and to justify server-side alignment

pith-pipeline@v0.9.1-grok · 5779 in / 1392 out tokens · 29147 ms · 2026-06-28T22:52:01.219376+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

42 extracted references · 8 canonical work pages

  1. [1]

    Proceedings of the 38th International Conference on Machine Learning , series =

    Jaehong Yoon and Wonyong Jeong and Giwoong Lee and Eunho Yang and Sung Ju Hwang , title =. Proceedings of the 38th International Conference on Machine Learning , series =

  2. [2]

    International Conference on Learning Representations , year =

    Daiqing Qi and Handong Zhao and Sheng Li , title =. International Conference on Learning Representations , year =

  3. [3]

    Advances in Neural Information Processing Systems , year =

    Sara Babakniya and Zalan Fabian and Chaoyang He and Mahdi Soltanolkotabi and Salman Avestimehr , title =. Advances in Neural Information Processing Systems , year =

  4. [4]

    Advances in Neural Information Processing Systems , year =

    David Rolnick and Arun Ahuja and Jonathan Schwarz and Timothy Lillicrap and Gregory Wayne , title =. Advances in Neural Information Processing Systems , year =

  5. [6]

    Proceedings of The 36th International Conference on Algorithmic Learning Theory , series =

    Mohammad Afzali and Hassan Ashtiani and Christopher Liaw , title =. Proceedings of The 36th International Conference on Algorithmic Learning Theory , series =

  6. [7]

    Advances in Neural Information Processing Systems , volume =

    Badih Ghazi and Ravi Kumar and Pasin Manurangsi , title =. Advances in Neural Information Processing Systems , volume =

  7. [8]

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

    Hao Yu and Xin Yang and Le Zhang and Hanlin Gu and Tianrui Li and Lixin Fan and Qiang Yang , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , year =

  8. [12]

    Proceedings of the 41st International Conference on Machine Learning , series =

    Hongming Piao and Yichen Wu and Dapeng Wu and Ying Wei , title =. Proceedings of the 41st International Conference on Machine Learning , series =

  9. [13]

    2023 IEEE 98th Vehicular Technology Conference, VTC 2023-Fall - Proceedings , series =

    Junyan Ouyang and Rui Han and Chi Harold Liu , title =. 2023 IEEE 98th Vehicular Technology Conference, VTC 2023-Fall - Proceedings , series =. 2023 , doi =

  10. [14]

    Brendan McMahan and Ilya Mironov and Kunal Talwar and Li Zhang , title =

    Martin Abadi and Andy Chu and Ian Goodfellow and H. Brendan McMahan and Ilya Mironov and Kunal Talwar and Li Zhang , title =. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security , year =

  11. [15]

    Brendan McMahan and Daniel Ramage and Kunal Talwar and Li Zhang , title =

    H. Brendan McMahan and Daniel Ramage and Kunal Talwar and Li Zhang , title =. International Conference on Learning Representations , year =

  12. [16]

    Proceedings of the 30th IEEE Computer Security Foundations Symposium , pages =

    Ilya Mironov , title =. Proceedings of the 30th IEEE Computer Security Foundations Symposium , pages =

  13. [17]

    Large Language Models Can Be Strong Differentially Private Learners , journal =

    Xuechen Li and Florian Tram\`. Large Language Models Can Be Strong Differentially Private Learners , journal =

  14. [18]

    Findings of the Association for Computational Linguistics: EMNLP , year =

    Lingjuan Lyu and Xuanli He and Yitong Li and Kim-Kwang Raymond Choo and Haibo Hu , title =. Findings of the Association for Computational Linguistics: EMNLP , year =

  15. [20]

    Proceedings of the AAAI Conference on Artificial Intelligence , year =

    Yue Tan and Guodong Long and Jie Ma and Lu Liu and Tianyi Zhou and Jing Jiang , title =. Proceedings of the AAAI Conference on Artificial Intelligence , year =

  16. [21]

    Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing , year =

    Nils Reimers and Iryna Gurevych , title =. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing , year =

  17. [22]

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages =

    Qinbin Li and Bingsheng He and Dawn Song , title =. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages =

  18. [23]

    Advances in Neural Information Processing Systems , volume =

    Mansheej Paul and Surya Ganguli and Gintare Karolina Dziugaite , title =. Advances in Neural Information Processing Systems , volume =

  19. [25]

    Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang

    Martin Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security

  20. [26]

    Mohammad Afzali, Hassan Ashtiani, and Christopher Liaw. 2025. Agnostic private density estimation for GMMs via list global stability. In Proceedings of The 36th International Conference on Algorithmic Learning Theory, volume 272 of Proceedings of Machine Learning Research, pages 41--66. PMLR

  21. [27]

    Sara Babakniya, Zalan Fabian, Chaoyang He, Mahdi Soltanolkotabi, and Salman Avestimehr. 2023. A data-free approach to mitigate catastrophic forgetting in federated class incremental learning for vision tasks. In Advances in Neural Information Processing Systems

  22. [28]

    Haonan Duan, Adam Dziedzic, Nicolas Papernot, and Franziska Boenisch. 2023. https://arxiv.org/abs/2305.15594 Flocks of stochastic parrots: Differentially private prompt learning for large language models . arXiv preprint arXiv:2305.15594

  23. [29]

    Badih Ghazi, Ravi Kumar, and Pasin Manurangsi. 2021. User-level private learning via correlated sampling. In Advances in Neural Information Processing Systems, volume 34, pages 20172--20184

  24. [30]

    Qinbin Li, Bingsheng He, and Dawn Song. 2021. Model-contrastive federated learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10713--10722

  25. [31]

    Xuechen Li, Florian Tram\` e r, Percy Liang, and Tatsunori Hashimoto. 2022. Large language models can be strong differentially private learners. arXiv preprint arXiv:2110.05679

  26. [32]

    Yichen Li, Weiying Xie, Guoqing Wang, and Yunsong Li. 2025. Federated continual learning: Concepts, challenges, and solutions. arXiv preprint arXiv:2502.07059

  27. [33]

    Jinglin Liang, Jin Zhong, Hanlin Gu, Zhongqi Lu, Xingxing Tang, Gang Dai, Shuangping Huang, Lixin Fan, and Qiang Yang. 2024. Diffusion-driven data replay: A novel approach to combat forgetting in federated class continual learning. arXiv preprint arXiv:2409.01128

  28. [34]

    Lingjuan Lyu, Xuanli He, Yitong Li, Kim-Kwang Raymond Choo, and Haibo Hu. 2020. Differentially private representation for NLP : Formal guarantee and an empirical study on privacy and fairness. In Findings of the Association for Computational Linguistics: EMNLP

  29. [35]

    Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang

    H. Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang. 2018. Learning differentially private recurrent language models. In International Conference on Learning Representations

  30. [36]

    Chan, Christopher G

    Yongsheng Mei, Liangqi Yuan, Dong-Jun Han, Kevin S. Chan, Christopher G. Brinton, and Tian Lan. 2024. Using diffusion models as generative replay in continual federated learning -- what will happen? arXiv preprint arXiv:2411.06618

  31. [37]

    Yongsheng Mei, Linfeng Zhao, Wei Liu, Efstathios Bakolas, and Alexandros G. Dimakis. 2025. Accurate forgetting for heterogeneous federated continual learning. arXiv preprint arXiv:2502.14205

  32. [38]

    Ilya Mironov. 2017. R\' e nyi differential privacy. In Proceedings of the 30th IEEE Computer Security Foundations Symposium, pages 263--275

  33. [39]

    Junyan Ouyang, Rui Han, and Chi Harold Liu. 2023. https://doi.org/10.1109/VTC2023-Fall60731.2023.10333463 Evaluating differential privacy in federated continual learning . In 2023 IEEE 98th Vehicular Technology Conference, VTC 2023-Fall - Proceedings, IEEE Vehicular Technology Conference. IEEE

  34. [40]

    Mansheej Paul, Surya Ganguli, and Gintare Karolina Dziugaite. 2021. Deep learning on a data diet: Finding important examples early in training. In Advances in Neural Information Processing Systems, volume 34, pages 20596--20607

  35. [41]

    Hongming Piao, Yichen Wu, Dapeng Wu, and Ying Wei. 2024. Federated continual learning via prompt-based dual knowledge transfer. In Proceedings of the 41st International Conference on Machine Learning, volume 235 of Proceedings of Machine Learning Research, pages 40725--40739. PMLR

  36. [42]

    Daiqing Qi, Handong Zhao, and Sheng Li. 2023. Better generative replay for continual federated learning. In International Conference on Learning Representations

  37. [43]

    Nils Reimers and Iryna Gurevych. 2019. Sentence- BERT : Sentence embeddings using siamese BERT -networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing

  38. [44]

    David Rolnick, Arun Ahuja, Jonathan Schwarz, Timothy Lillicrap, and Gregory Wayne. 2019. Experience replay for continual learning. In Advances in Neural Information Processing Systems

  39. [45]

    Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, and Jing Jiang. 2022. FedProto : Federated prototype learning across heterogeneous clients. In Proceedings of the AAAI Conference on Artificial Intelligence

  40. [46]

    Jaehong Yoon, Wonyong Jeong, Giwoong Lee, Eunho Yang, and Sung Ju Hwang. 2021. Federated continual learning with weighted inter-client transfer. In Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 12073--12086. PMLR

  41. [47]

    Hao Yu, Xin Yang, Le Zhang, Hanlin Gu, Tianrui Li, Lixin Fan, and Qiang Yang. 2025. Addressing spatial-temporal data heterogeneity in federated continual learning via tail anchor. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

  42. [48]

    Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Huan Sun, David Levitan, Robert Sim, and Salil Vadhan

    Xiang Yue, Huseyin A. Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Huan Sun, David Levitan, Robert Sim, and Salil Vadhan. 2023. Synthetic text generation with differential privacy: A simple and practical recipe. arXiv preprint arXiv:2210.14348