pith. sign in

arxiv: 2604.20596 · v1 · submitted 2026-04-22 · 💻 cs.LG · cs.CR

Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation

Pith reviewed 2026-05-10 00:25 UTC · model grok-4.3

classification 💻 cs.LG cs.CR
keywords differentially private federated learningclustered federated learningLoRA adaptationprivacy-preserving initializationnormality-driven aggregationcross-device heterogeneitydifferential privacycompressed sketches
0
0 comments X

The pith

PINA lets clustered federated learning keep formal privacy by initializing clusters from compressed LoRA sketches before normality-driven aggregation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a two-stage method called PINA that combines clustered federated learning with differential privacy. In the first stage, each client fine-tunes a low-rank adapter and sends a compressed sketch of the update so the server can form clusters without seeing raw noisy updates. In the second stage, a normality-driven aggregation step refines the process for better convergence. A sympathetic reader would care because this keeps the accuracy gains from clustering while adding formal privacy protections against an untrusted server. Evaluations indicate the approach yields higher accuracy than prior differentially private federated methods under the same privacy budgets.

Core claim

We propose PINA, a two-stage framework that first lets each client fine-tune a lightweight low-rank adaptation (LoRA) adapter and privately share a compressed sketch of the update. The server leverages these sketches to construct robust cluster centroids. In the second stage, PINA introduces a normality-driven aggregation mechanism that improves convergence and robustness. Our method retains the benefits of clustered FL while providing formal privacy guarantees against an untrusted server.

What carries the argument

PINA two-stage framework, in which compressed LoRA sketches enable the server to build cluster centroids from noisy updates and normality-driven aggregation then refines client contributions.

If this is right

  • Clients can be grouped by data similarity without the server seeing individual updates in the clear.
  • Formal differential privacy holds for the entire process against an untrusted server.
  • Average accuracy improves by 2.9 percent compared with prior DP-FL algorithms when epsilon is set to 2 or 8.
  • Convergence becomes faster and more stable once clusters are initialized and normality-driven weighting is applied.
  • The separation into initialization and refinement stages allows the benefits of clustering to survive the noise required for privacy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same sketch-based initialization could be tested with other low-rank adapters or quantization schemes beyond the LoRA variant used here.
  • Normality-driven weighting might reduce sensitivity to outlier clients in settings where data distributions shift over time.
  • The framework could be extended to vertical federated learning by applying the sketch step only to the shared feature space.
  • Performance under stricter privacy budgets (smaller epsilon) would reveal whether the sketch compression remains sufficient.

Load-bearing premise

That the compressed sketches of LoRA updates remain informative enough for the server to form accurate clusters despite the addition of differential privacy noise.

What would settle it

An experiment on a highly heterogeneous dataset where cluster assignments produced from the sketches match random grouping and final model accuracy shows no gain over standard differentially private federated learning.

read the original abstract

Federated learning (FL) enables training of a global model while keeping raw data on end-devices. Despite this, FL has shown to leak private user information and thus in practice, it is often coupled with methods such as differential privacy (DP) and secure vector sum to provide formal privacy guarantees to its participants. In realistic cross-device deployments, the data are highly heterogeneous, so vanilla federated learning converges slowly and generalizes poorly. Clustered federated learning (CFL) mitigates this by segregating users into clusters, leading to lower intra-cluster data heterogeneity. Nevertheless, coupling CFL with DP remains challenging: the injected DP noise makes individual client updates excessively noisy, and the server is unable to initialize cluster centroids with the less noisy aggregated updates. To address this challenge, we propose PINA, a two-stage framework that first lets each client fine-tune a lightweight low-rank adaptation (LoRA) adapter and privately share a compressed sketch of the update. The server leverages these sketches to construct robust cluster centroids. In the second stage, PINA introduces a normality-driven aggregation mechanism that improves convergence and robustness. Our method retains the benefits of clustered FL while providing formal privacy guarantees against an untrusted server. Extensive evaluations show that our proposed method outperforms state-of-the-art DP-FL algorithms by an average of 2.9% in accuracy for privacy budgets (epsilon in {2, 8}).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 4 minor

Summary. The manuscript presents PINA, a two-stage framework for differentially private clustered federated learning. Clients fine-tune lightweight LoRA adapters and privately share compressed sketches of updates, allowing the server to initialize robust cluster centroids despite DP noise. The second stage applies a normality-driven aggregation rule to improve convergence and robustness. The central claims are that the approach preserves CFL benefits under formal privacy guarantees against an untrusted server and delivers an average 2.9% accuracy gain over state-of-the-art DP-FL baselines for ε ∈ {2, 8}.

Significance. If the empirical results and privacy accounting hold, the work provides a practical solution to the long-standing tension between clustering for heterogeneity and the noise introduced by DP in federated settings. The LoRA-sketch initialization and normality-driven aggregation are technically interesting integrations that could influence future DP-FL designs. Strengths include the explicit two-stage construction, formal privacy analysis, and consistent gains across evaluated datasets and budgets; these elements make the contribution substantive for both theory and deployment.

minor comments (4)
  1. Abstract: the 2.9% average gain is stated without naming the datasets, number of clients, or number of runs; adding these details would strengthen the claim for readers.
  2. §4.1: the compression ratio and sketch dimension for LoRA updates are introduced without an accompanying sensitivity analysis or ablation on how these parameters trade off clustering quality versus communication cost.
  3. Table 3: the reported accuracy improvements lack error bars or standard deviations across random seeds, making it difficult to assess whether the 2.9% margin is statistically reliable.
  4. §5.3: the normality-driven aggregation rule is motivated heuristically; a short derivation or reference showing why the chosen statistic is robust to the specific DP noise distribution would improve clarity.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of the technical contributions of the LoRA-sketch initialization and normality-driven aggregation, and the recommendation for minor revision. The assessment that the work addresses a practical tension between clustering and DP noise is appreciated. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper introduces PINA as a two-stage algorithmic framework for DP clustered FL, relying on LoRA fine-tuning, compressed sketches for centroid initialization, and a normality-driven aggregation step. No equations, derivations, or first-principles results are presented in the provided text that reduce by construction to fitted inputs, self-definitions, or self-citation chains. The performance claims (e.g., 2.9% accuracy gain) are framed as empirical outcomes on evaluated datasets rather than predictions forced by the method's own parameters. Privacy accounting and clustering robustness are described as independent formal and algorithmic contributions without evident self-referential loops or renamed known results.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no mathematical derivations, equations, or implementation details, so no free parameters, axioms, or invented entities can be identified from the provided text.

pith-pipeline@v0.9.0 · 5573 in / 1139 out tokens · 47868 ms · 2026-05-10T00:25:40.661196+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages · 1 internal anchor

  1. [1]

    Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation

    INTRODUCTION Federated learning (FL) enables a distributed group of edge devices to collaboratively train a shared model while keeping raw user data on-device [1]. Despite this, the exchanged gradients or model up- dates can reveal statistical fingerprints that compromise user privacy [2]. Differential privacy (DP) [3] protects against such inferences by ...

  2. [2]

    Federated Learning (FL) Overview of FL: At the start of each communication roundt, a global modelW t is provided by the server and a randomly sampled user setK t is constructed

    PRELIMINARY 2.1. Federated Learning (FL) Overview of FL: At the start of each communication roundt, a global modelW t is provided by the server and a randomly sampled user setK t is constructed. Each userk∈ K t trains the model locally to obtainW t k and shares the model difference ∆t k =W t k −W t back to the server. The server aggregates the updatesW t+...

  3. [3]

    Overview Our proposed method PINA consists of two stages: (1) Cluster Model Initialization and (2) Clustered Model Training

    OUR METHOD: PINA 3.1. Overview Our proposed method PINA consists of two stages: (1) Cluster Model Initialization and (2) Clustered Model Training. In (1), we privately initialize cluster models from user updates; and in (2) we perform cluster identification and model training in a federated setting, privately updating global cluster models. The workflow o...

  4. [4]

    Fol- lowing [2, 31, 26], we simulate a cohort size of 10k with a smaller cohort size to achieve a more realistic signal-to-noise ratio which represents industry scale more closely

    EXPERIMENTS Experimental settings:We use privacy budget ofϵ∈ {2,8}which are commonly used in existing works [8, 9] andδ= 1 |K|1.1 [2]. Fol- lowing [2, 31, 26], we simulate a cohort size of 10k with a smaller cohort size to achieve a more realistic signal-to-noise ratio which represents industry scale more closely. We use rotated CIFAR-10 (C= 2), rotated F...

  5. [5]

    CONCLUSION In this work, we propose PINA, a privacy-preserving clustered FL framework that effectively mitigates data heterogeneity in DP-FL. By combining privatized client sketches for robust initialization and a normality-driven aggregation mechanism that accounts for imbal- anced contributions, PINA achieves superior performance on non- IID data withou...

  6. [6]

    Communication-efficient learning of deep net- works from decentralized data,

    Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hamp- son, et al., “Communication-efficient learning of deep net- works from decentralized data,” inAISTATS, 2017

  7. [7]

    Learning differentially private recurrent language models,

    H Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang, “Learning differentially private recurrent language models,”ICLR, 2018

  8. [8]

    Calibrating noise to sensitivity in private data analy- sis,

    Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith, “Calibrating noise to sensitivity in private data analy- sis,” inTheory of Cryptography, 2006

  9. [9]

    What can we learn privately?,

    Shiva Prasad Kasiviswanathan, Homin K Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith, “What can we learn privately?,”SIAM Journal on Computing, 2011

  10. [10]

    Practical secure aggregation for privacy- preserving machine learning,

    Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, et al., “Practical secure aggregation for privacy- preserving machine learning,” inCCS, 2017

  11. [11]

    Benchmarking secure sam- pling protocols for differential privacy,

    Yucheng Fu and Tianhao Wang, “Benchmarking secure sam- pling protocols for differential privacy,” inCCS, 2024

  12. [12]

    Federated learning: Challenges, methods, and future directions,

    Tian Li, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith, “Federated learning: Challenges, methods, and future directions,”IEEE signal processing magazine, 2020

  13. [13]

    Federated learning with differential privacy: Algo- rithms and performance analysis,

    Kang Wei, Jun Li, Ming Ding, Chuan Ma, Howard H Yang, et al., “Federated learning with differential privacy: Algo- rithms and performance analysis,”IEEE TIFS, 2020

  14. [14]

    Differentially private federated learning on heterogeneous data,

    Maxence Noble, Aur ´elien Bellet, and Aymeric Dieuleveut, “Differentially private federated learning on heterogeneous data,” inAISTATS, 2022

  15. [15]

    {PrivateFL}: Accurate, differentially private federated learning via personalized data transformation,

    Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, and Yinzhi Cao, “{PrivateFL}: Accurate, differentially private federated learning via personalized data transformation,” inUSENIX Se- curity, 2023

  16. [16]

    An efficient framework for clustered federated learning,

    Avishek Ghosh, Jichan Chung, Dong Yin, and Kannan Ram- chandran, “An efficient framework for clustered federated learning,”NeurIPS, 2020

  17. [17]

    Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints,

    Felix Sattler, Klaus-Robert M ¨uller, and Wojciech Samek, “Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints,”IEEE TNNLS, 2020

  18. [18]

    The algorithmic foun- dations of differential privacy,

    Cynthia Dwork, Aaron Roth, et al., “The algorithmic foun- dations of differential privacy,”Foundations and Trends® in Theoretical Computer Science, 2014

  19. [19]

    Deep learn- ing with differential privacy,

    Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMa- han, Ilya Mironov, Kunal Talwar, and Li Zhang, “Deep learn- ing with differential privacy,” inCCS, 2016

  20. [20]

    R ´enyi differential privacy,

    Ilya Mironov, “R ´enyi differential privacy,” in2017 IEEE 30th computer security foundations symposium (CSF), 2017

  21. [21]

    Hypothesis testing interpretations and Renyi differential privacy,

    Borja Balle, Gilles Barthe, Marco Gaboardi, Justin Hsu, and Tetsuya Sato, “Hypothesis testing interpretations and Renyi differential privacy,” inInternational Conference on Artificial Intelligence and Statistics. PMLR, 2020

  22. [22]

    Privacy am- plification by subsampling: Tight analyses via couplings and divergences,

    Borja Balle, Gilles Barthe, and Marco Gaboardi, “Privacy am- plification by subsampling: Tight analyses via couplings and divergences,” inNeurIPS, 2018

  23. [23]

    LoRA: Low-rank adaptation of large language models,

    Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, et al., “LoRA: Low-rank adaptation of large language models,” inICLR, 2022

  24. [24]

    Efficient distribution similar- ity identification in clustered federated learning via principal angles between client data subspaces,

    Saeed Vahidian, Mahdi Morafah, Weijia Wang, Vyacheslav Kungurtsev, Chen Chen, et al., “Efficient distribution similar- ity identification in clustered federated learning via principal angles between client data subspaces,” inAAAI, 2023

  25. [25]

    Clustered feder- ated learning with adaptive local differential privacy on hetero- geneous IoT data,

    Zaobo He, Lintao Wang, and Zhipeng Cai, “Clustered feder- ated learning with adaptive local differential privacy on hetero- geneous IoT data,”IEEE IoT, 2024

  26. [26]

    Mitigating disparate impact of differential privacy in feder- ated learning through robust clustering,

    Saber Malekmohammadi, Afaf Taik, and Golnoosh Farnadi, “Mitigating disparate impact of differential privacy in feder- ated learning through robust clustering,”arXiv, 2024

  27. [27]

    Scaling language model size in cross-device federated learning,

    Jae Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, Ananda Suresh, et al., “Scaling language model size in cross-device federated learning,” inFL4NLP, 2022

  28. [28]

    Secure aggregation for clus- tered federated learning,

    Hasin Us Sami and Bas ¸ak G¨uler, “Secure aggregation for clus- tered federated learning,” inISIT, 2023

  29. [29]

    Clus- terguard: Secure clustered aggregation for federated learning with robustness,

    Yulin Zhao, Zhiguo Wan, Zhangshuang Guan, et al., “Clus- terguard: Secure clustered aggregation for federated learning with robustness,”Cryptology ePrint Archive, 2024

  30. [30]

    Federated learning from pre-trained models: A contrastive learning approach,

    Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, and Jing Jiang, “Federated learning from pre-trained models: A contrastive learning approach,”NeurIPS, 2022

  31. [31]

    Dp-dylora: Fine-tuning transformer-based models on-device under differentially private federated learning using dynamic low-rank adaptation

    Jie Xu, Karthikeyan Saravanan, Rogier van Dalen, Haaris Mehmood, David Tuckey, and Mete Ozay, “DP-DyLoRA: Fine-tuning transformer-based models on-device under dif- ferentially private federated learning using dynamic low-rank adaptation,”arXiv preprint arXiv:2405.06368, 2024

  32. [32]

    Rethinking architecture design for tackling data heterogeneity in federated learning,

    Liangqiong Qu, Yuyin Zhou, Paul Pu Liang, Yingda Xia, Feifei Wang, et al., “Rethinking architecture design for tackling data heterogeneity in federated learning,” inCVPR, 2022

  33. [33]

    A hybrid approach to privacy-preserving federated learning,

    Stacey Truex, Nathalie Baracaldo, Ali Anwar, Thomas Steinke, Heiko Ludwig, Rui Zhang, and Yi Zhou, “A hybrid approach to privacy-preserving federated learning,” inAISec, 2019

  34. [34]

    A comprehensive com- parison of multiparty secure additions with differential pri- vacy,

    Slawomir Goryczka and Li Xiong, “A comprehensive com- parison of multiparty secure additions with differential pri- vacy,”IEEE Transactions on Dependable and Secure Com- puting, 2017

  35. [35]

    An analysis of variance test for normality (complete samples),

    Samuel Sanford Shapiro and Martin B Wilk, “An analysis of variance test for normality (complete samples),”Biometrika, 1965

  36. [36]

    FLAIR: Federated learning annotated image repository,

    Congzheng Song, Filip Granqvist, and Kunal Talwar, “FLAIR: Federated learning annotated image repository,”NeurIPS, 2022

  37. [37]

    Federated optimiza- tion in heterogeneous networks,

    Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith, “Federated optimiza- tion in heterogeneous networks,”MLSys, 2020

  38. [38]

    Tackling the objective inconsistency problem in heterogeneous federated optimization,

    Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, and H Vin- cent Poor, “Tackling the objective inconsistency problem in heterogeneous federated optimization,”NeurIPS, 2020

  39. [39]

    Scaffold: Stochastic controlled aver- aging for federated learning,

    Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank Reddi, et al., “Scaffold: Stochastic controlled aver- aging for federated learning,” inICML, 2020

  40. [40]

    Breaking the centralized barrier for cross-device federated learning,

    Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, et al., “Breaking the centralized barrier for cross-device federated learning,”NeurIPS, 2021