Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation
Pith reviewed 2026-05-10 00:25 UTC · model grok-4.3
The pith
PINA lets clustered federated learning keep formal privacy by initializing clusters from compressed LoRA sketches before normality-driven aggregation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose PINA, a two-stage framework that first lets each client fine-tune a lightweight low-rank adaptation (LoRA) adapter and privately share a compressed sketch of the update. The server leverages these sketches to construct robust cluster centroids. In the second stage, PINA introduces a normality-driven aggregation mechanism that improves convergence and robustness. Our method retains the benefits of clustered FL while providing formal privacy guarantees against an untrusted server.
What carries the argument
PINA two-stage framework, in which compressed LoRA sketches enable the server to build cluster centroids from noisy updates and normality-driven aggregation then refines client contributions.
If this is right
- Clients can be grouped by data similarity without the server seeing individual updates in the clear.
- Formal differential privacy holds for the entire process against an untrusted server.
- Average accuracy improves by 2.9 percent compared with prior DP-FL algorithms when epsilon is set to 2 or 8.
- Convergence becomes faster and more stable once clusters are initialized and normality-driven weighting is applied.
- The separation into initialization and refinement stages allows the benefits of clustering to survive the noise required for privacy.
Where Pith is reading between the lines
- The same sketch-based initialization could be tested with other low-rank adapters or quantization schemes beyond the LoRA variant used here.
- Normality-driven weighting might reduce sensitivity to outlier clients in settings where data distributions shift over time.
- The framework could be extended to vertical federated learning by applying the sketch step only to the shared feature space.
- Performance under stricter privacy budgets (smaller epsilon) would reveal whether the sketch compression remains sufficient.
Load-bearing premise
That the compressed sketches of LoRA updates remain informative enough for the server to form accurate clusters despite the addition of differential privacy noise.
What would settle it
An experiment on a highly heterogeneous dataset where cluster assignments produced from the sketches match random grouping and final model accuracy shows no gain over standard differentially private federated learning.
read the original abstract
Federated learning (FL) enables training of a global model while keeping raw data on end-devices. Despite this, FL has shown to leak private user information and thus in practice, it is often coupled with methods such as differential privacy (DP) and secure vector sum to provide formal privacy guarantees to its participants. In realistic cross-device deployments, the data are highly heterogeneous, so vanilla federated learning converges slowly and generalizes poorly. Clustered federated learning (CFL) mitigates this by segregating users into clusters, leading to lower intra-cluster data heterogeneity. Nevertheless, coupling CFL with DP remains challenging: the injected DP noise makes individual client updates excessively noisy, and the server is unable to initialize cluster centroids with the less noisy aggregated updates. To address this challenge, we propose PINA, a two-stage framework that first lets each client fine-tune a lightweight low-rank adaptation (LoRA) adapter and privately share a compressed sketch of the update. The server leverages these sketches to construct robust cluster centroids. In the second stage, PINA introduces a normality-driven aggregation mechanism that improves convergence and robustness. Our method retains the benefits of clustered FL while providing formal privacy guarantees against an untrusted server. Extensive evaluations show that our proposed method outperforms state-of-the-art DP-FL algorithms by an average of 2.9% in accuracy for privacy budgets (epsilon in {2, 8}).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents PINA, a two-stage framework for differentially private clustered federated learning. Clients fine-tune lightweight LoRA adapters and privately share compressed sketches of updates, allowing the server to initialize robust cluster centroids despite DP noise. The second stage applies a normality-driven aggregation rule to improve convergence and robustness. The central claims are that the approach preserves CFL benefits under formal privacy guarantees against an untrusted server and delivers an average 2.9% accuracy gain over state-of-the-art DP-FL baselines for ε ∈ {2, 8}.
Significance. If the empirical results and privacy accounting hold, the work provides a practical solution to the long-standing tension between clustering for heterogeneity and the noise introduced by DP in federated settings. The LoRA-sketch initialization and normality-driven aggregation are technically interesting integrations that could influence future DP-FL designs. Strengths include the explicit two-stage construction, formal privacy analysis, and consistent gains across evaluated datasets and budgets; these elements make the contribution substantive for both theory and deployment.
minor comments (4)
- Abstract: the 2.9% average gain is stated without naming the datasets, number of clients, or number of runs; adding these details would strengthen the claim for readers.
- §4.1: the compression ratio and sketch dimension for LoRA updates are introduced without an accompanying sensitivity analysis or ablation on how these parameters trade off clustering quality versus communication cost.
- Table 3: the reported accuracy improvements lack error bars or standard deviations across random seeds, making it difficult to assess whether the 2.9% margin is statistically reliable.
- §5.3: the normality-driven aggregation rule is motivated heuristically; a short derivation or reference showing why the chosen statistic is robust to the specific DP noise distribution would improve clarity.
Simulated Author's Rebuttal
We thank the referee for the positive summary, recognition of the technical contributions of the LoRA-sketch initialization and normality-driven aggregation, and the recommendation for minor revision. The assessment that the work addresses a practical tension between clustering and DP noise is appreciated. No specific major comments were raised in the report.
Circularity Check
No significant circularity detected in derivation chain
full rationale
The paper introduces PINA as a two-stage algorithmic framework for DP clustered FL, relying on LoRA fine-tuning, compressed sketches for centroid initialization, and a normality-driven aggregation step. No equations, derivations, or first-principles results are presented in the provided text that reduce by construction to fitted inputs, self-definitions, or self-citation chains. The performance claims (e.g., 2.9% accuracy gain) are framed as empirical outcomes on evaluated datasets rather than predictions forced by the method's own parameters. Privacy accounting and clustering robustness are described as independent formal and algorithmic contributions without evident self-referential loops or renamed known results.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
INTRODUCTION Federated learning (FL) enables a distributed group of edge devices to collaboratively train a shared model while keeping raw user data on-device [1]. Despite this, the exchanged gradients or model up- dates can reveal statistical fingerprints that compromise user privacy [2]. Differential privacy (DP) [3] protects against such inferences by ...
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[2]
PRELIMINARY 2.1. Federated Learning (FL) Overview of FL: At the start of each communication roundt, a global modelW t is provided by the server and a randomly sampled user setK t is constructed. Each userk∈ K t trains the model locally to obtainW t k and shares the model difference ∆t k =W t k −W t back to the server. The server aggregates the updatesW t+...
-
[3]
OUR METHOD: PINA 3.1. Overview Our proposed method PINA consists of two stages: (1) Cluster Model Initialization and (2) Clustered Model Training. In (1), we privately initialize cluster models from user updates; and in (2) we perform cluster identification and model training in a federated setting, privately updating global cluster models. The workflow o...
work page 2034
-
[4]
EXPERIMENTS Experimental settings:We use privacy budget ofϵ∈ {2,8}which are commonly used in existing works [8, 9] andδ= 1 |K|1.1 [2]. Fol- lowing [2, 31, 26], we simulate a cohort size of 10k with a smaller cohort size to achieve a more realistic signal-to-noise ratio which represents industry scale more closely. We use rotated CIFAR-10 (C= 2), rotated F...
-
[5]
CONCLUSION In this work, we propose PINA, a privacy-preserving clustered FL framework that effectively mitigates data heterogeneity in DP-FL. By combining privatized client sketches for robust initialization and a normality-driven aggregation mechanism that accounts for imbal- anced contributions, PINA achieves superior performance on non- IID data withou...
-
[6]
Communication-efficient learning of deep net- works from decentralized data,
Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hamp- son, et al., “Communication-efficient learning of deep net- works from decentralized data,” inAISTATS, 2017
work page 2017
-
[7]
Learning differentially private recurrent language models,
H Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang, “Learning differentially private recurrent language models,”ICLR, 2018
work page 2018
-
[8]
Calibrating noise to sensitivity in private data analy- sis,
Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith, “Calibrating noise to sensitivity in private data analy- sis,” inTheory of Cryptography, 2006
work page 2006
-
[9]
Shiva Prasad Kasiviswanathan, Homin K Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith, “What can we learn privately?,”SIAM Journal on Computing, 2011
work page 2011
-
[10]
Practical secure aggregation for privacy- preserving machine learning,
Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, et al., “Practical secure aggregation for privacy- preserving machine learning,” inCCS, 2017
work page 2017
-
[11]
Benchmarking secure sam- pling protocols for differential privacy,
Yucheng Fu and Tianhao Wang, “Benchmarking secure sam- pling protocols for differential privacy,” inCCS, 2024
work page 2024
-
[12]
Federated learning: Challenges, methods, and future directions,
Tian Li, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith, “Federated learning: Challenges, methods, and future directions,”IEEE signal processing magazine, 2020
work page 2020
-
[13]
Federated learning with differential privacy: Algo- rithms and performance analysis,
Kang Wei, Jun Li, Ming Ding, Chuan Ma, Howard H Yang, et al., “Federated learning with differential privacy: Algo- rithms and performance analysis,”IEEE TIFS, 2020
work page 2020
-
[14]
Differentially private federated learning on heterogeneous data,
Maxence Noble, Aur ´elien Bellet, and Aymeric Dieuleveut, “Differentially private federated learning on heterogeneous data,” inAISTATS, 2022
work page 2022
-
[15]
Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, and Yinzhi Cao, “{PrivateFL}: Accurate, differentially private federated learning via personalized data transformation,” inUSENIX Se- curity, 2023
work page 2023
-
[16]
An efficient framework for clustered federated learning,
Avishek Ghosh, Jichan Chung, Dong Yin, and Kannan Ram- chandran, “An efficient framework for clustered federated learning,”NeurIPS, 2020
work page 2020
-
[17]
Felix Sattler, Klaus-Robert M ¨uller, and Wojciech Samek, “Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints,”IEEE TNNLS, 2020
work page 2020
-
[18]
The algorithmic foun- dations of differential privacy,
Cynthia Dwork, Aaron Roth, et al., “The algorithmic foun- dations of differential privacy,”Foundations and Trends® in Theoretical Computer Science, 2014
work page 2014
-
[19]
Deep learn- ing with differential privacy,
Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMa- han, Ilya Mironov, Kunal Talwar, and Li Zhang, “Deep learn- ing with differential privacy,” inCCS, 2016
work page 2016
-
[20]
Ilya Mironov, “R ´enyi differential privacy,” in2017 IEEE 30th computer security foundations symposium (CSF), 2017
work page 2017
-
[21]
Hypothesis testing interpretations and Renyi differential privacy,
Borja Balle, Gilles Barthe, Marco Gaboardi, Justin Hsu, and Tetsuya Sato, “Hypothesis testing interpretations and Renyi differential privacy,” inInternational Conference on Artificial Intelligence and Statistics. PMLR, 2020
work page 2020
-
[22]
Privacy am- plification by subsampling: Tight analyses via couplings and divergences,
Borja Balle, Gilles Barthe, and Marco Gaboardi, “Privacy am- plification by subsampling: Tight analyses via couplings and divergences,” inNeurIPS, 2018
work page 2018
-
[23]
LoRA: Low-rank adaptation of large language models,
Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, et al., “LoRA: Low-rank adaptation of large language models,” inICLR, 2022
work page 2022
-
[24]
Saeed Vahidian, Mahdi Morafah, Weijia Wang, Vyacheslav Kungurtsev, Chen Chen, et al., “Efficient distribution similar- ity identification in clustered federated learning via principal angles between client data subspaces,” inAAAI, 2023
work page 2023
-
[25]
Clustered feder- ated learning with adaptive local differential privacy on hetero- geneous IoT data,
Zaobo He, Lintao Wang, and Zhipeng Cai, “Clustered feder- ated learning with adaptive local differential privacy on hetero- geneous IoT data,”IEEE IoT, 2024
work page 2024
-
[26]
Saber Malekmohammadi, Afaf Taik, and Golnoosh Farnadi, “Mitigating disparate impact of differential privacy in feder- ated learning through robust clustering,”arXiv, 2024
work page 2024
-
[27]
Scaling language model size in cross-device federated learning,
Jae Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, Ananda Suresh, et al., “Scaling language model size in cross-device federated learning,” inFL4NLP, 2022
work page 2022
-
[28]
Secure aggregation for clus- tered federated learning,
Hasin Us Sami and Bas ¸ak G¨uler, “Secure aggregation for clus- tered federated learning,” inISIT, 2023
work page 2023
-
[29]
Clus- terguard: Secure clustered aggregation for federated learning with robustness,
Yulin Zhao, Zhiguo Wan, Zhangshuang Guan, et al., “Clus- terguard: Secure clustered aggregation for federated learning with robustness,”Cryptology ePrint Archive, 2024
work page 2024
-
[30]
Federated learning from pre-trained models: A contrastive learning approach,
Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, and Jing Jiang, “Federated learning from pre-trained models: A contrastive learning approach,”NeurIPS, 2022
work page 2022
-
[31]
Jie Xu, Karthikeyan Saravanan, Rogier van Dalen, Haaris Mehmood, David Tuckey, and Mete Ozay, “DP-DyLoRA: Fine-tuning transformer-based models on-device under dif- ferentially private federated learning using dynamic low-rank adaptation,”arXiv preprint arXiv:2405.06368, 2024
-
[32]
Rethinking architecture design for tackling data heterogeneity in federated learning,
Liangqiong Qu, Yuyin Zhou, Paul Pu Liang, Yingda Xia, Feifei Wang, et al., “Rethinking architecture design for tackling data heterogeneity in federated learning,” inCVPR, 2022
work page 2022
-
[33]
A hybrid approach to privacy-preserving federated learning,
Stacey Truex, Nathalie Baracaldo, Ali Anwar, Thomas Steinke, Heiko Ludwig, Rui Zhang, and Yi Zhou, “A hybrid approach to privacy-preserving federated learning,” inAISec, 2019
work page 2019
-
[34]
A comprehensive com- parison of multiparty secure additions with differential pri- vacy,
Slawomir Goryczka and Li Xiong, “A comprehensive com- parison of multiparty secure additions with differential pri- vacy,”IEEE Transactions on Dependable and Secure Com- puting, 2017
work page 2017
-
[35]
An analysis of variance test for normality (complete samples),
Samuel Sanford Shapiro and Martin B Wilk, “An analysis of variance test for normality (complete samples),”Biometrika, 1965
work page 1965
-
[36]
FLAIR: Federated learning annotated image repository,
Congzheng Song, Filip Granqvist, and Kunal Talwar, “FLAIR: Federated learning annotated image repository,”NeurIPS, 2022
work page 2022
-
[37]
Federated optimiza- tion in heterogeneous networks,
Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith, “Federated optimiza- tion in heterogeneous networks,”MLSys, 2020
work page 2020
-
[38]
Tackling the objective inconsistency problem in heterogeneous federated optimization,
Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, and H Vin- cent Poor, “Tackling the objective inconsistency problem in heterogeneous federated optimization,”NeurIPS, 2020
work page 2020
-
[39]
Scaffold: Stochastic controlled aver- aging for federated learning,
Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank Reddi, et al., “Scaffold: Stochastic controlled aver- aging for federated learning,” inICML, 2020
work page 2020
-
[40]
Breaking the centralized barrier for cross-device federated learning,
Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, et al., “Breaking the centralized barrier for cross-device federated learning,”NeurIPS, 2021
work page 2021
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.