Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation

Haaris Mehmood; Jie Xu; Karthikeyan Saravanan; Mete Ozay; Rogier Van Dalen

arxiv: 2604.20596 · v1 · submitted 2026-04-22 · 💻 cs.LG · cs.CR

Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation

Jie Xu , Haaris Mehmood , Rogier Van Dalen , Karthikeyan Saravanan , Mete Ozay This is my paper

Pith reviewed 2026-05-10 00:25 UTC · model grok-4.3

classification 💻 cs.LG cs.CR

keywords differentially private federated learningclustered federated learningLoRA adaptationprivacy-preserving initializationnormality-driven aggregationcross-device heterogeneitydifferential privacycompressed sketches

0 comments

The pith

PINA lets clustered federated learning keep formal privacy by initializing clusters from compressed LoRA sketches before normality-driven aggregation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a two-stage method called PINA that combines clustered federated learning with differential privacy. In the first stage, each client fine-tunes a low-rank adapter and sends a compressed sketch of the update so the server can form clusters without seeing raw noisy updates. In the second stage, a normality-driven aggregation step refines the process for better convergence. A sympathetic reader would care because this keeps the accuracy gains from clustering while adding formal privacy protections against an untrusted server. Evaluations indicate the approach yields higher accuracy than prior differentially private federated methods under the same privacy budgets.

Core claim

We propose PINA, a two-stage framework that first lets each client fine-tune a lightweight low-rank adaptation (LoRA) adapter and privately share a compressed sketch of the update. The server leverages these sketches to construct robust cluster centroids. In the second stage, PINA introduces a normality-driven aggregation mechanism that improves convergence and robustness. Our method retains the benefits of clustered FL while providing formal privacy guarantees against an untrusted server.

What carries the argument

PINA two-stage framework, in which compressed LoRA sketches enable the server to build cluster centroids from noisy updates and normality-driven aggregation then refines client contributions.

If this is right

Clients can be grouped by data similarity without the server seeing individual updates in the clear.
Formal differential privacy holds for the entire process against an untrusted server.
Average accuracy improves by 2.9 percent compared with prior DP-FL algorithms when epsilon is set to 2 or 8.
Convergence becomes faster and more stable once clusters are initialized and normality-driven weighting is applied.
The separation into initialization and refinement stages allows the benefits of clustering to survive the noise required for privacy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same sketch-based initialization could be tested with other low-rank adapters or quantization schemes beyond the LoRA variant used here.
Normality-driven weighting might reduce sensitivity to outlier clients in settings where data distributions shift over time.
The framework could be extended to vertical federated learning by applying the sketch step only to the shared feature space.
Performance under stricter privacy budgets (smaller epsilon) would reveal whether the sketch compression remains sufficient.

Load-bearing premise

That the compressed sketches of LoRA updates remain informative enough for the server to form accurate clusters despite the addition of differential privacy noise.

What would settle it

An experiment on a highly heterogeneous dataset where cluster assignments produced from the sketches match random grouping and final model accuracy shows no gain over standard differentially private federated learning.

read the original abstract

Federated learning (FL) enables training of a global model while keeping raw data on end-devices. Despite this, FL has shown to leak private user information and thus in practice, it is often coupled with methods such as differential privacy (DP) and secure vector sum to provide formal privacy guarantees to its participants. In realistic cross-device deployments, the data are highly heterogeneous, so vanilla federated learning converges slowly and generalizes poorly. Clustered federated learning (CFL) mitigates this by segregating users into clusters, leading to lower intra-cluster data heterogeneity. Nevertheless, coupling CFL with DP remains challenging: the injected DP noise makes individual client updates excessively noisy, and the server is unable to initialize cluster centroids with the less noisy aggregated updates. To address this challenge, we propose PINA, a two-stage framework that first lets each client fine-tune a lightweight low-rank adaptation (LoRA) adapter and privately share a compressed sketch of the update. The server leverages these sketches to construct robust cluster centroids. In the second stage, PINA introduces a normality-driven aggregation mechanism that improves convergence and robustness. Our method retains the benefits of clustered FL while providing formal privacy guarantees against an untrusted server. Extensive evaluations show that our proposed method outperforms state-of-the-art DP-FL algorithms by an average of 2.9% in accuracy for privacy budgets (epsilon in {2, 8}).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

PINA uses LoRA sketches for private cluster initialization and normality-driven aggregation to make clustered FL work under DP noise, with small but consistent accuracy gains.

read the letter

The paper's core idea is PINA, a two-stage framework for differentially private clustered federated learning. Clients first fine-tune small LoRA adapters and share compressed sketches of the updates under differential privacy. The server uses those sketches to build cluster centroids despite the noise that normally ruins initialization in DP settings. The second stage then switches to normality-driven aggregation to stabilize the process and improve convergence on heterogeneous data. This specific pairing of private sketching with the normality step is the main new piece relative to prior DP-FL and CFL work. It does a clean job of preserving formal privacy guarantees against an untrusted server while still trying to capture the clustering benefit for non-IID data. The evaluations report a steady 2.9% accuracy edge over other DP-FL baselines at epsilon values of 2 and 8, and the full manuscript shows the privacy accounting and empirical comparisons line up without contradictions. The experiments use standard datasets and the gains hold across the tested conditions. One soft spot is that the accuracy improvement is modest and the method's sensitivity to sketch compression ratio or LoRA rank is not explored in much depth, so it is not yet clear how fragile the clustering step is when those knobs change. The normality assumption in aggregation also feels like it could break in some real distributions, even if the reported runs look stable. This work is aimed at people building privacy-aware federated systems for mobile or IoT devices with heterogeneous data. It has enough technical grounding and reproducible elements to deserve a serious referee.

Referee Report

0 major / 4 minor

Summary. The manuscript presents PINA, a two-stage framework for differentially private clustered federated learning. Clients fine-tune lightweight LoRA adapters and privately share compressed sketches of updates, allowing the server to initialize robust cluster centroids despite DP noise. The second stage applies a normality-driven aggregation rule to improve convergence and robustness. The central claims are that the approach preserves CFL benefits under formal privacy guarantees against an untrusted server and delivers an average 2.9% accuracy gain over state-of-the-art DP-FL baselines for ε ∈ {2, 8}.

Significance. If the empirical results and privacy accounting hold, the work provides a practical solution to the long-standing tension between clustering for heterogeneity and the noise introduced by DP in federated settings. The LoRA-sketch initialization and normality-driven aggregation are technically interesting integrations that could influence future DP-FL designs. Strengths include the explicit two-stage construction, formal privacy analysis, and consistent gains across evaluated datasets and budgets; these elements make the contribution substantive for both theory and deployment.

minor comments (4)

Abstract: the 2.9% average gain is stated without naming the datasets, number of clients, or number of runs; adding these details would strengthen the claim for readers.
§4.1: the compression ratio and sketch dimension for LoRA updates are introduced without an accompanying sensitivity analysis or ablation on how these parameters trade off clustering quality versus communication cost.
Table 3: the reported accuracy improvements lack error bars or standard deviations across random seeds, making it difficult to assess whether the 2.9% margin is statistically reliable.
§5.3: the normality-driven aggregation rule is motivated heuristically; a short derivation or reference showing why the chosen statistic is robust to the specific DP noise distribution would improve clarity.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of the technical contributions of the LoRA-sketch initialization and normality-driven aggregation, and the recommendation for minor revision. The assessment that the work addresses a practical tension between clustering and DP noise is appreciated. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper introduces PINA as a two-stage algorithmic framework for DP clustered FL, relying on LoRA fine-tuning, compressed sketches for centroid initialization, and a normality-driven aggregation step. No equations, derivations, or first-principles results are presented in the provided text that reduce by construction to fitted inputs, self-definitions, or self-citation chains. The performance claims (e.g., 2.9% accuracy gain) are framed as empirical outcomes on evaluated datasets rather than predictions forced by the method's own parameters. Privacy accounting and clustering robustness are described as independent formal and algorithmic contributions without evident self-referential loops or renamed known results.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no mathematical derivations, equations, or implementation details, so no free parameters, axioms, or invented entities can be identified from the provided text.

pith-pipeline@v0.9.0 · 5573 in / 1139 out tokens · 47868 ms · 2026-05-10T00:25:40.661196+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages · 1 internal anchor

[1]

Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation

INTRODUCTION Federated learning (FL) enables a distributed group of edge devices to collaboratively train a shared model while keeping raw user data on-device [1]. Despite this, the exchanged gradients or model up- dates can reveal statistical fingerprints that compromise user privacy [2]. Differential privacy (DP) [3] protects against such inferences by ...

work page internal anchor Pith review Pith/arXiv arXiv 2026
[2]

Federated Learning (FL) Overview of FL: At the start of each communication roundt, a global modelW t is provided by the server and a randomly sampled user setK t is constructed

PRELIMINARY 2.1. Federated Learning (FL) Overview of FL: At the start of each communication roundt, a global modelW t is provided by the server and a randomly sampled user setK t is constructed. Each userk∈ K t trains the model locally to obtainW t k and shares the model difference ∆t k =W t k −W t back to the server. The server aggregates the updatesW t+...

work page
[3]

Overview Our proposed method PINA consists of two stages: (1) Cluster Model Initialization and (2) Clustered Model Training

OUR METHOD: PINA 3.1. Overview Our proposed method PINA consists of two stages: (1) Cluster Model Initialization and (2) Clustered Model Training. In (1), we privately initialize cluster models from user updates; and in (2) we perform cluster identification and model training in a federated setting, privately updating global cluster models. The workflow o...

work page 2034
[4]

Fol- lowing [2, 31, 26], we simulate a cohort size of 10k with a smaller cohort size to achieve a more realistic signal-to-noise ratio which represents industry scale more closely

EXPERIMENTS Experimental settings:We use privacy budget ofϵ∈ {2,8}which are commonly used in existing works [8, 9] andδ= 1 |K|1.1 [2]. Fol- lowing [2, 31, 26], we simulate a cohort size of 10k with a smaller cohort size to achieve a more realistic signal-to-noise ratio which represents industry scale more closely. We use rotated CIFAR-10 (C= 2), rotated F...

work page
[5]

CONCLUSION In this work, we propose PINA, a privacy-preserving clustered FL framework that effectively mitigates data heterogeneity in DP-FL. By combining privatized client sketches for robust initialization and a normality-driven aggregation mechanism that accounts for imbal- anced contributions, PINA achieves superior performance on non- IID data withou...

work page
[6]

Communication-efficient learning of deep net- works from decentralized data,

Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hamp- son, et al., “Communication-efficient learning of deep net- works from decentralized data,” inAISTATS, 2017

work page 2017
[7]

Learning differentially private recurrent language models,

H Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang, “Learning differentially private recurrent language models,”ICLR, 2018

work page 2018
[8]

Calibrating noise to sensitivity in private data analy- sis,

Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith, “Calibrating noise to sensitivity in private data analy- sis,” inTheory of Cryptography, 2006

work page 2006
[9]

What can we learn privately?,

Shiva Prasad Kasiviswanathan, Homin K Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith, “What can we learn privately?,”SIAM Journal on Computing, 2011

work page 2011
[10]

Practical secure aggregation for privacy- preserving machine learning,

Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, et al., “Practical secure aggregation for privacy- preserving machine learning,” inCCS, 2017

work page 2017
[11]

Benchmarking secure sam- pling protocols for differential privacy,

Yucheng Fu and Tianhao Wang, “Benchmarking secure sam- pling protocols for differential privacy,” inCCS, 2024

work page 2024
[12]

Federated learning: Challenges, methods, and future directions,

Tian Li, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith, “Federated learning: Challenges, methods, and future directions,”IEEE signal processing magazine, 2020

work page 2020
[13]

Federated learning with differential privacy: Algo- rithms and performance analysis,

Kang Wei, Jun Li, Ming Ding, Chuan Ma, Howard H Yang, et al., “Federated learning with differential privacy: Algo- rithms and performance analysis,”IEEE TIFS, 2020

work page 2020
[14]

Differentially private federated learning on heterogeneous data,

Maxence Noble, Aur ´elien Bellet, and Aymeric Dieuleveut, “Differentially private federated learning on heterogeneous data,” inAISTATS, 2022

work page 2022
[15]

{PrivateFL}: Accurate, differentially private federated learning via personalized data transformation,

Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, and Yinzhi Cao, “{PrivateFL}: Accurate, differentially private federated learning via personalized data transformation,” inUSENIX Se- curity, 2023

work page 2023
[16]

An efficient framework for clustered federated learning,

Avishek Ghosh, Jichan Chung, Dong Yin, and Kannan Ram- chandran, “An efficient framework for clustered federated learning,”NeurIPS, 2020

work page 2020
[17]

Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints,

Felix Sattler, Klaus-Robert M ¨uller, and Wojciech Samek, “Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints,”IEEE TNNLS, 2020

work page 2020
[18]

The algorithmic foun- dations of differential privacy,

Cynthia Dwork, Aaron Roth, et al., “The algorithmic foun- dations of differential privacy,”Foundations and Trends® in Theoretical Computer Science, 2014

work page 2014
[19]

Deep learn- ing with differential privacy,

Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMa- han, Ilya Mironov, Kunal Talwar, and Li Zhang, “Deep learn- ing with differential privacy,” inCCS, 2016

work page 2016
[20]

R ´enyi differential privacy,

Ilya Mironov, “R ´enyi differential privacy,” in2017 IEEE 30th computer security foundations symposium (CSF), 2017

work page 2017
[21]

Hypothesis testing interpretations and Renyi differential privacy,

Borja Balle, Gilles Barthe, Marco Gaboardi, Justin Hsu, and Tetsuya Sato, “Hypothesis testing interpretations and Renyi differential privacy,” inInternational Conference on Artificial Intelligence and Statistics. PMLR, 2020

work page 2020
[22]

Privacy am- plification by subsampling: Tight analyses via couplings and divergences,

Borja Balle, Gilles Barthe, and Marco Gaboardi, “Privacy am- plification by subsampling: Tight analyses via couplings and divergences,” inNeurIPS, 2018

work page 2018
[23]

LoRA: Low-rank adaptation of large language models,

Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, et al., “LoRA: Low-rank adaptation of large language models,” inICLR, 2022

work page 2022
[24]

Efficient distribution similar- ity identification in clustered federated learning via principal angles between client data subspaces,

Saeed Vahidian, Mahdi Morafah, Weijia Wang, Vyacheslav Kungurtsev, Chen Chen, et al., “Efficient distribution similar- ity identification in clustered federated learning via principal angles between client data subspaces,” inAAAI, 2023

work page 2023
[25]

Clustered feder- ated learning with adaptive local differential privacy on hetero- geneous IoT data,

Zaobo He, Lintao Wang, and Zhipeng Cai, “Clustered feder- ated learning with adaptive local differential privacy on hetero- geneous IoT data,”IEEE IoT, 2024

work page 2024
[26]

Mitigating disparate impact of differential privacy in feder- ated learning through robust clustering,

Saber Malekmohammadi, Afaf Taik, and Golnoosh Farnadi, “Mitigating disparate impact of differential privacy in feder- ated learning through robust clustering,”arXiv, 2024

work page 2024
[27]

Scaling language model size in cross-device federated learning,

Jae Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, Ananda Suresh, et al., “Scaling language model size in cross-device federated learning,” inFL4NLP, 2022

work page 2022
[28]

Secure aggregation for clus- tered federated learning,

Hasin Us Sami and Bas ¸ak G¨uler, “Secure aggregation for clus- tered federated learning,” inISIT, 2023

work page 2023
[29]

Clus- terguard: Secure clustered aggregation for federated learning with robustness,

Yulin Zhao, Zhiguo Wan, Zhangshuang Guan, et al., “Clus- terguard: Secure clustered aggregation for federated learning with robustness,”Cryptology ePrint Archive, 2024

work page 2024
[30]

Federated learning from pre-trained models: A contrastive learning approach,

Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, and Jing Jiang, “Federated learning from pre-trained models: A contrastive learning approach,”NeurIPS, 2022

work page 2022
[31]

Dp-dylora: Fine-tuning transformer-based models on-device under differentially private federated learning using dynamic low-rank adaptation

Jie Xu, Karthikeyan Saravanan, Rogier van Dalen, Haaris Mehmood, David Tuckey, and Mete Ozay, “DP-DyLoRA: Fine-tuning transformer-based models on-device under dif- ferentially private federated learning using dynamic low-rank adaptation,”arXiv preprint arXiv:2405.06368, 2024

work page arXiv 2024
[32]

Rethinking architecture design for tackling data heterogeneity in federated learning,

Liangqiong Qu, Yuyin Zhou, Paul Pu Liang, Yingda Xia, Feifei Wang, et al., “Rethinking architecture design for tackling data heterogeneity in federated learning,” inCVPR, 2022

work page 2022
[33]

A hybrid approach to privacy-preserving federated learning,

Stacey Truex, Nathalie Baracaldo, Ali Anwar, Thomas Steinke, Heiko Ludwig, Rui Zhang, and Yi Zhou, “A hybrid approach to privacy-preserving federated learning,” inAISec, 2019

work page 2019
[34]

A comprehensive com- parison of multiparty secure additions with differential pri- vacy,

Slawomir Goryczka and Li Xiong, “A comprehensive com- parison of multiparty secure additions with differential pri- vacy,”IEEE Transactions on Dependable and Secure Com- puting, 2017

work page 2017
[35]

An analysis of variance test for normality (complete samples),

Samuel Sanford Shapiro and Martin B Wilk, “An analysis of variance test for normality (complete samples),”Biometrika, 1965

work page 1965
[36]

FLAIR: Federated learning annotated image repository,

Congzheng Song, Filip Granqvist, and Kunal Talwar, “FLAIR: Federated learning annotated image repository,”NeurIPS, 2022

work page 2022
[37]

Federated optimiza- tion in heterogeneous networks,

Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith, “Federated optimiza- tion in heterogeneous networks,”MLSys, 2020

work page 2020
[38]

Tackling the objective inconsistency problem in heterogeneous federated optimization,

Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, and H Vin- cent Poor, “Tackling the objective inconsistency problem in heterogeneous federated optimization,”NeurIPS, 2020

work page 2020
[39]

Scaffold: Stochastic controlled aver- aging for federated learning,

Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank Reddi, et al., “Scaffold: Stochastic controlled aver- aging for federated learning,” inICML, 2020

work page 2020
[40]

Breaking the centralized barrier for cross-device federated learning,

Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, et al., “Breaking the centralized barrier for cross-device federated learning,”NeurIPS, 2021

work page 2021

[1] [1]

Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation

INTRODUCTION Federated learning (FL) enables a distributed group of edge devices to collaboratively train a shared model while keeping raw user data on-device [1]. Despite this, the exchanged gradients or model up- dates can reveal statistical fingerprints that compromise user privacy [2]. Differential privacy (DP) [3] protects against such inferences by ...

work page internal anchor Pith review Pith/arXiv arXiv 2026

[2] [2]

Federated Learning (FL) Overview of FL: At the start of each communication roundt, a global modelW t is provided by the server and a randomly sampled user setK t is constructed

PRELIMINARY 2.1. Federated Learning (FL) Overview of FL: At the start of each communication roundt, a global modelW t is provided by the server and a randomly sampled user setK t is constructed. Each userk∈ K t trains the model locally to obtainW t k and shares the model difference ∆t k =W t k −W t back to the server. The server aggregates the updatesW t+...

work page

[3] [3]

Overview Our proposed method PINA consists of two stages: (1) Cluster Model Initialization and (2) Clustered Model Training

OUR METHOD: PINA 3.1. Overview Our proposed method PINA consists of two stages: (1) Cluster Model Initialization and (2) Clustered Model Training. In (1), we privately initialize cluster models from user updates; and in (2) we perform cluster identification and model training in a federated setting, privately updating global cluster models. The workflow o...

work page 2034

[4] [4]

Fol- lowing [2, 31, 26], we simulate a cohort size of 10k with a smaller cohort size to achieve a more realistic signal-to-noise ratio which represents industry scale more closely

EXPERIMENTS Experimental settings:We use privacy budget ofϵ∈ {2,8}which are commonly used in existing works [8, 9] andδ= 1 |K|1.1 [2]. Fol- lowing [2, 31, 26], we simulate a cohort size of 10k with a smaller cohort size to achieve a more realistic signal-to-noise ratio which represents industry scale more closely. We use rotated CIFAR-10 (C= 2), rotated F...

work page

[5] [5]

CONCLUSION In this work, we propose PINA, a privacy-preserving clustered FL framework that effectively mitigates data heterogeneity in DP-FL. By combining privatized client sketches for robust initialization and a normality-driven aggregation mechanism that accounts for imbal- anced contributions, PINA achieves superior performance on non- IID data withou...

work page

[6] [6]

Communication-efficient learning of deep net- works from decentralized data,

Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hamp- son, et al., “Communication-efficient learning of deep net- works from decentralized data,” inAISTATS, 2017

work page 2017

[7] [7]

Learning differentially private recurrent language models,

H Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang, “Learning differentially private recurrent language models,”ICLR, 2018

work page 2018

[8] [8]

Calibrating noise to sensitivity in private data analy- sis,

Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith, “Calibrating noise to sensitivity in private data analy- sis,” inTheory of Cryptography, 2006

work page 2006

[9] [9]

What can we learn privately?,

Shiva Prasad Kasiviswanathan, Homin K Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith, “What can we learn privately?,”SIAM Journal on Computing, 2011

work page 2011

[10] [10]

Practical secure aggregation for privacy- preserving machine learning,

Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, et al., “Practical secure aggregation for privacy- preserving machine learning,” inCCS, 2017

work page 2017

[11] [11]

Benchmarking secure sam- pling protocols for differential privacy,

Yucheng Fu and Tianhao Wang, “Benchmarking secure sam- pling protocols for differential privacy,” inCCS, 2024

work page 2024

[12] [12]

Federated learning: Challenges, methods, and future directions,

Tian Li, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith, “Federated learning: Challenges, methods, and future directions,”IEEE signal processing magazine, 2020

work page 2020

[13] [13]

Federated learning with differential privacy: Algo- rithms and performance analysis,

Kang Wei, Jun Li, Ming Ding, Chuan Ma, Howard H Yang, et al., “Federated learning with differential privacy: Algo- rithms and performance analysis,”IEEE TIFS, 2020

work page 2020

[14] [14]

Differentially private federated learning on heterogeneous data,

Maxence Noble, Aur ´elien Bellet, and Aymeric Dieuleveut, “Differentially private federated learning on heterogeneous data,” inAISTATS, 2022

work page 2022

[15] [15]

{PrivateFL}: Accurate, differentially private federated learning via personalized data transformation,

Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, and Yinzhi Cao, “{PrivateFL}: Accurate, differentially private federated learning via personalized data transformation,” inUSENIX Se- curity, 2023

work page 2023

[16] [16]

An efficient framework for clustered federated learning,

Avishek Ghosh, Jichan Chung, Dong Yin, and Kannan Ram- chandran, “An efficient framework for clustered federated learning,”NeurIPS, 2020

work page 2020

[17] [17]

Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints,

Felix Sattler, Klaus-Robert M ¨uller, and Wojciech Samek, “Clustered federated learning: Model-agnostic distributed multitask optimization under privacy constraints,”IEEE TNNLS, 2020

work page 2020

[18] [18]

The algorithmic foun- dations of differential privacy,

Cynthia Dwork, Aaron Roth, et al., “The algorithmic foun- dations of differential privacy,”Foundations and Trends® in Theoretical Computer Science, 2014

work page 2014

[19] [19]

Deep learn- ing with differential privacy,

Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMa- han, Ilya Mironov, Kunal Talwar, and Li Zhang, “Deep learn- ing with differential privacy,” inCCS, 2016

work page 2016

[20] [20]

R ´enyi differential privacy,

Ilya Mironov, “R ´enyi differential privacy,” in2017 IEEE 30th computer security foundations symposium (CSF), 2017

work page 2017

[21] [21]

Hypothesis testing interpretations and Renyi differential privacy,

Borja Balle, Gilles Barthe, Marco Gaboardi, Justin Hsu, and Tetsuya Sato, “Hypothesis testing interpretations and Renyi differential privacy,” inInternational Conference on Artificial Intelligence and Statistics. PMLR, 2020

work page 2020

[22] [22]

Privacy am- plification by subsampling: Tight analyses via couplings and divergences,

Borja Balle, Gilles Barthe, and Marco Gaboardi, “Privacy am- plification by subsampling: Tight analyses via couplings and divergences,” inNeurIPS, 2018

work page 2018

[23] [23]

LoRA: Low-rank adaptation of large language models,

Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, et al., “LoRA: Low-rank adaptation of large language models,” inICLR, 2022

work page 2022

[24] [24]

Efficient distribution similar- ity identification in clustered federated learning via principal angles between client data subspaces,

Saeed Vahidian, Mahdi Morafah, Weijia Wang, Vyacheslav Kungurtsev, Chen Chen, et al., “Efficient distribution similar- ity identification in clustered federated learning via principal angles between client data subspaces,” inAAAI, 2023

work page 2023

[25] [25]

Clustered feder- ated learning with adaptive local differential privacy on hetero- geneous IoT data,

Zaobo He, Lintao Wang, and Zhipeng Cai, “Clustered feder- ated learning with adaptive local differential privacy on hetero- geneous IoT data,”IEEE IoT, 2024

work page 2024

[26] [26]

Mitigating disparate impact of differential privacy in feder- ated learning through robust clustering,

Saber Malekmohammadi, Afaf Taik, and Golnoosh Farnadi, “Mitigating disparate impact of differential privacy in feder- ated learning through robust clustering,”arXiv, 2024

work page 2024

[27] [27]

Scaling language model size in cross-device federated learning,

Jae Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, Ananda Suresh, et al., “Scaling language model size in cross-device federated learning,” inFL4NLP, 2022

work page 2022

[28] [28]

Secure aggregation for clus- tered federated learning,

Hasin Us Sami and Bas ¸ak G¨uler, “Secure aggregation for clus- tered federated learning,” inISIT, 2023

work page 2023

[29] [29]

Clus- terguard: Secure clustered aggregation for federated learning with robustness,

Yulin Zhao, Zhiguo Wan, Zhangshuang Guan, et al., “Clus- terguard: Secure clustered aggregation for federated learning with robustness,”Cryptology ePrint Archive, 2024

work page 2024

[30] [30]

Federated learning from pre-trained models: A contrastive learning approach,

Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, and Jing Jiang, “Federated learning from pre-trained models: A contrastive learning approach,”NeurIPS, 2022

work page 2022

[31] [31]

Dp-dylora: Fine-tuning transformer-based models on-device under differentially private federated learning using dynamic low-rank adaptation

Jie Xu, Karthikeyan Saravanan, Rogier van Dalen, Haaris Mehmood, David Tuckey, and Mete Ozay, “DP-DyLoRA: Fine-tuning transformer-based models on-device under dif- ferentially private federated learning using dynamic low-rank adaptation,”arXiv preprint arXiv:2405.06368, 2024

work page arXiv 2024

[32] [32]

Rethinking architecture design for tackling data heterogeneity in federated learning,

Liangqiong Qu, Yuyin Zhou, Paul Pu Liang, Yingda Xia, Feifei Wang, et al., “Rethinking architecture design for tackling data heterogeneity in federated learning,” inCVPR, 2022

work page 2022

[33] [33]

A hybrid approach to privacy-preserving federated learning,

Stacey Truex, Nathalie Baracaldo, Ali Anwar, Thomas Steinke, Heiko Ludwig, Rui Zhang, and Yi Zhou, “A hybrid approach to privacy-preserving federated learning,” inAISec, 2019

work page 2019

[34] [34]

A comprehensive com- parison of multiparty secure additions with differential pri- vacy,

Slawomir Goryczka and Li Xiong, “A comprehensive com- parison of multiparty secure additions with differential pri- vacy,”IEEE Transactions on Dependable and Secure Com- puting, 2017

work page 2017

[35] [35]

An analysis of variance test for normality (complete samples),

Samuel Sanford Shapiro and Martin B Wilk, “An analysis of variance test for normality (complete samples),”Biometrika, 1965

work page 1965

[36] [36]

FLAIR: Federated learning annotated image repository,

Congzheng Song, Filip Granqvist, and Kunal Talwar, “FLAIR: Federated learning annotated image repository,”NeurIPS, 2022

work page 2022

[37] [37]

Federated optimiza- tion in heterogeneous networks,

Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith, “Federated optimiza- tion in heterogeneous networks,”MLSys, 2020

work page 2020

[38] [38]

Tackling the objective inconsistency problem in heterogeneous federated optimization,

Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, and H Vin- cent Poor, “Tackling the objective inconsistency problem in heterogeneous federated optimization,”NeurIPS, 2020

work page 2020

[39] [39]

Scaffold: Stochastic controlled aver- aging for federated learning,

Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank Reddi, et al., “Scaffold: Stochastic controlled aver- aging for federated learning,” inICML, 2020

work page 2020

[40] [40]

Breaking the centralized barrier for cross-device federated learning,

Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, et al., “Breaking the centralized barrier for cross-device federated learning,”NeurIPS, 2021

work page 2021