pith. machine review for the scientific record.

arxiv: 2605.08443 · v1 · submitted 2026-05-08 · 💻 cs.CR

Recognition: no theorem link

Improving Parameter-Efficient Federated Learning with Differentially Private Refactorization

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 01:08 UTC · model grok-4.3

classification 💻 cs.CR
keywords federated learning · differential privacy · LoRA · low-rank adaptation · parameter-efficient fine-tuning · subspace projection · cross-silo FL · privacy-preserving machine learning

The pith

FedPower improves differentially private low-rank federated learning by reconstructing full-rank updates before noisy projection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that a new server-side refactorization can make differentially private low-rank federated learning accurate even under tight privacy budgets. Prior methods that add noise directly to the low-rank factors often lose the signal because the noise dominates in those small subspaces. FedPower instead rebuilds each client's full-rank update from its low-rank factors, clips it to bound sensitivity, aggregates the clipped updates, and then uses PowerDP to factorize the aggregate again, adding noise just before the orthonormalization step. This preserves the orthogonality that keeps the updates useful. Readers would care because it shows a practical path to private training across data silos without the usual accuracy collapse.

Core claim

FedPower reshapes server-side aggregation in cross-silo federated learning by explicitly reconstructing and clipping full-rank client updates to bound sensitivity, then projecting the exact aggregated update back into a secure low-rank space using PowerDP. PowerDP is a differentially private low-rank factorization based on simultaneous subspace iteration that injects calibrated DP noise just before the final orthonormalization step. Rigorous theoretical analyses establish sensitivity bounds for the subspace projections, proving that the framework achieves both sample-level and client-level differential privacy. Experiments on language understanding tasks demonstrate robustness under tight privacy budgets.
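The reconstruct-clip-aggregate step described above is simple to sketch. This is an editorial reading of the pith, not the authors' code; the function names and the NumPy formulation are invented for illustration:

```python
import numpy as np

def clip_to_norm(delta_w, c):
    """Scale a full-rank update so its Frobenius norm is at most c."""
    norm = np.linalg.norm(delta_w)
    if norm > c:
        delta_w = delta_w * (c / norm)
    return delta_w

def aggregate_full_rank(client_factors, c):
    """Reconstruct each client's full-rank update as delta_W = B @ A,
    clip it to bound per-client sensitivity, then average the clipped
    updates on the server."""
    updates = [clip_to_norm(b @ a, c) for (b, a) in client_factors]
    return sum(updates) / len(updates)
```

Because each clipped update has norm at most C, the averaged aggregate does too, which is the sensitivity bound PowerDP's noise is later calibrated against.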

What carries the argument

PowerDP, the differentially private low-rank factorization mechanism based on simultaneous subspace iteration that injects calibrated DP noise prior to the final orthonormalization step to preserve matrix orthogonality
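In the spirit of the noisy power method, the mechanism's key move can be sketched as follows. This is a schematic reconstruction from the pith's description, with invented names; the noise scale here is an unscaled parameter rather than a calibrated sensitivity bound:

```python
import numpy as np

def power_dp_factorize(m, rank, iters, sigma, rng):
    """Noisy simultaneous subspace iteration: Gaussian noise is injected
    into each iterate *before* QR orthonormalization, so the returned
    bases stay exactly orthonormal (QR is pure post-processing)."""
    rows, cols = m.shape
    q = np.linalg.qr(rng.normal(size=(cols, rank)))[0]  # random orthonormal start
    for _ in range(iters):
        # noise first, then orthonormalize: the defining order of PowerDP
        u = np.linalg.qr(m @ q + sigma * rng.normal(size=(rows, rank)))[0]
        q = np.linalg.qr(m.T @ u + sigma * rng.normal(size=(cols, rank)))[0]
    return u, q  # left and right rank-r bases
```

With sigma = 0 this is plain subspace iteration and recovers an exactly low-rank matrix; with sigma > 0 the perturbation enters before QR, so orthogonality is preserved by construction.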

If this is right

  • FedPower achieves both sample-level and client-level differential privacy.
  • The framework is robust against tight privacy budgets while adding negligible computational overhead.
  • PowerDP improves the accuracy-privacy tradeoff compared to other noise injection schemes.
  • The approach is validated on language understanding tasks in cross-silo settings.
  • Evaluation against membership inference attacks confirms the privacy guarantees.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar reconstruction of full-rank updates before projection could help other low-rank methods in privacy-preserving distributed training.
  • Adding noise before orthonormalization may be useful in other DP mechanisms that rely on subspace or matrix decompositions.
  • The sensitivity bounds derived for subspace projections could apply to related projection-based techniques in private machine learning.
  • Extending the framework beyond cross-silo to cross-device federated learning would be a natural next test of its scalability.

Load-bearing premise

Reconstructing full-rank client updates from low-rank factors, clipping them, and then projecting the aggregate back via PowerDP with noise before orthonormalization will not introduce errors that overwhelm the signal in restricted low-rank subspaces.

What would settle it

If experiments under a tight privacy budget such as ε = 1 show model accuracy on language tasks falling well below the non-private LoRA baseline, or if membership inference attacks succeed at rates clearly above chance, the claim would be falsified.

Figures

Figures reproduced from arXiv: 2605.08443 by Ana Milanova, Linh Tran, Stacy Patterson.

Figure 1
Figure 1: An illustration of the FL algorithms with LoRA in a global DP setting: existing FedLoRA and our proposed method [PITH_FULL_IMAGE:figures/full_fig_p002_1.png]
Figure 2
Figure 2: Accuracy as a function of global rounds (a and b) and transferred bits (c and d) on QQP [PITH_FULL_IMAGE:figures/full_fig_p009_2.png]
Figure 3
Figure 3: Success rates of three different membership inference attacks [PITH_FULL_IMAGE:figures/full_fig_p011_3.png]
Figure 4
Figure 4: True positive rate vs. false positive rate of each membership inference attack across all privacy budgets [PITH_FULL_IMAGE:figures/full_fig_p012_4.png]
read the original abstract

Federated Learning (FL) with parameter-efficient fine-tuning, such as Low-Rank Adaptation (LoRA), enables scalable model training on distributed data. However, when combined with Differential Privacy (DP), LoRA often introduces errors during global aggregation and amplifies the negative effect of DP noise. Existing cross-silo FL approaches mitigate the aggregation error by freezing one LoRA module and applying output perturbation. However, in restricted low-rank subspaces, this additive noise frequently overwhelms the signals of the weight matrices, leading to suboptimal accuracy. To address this vulnerability, we propose FedPower, a differentially private cross-silo FL framework that reshapes server-side aggregation. Instead of perturbing mismatched low-rank factors, FedPower explicitly reconstructs and clips full-rank client updates to bound the sensitivity. The server then projects the exact aggregated update back into a secure low-rank space using PowerDP, a novel differentially private low-rank factorization mechanism. Based on simultaneous subspace iteration, PowerDP injects calibrated DP noise prior to the final orthonormalization step, effectively mitigating the negative effect of DP noise by preserving matrix orthogonality. We provide rigorous theoretical analyses establishing sensitivity bounds for subspace projections, proving that FedPower achieves both sample-level and client-level DP. Extensive experiments on various language understanding tasks in cross-silo FL settings show that FedPower is robust against tight privacy budgets while adding negligible computational overheads. Additional empirical study on different DP noise injection schemes validates the effectiveness of PowerDP in improving the tradeoff between accuracy and privacy. Evaluation on three different membership inference attacks validates the robustness and privacy-preserving capability of the proposed framework.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes FedPower, a cross-silo federated learning framework for differentially private LoRA fine-tuning. Client updates are reconstructed to full-rank form (ΔW = BA), clipped to bound sensitivity, aggregated at the server, and then projected back to low-rank via the new PowerDP mechanism, which applies simultaneous subspace iteration and injects calibrated DP noise before the final orthonormalization step. The authors claim rigorous sensitivity bounds establishing both sample-level and client-level DP, along with empirical results on language understanding tasks showing improved robustness to tight privacy budgets and negligible overhead compared to prior output-perturbation baselines.

Significance. If the sensitivity bounds hold and the utility gains are reproducible, the work would meaningfully advance DP parameter-efficient FL by mitigating noise dominance in restricted low-rank subspaces. The provision of theoretical analyses for the combined reconstruction-plus-projection pipeline and extensive experiments across multiple tasks and DP schemes are strengths that support the central claim.

major comments (2)
  1. [§4] §4 (Theoretical Analysis): The claimed sensitivity bounds for subspace projections in PowerDP must explicitly compose the sensitivity introduced by full-rank reconstruction (ΔW = BA) and clipping with the subsequent noisy iteration and orthonormalization; the current separation of these steps leaves open whether the effective sensitivity for client-level DP remains bounded as stated, particularly for the tight ε values used in the experiments.
  2. [§5.2] §5.2 (Experiments on language tasks): The reported accuracy improvements of FedPower over output-perturbation baselines under ε ≤ 1 rely on the assumption that post-projection error does not overwhelm the low-rank signal; without an ablation isolating the reconstruction-plus-PowerDP error term or reporting per-run variance, it is unclear whether the gains are robust or could be explained by the specific LoRA rank and clipping choices.
minor comments (2)
  1. [Figure 2] Figure 2: The diagram of the PowerDP iteration would benefit from explicit annotation of where the DP noise is added relative to the orthonormalization step.
  2. The notation distinguishing sample-level versus client-level sensitivity bounds could be made more consistent between the theorem statements and the experimental privacy budget settings.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of the theoretical composition and experimental robustness, which we address point by point below. We have revised the manuscript to incorporate clarifications and additional analyses where appropriate.

read point-by-point responses
  1. Referee: §4 (Theoretical Analysis): The claimed sensitivity bounds for subspace projections in PowerDP must explicitly compose the sensitivity introduced by full-rank reconstruction (ΔW = BA) and clipping with the subsequent noisy iteration and orthonormalization; the current separation of these steps leaves open whether the effective sensitivity for client-level DP remains bounded as stated, particularly for the tight ε values used in the experiments.

    Authors: We appreciate the referee's point on explicit composition. Section 4.1 establishes that full-rank reconstruction followed by clipping bounds the L2 sensitivity of each client's update to the clipping threshold C. PowerDP then receives this bounded aggregate and injects noise calibrated to the resulting sensitivity before orthonormalization. Orthonormalization is post-processing and preserves the DP guarantee. To address the concern directly, we have added a new paragraph in §4.2 that walks through the sequential composition: (i) clipping bounds per-client sensitivity, (ii) averaging scales sensitivity by 1/K, and (iii) PowerDP's Gaussian noise is scaled to this composed bound. The revised analysis confirms that client-level (ε,δ)-DP holds for the reported ε ≤ 1 values, with the proof details moved to the appendix for clarity. revision: yes
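The sequential composition walked through in the response above maps onto the classical Gaussian-mechanism calibration. The sketch below is that textbook formula, not necessarily the paper's exact noise schedule; the function name is invented for the example:

```python
import math

def gaussian_noise_scale(clip_c, num_clients, epsilon, delta):
    """Standard (epsilon, delta) Gaussian-mechanism noise std applied to
    the composed bound: clipping caps each client's L2 contribution at
    clip_c, averaging over K clients scales the query's sensitivity to
    clip_c / K, and the noise std follows the classical calibration."""
    sensitivity = clip_c / num_clients
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
```

The point of the rebuttal is that this noise is added once to the bounded aggregate inside PowerDP, and the subsequent orthonormalization is post-processing, so it spends no additional privacy budget.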

  2. Referee: §5.2 (Experiments on language tasks): The reported accuracy improvements of FedPower over output-perturbation baselines under ε ≤ 1 rely on the assumption that post-projection error does not overwhelm the low-rank signal; without an ablation isolating the reconstruction-plus-PowerDP error term or reporting per-run variance, it is unclear whether the gains are robust or could be explained by the specific LoRA rank and clipping choices.

    Authors: We agree that isolating the projection error and reporting variance would strengthen the empirical claims. In the revised §5.2 we have added an ablation that compares FedPower against a non-private reconstruction baseline and a direct low-rank perturbation variant, reporting the Frobenius norm of the post-projection residual. The results show that PowerDP's error remains below the noise level of output-perturbation baselines. We now also report mean accuracy ± standard deviation over five independent runs for all tasks and privacy budgets. Additional tables vary LoRA rank (r=4,8,16) and clipping norm (C=1,5), confirming that the accuracy gains persist across these choices and are not explained by specific hyperparameter settings. revision: yes
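The residual metric proposed in the response above is straightforward to compute. A minimal sketch, with illustrative names, assuming q holds an orthonormal basis of the retained rank-r row subspace:

```python
import numpy as np

def projection_residual(delta_w, q):
    """Frobenius norm of the part of the aggregated update that is lost
    when it is projected onto the rank-r subspace spanned by the
    orthonormal columns of q."""
    low_rank = (delta_w @ q) @ q.T
    return np.linalg.norm(delta_w - low_rank)
```

If the retained subspace captures the update exactly, the residual is zero; a residual comparable to the update's own norm would signal exactly the failure mode the referee worries about.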

Circularity Check

0 steps flagged

No circularity: the sensitivity bounds and PowerDP are derived from standard DP definitions and subspace iteration, with no reduction of claims to their own inputs and no load-bearing self-citations

full rationale

The paper's central claims rest on explicit reconstruction of full-rank updates from LoRA factors, clipping for sensitivity bounding, aggregation, and then PowerDP projection via simultaneous subspace iteration with noise injected before orthonormalization. Theoretical sensitivity bounds for the subspace projections are presented as derived analyses proving sample- and client-level DP; these follow from the mechanism definition and standard DP composition rather than any fitted quantity or self-referential loop. No equations reduce the claimed DP guarantees or utility improvements to tautologies by construction, and the abstract and mechanism description invoke no load-bearing self-citations or uniqueness theorems from prior author work. The framework is self-contained against external DP and LoRA benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 2 invented entities

The central claims rest on standard differential privacy definitions, the assumption that LoRA low-rank updates can be accurately reconstructed to full rank without loss, and the effectiveness of the new PowerDP timing for noise injection; no machine-checked proofs or external benchmarks are referenced.

free parameters (2)
  • LoRA rank r
    Hyperparameter controlling the dimension of the low-rank adaptation matrices; chosen per task and affects reconstruction accuracy.
  • DP privacy budget (epsilon, delta)
    User-specified parameters that determine the noise scale in PowerDP; calibrated to sensitivity bounds.
axioms (2)
  • standard math Standard (epsilon, delta)-differential privacy definitions apply to both sample-level and client-level settings
    Invoked to prove the privacy guarantees of FedPower and PowerDP.
  • domain assumption Simultaneous subspace iteration can be adapted to inject noise while preserving orthogonality after projection
    Core assumption underlying the design of PowerDP to mitigate noise effects.
invented entities (2)
  • PowerDP mechanism no independent evidence
    purpose: Differentially private low-rank factorization that adds noise before the final orthonormalization step
    Newly proposed technique to address noise overwhelming low-rank signals.
  • FedPower framework no independent evidence
    purpose: Server-side aggregation that reconstructs and clips full-rank updates before low-rank projection
    New overall architecture combining reconstruction, clipping, and PowerDP.

pith-pipeline@v0.9.0 · 5588 in / 1800 out tokens · 69633 ms · 2026-05-12T01:08:37.040488+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages · 2 internal anchors

  1. [1]

    Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep learning with differential privacy. In ACM Conf. Comput. Commun. Secur. (CCS). ACM, 308–318

  2. [2]

    Maria-Florina Balcan, Simon Shaolei Du, Yining Wang, and Adams Wei Yu. 2016. An improved gap-dependency analysis of the noisy power method. In Conf. Learn. Theory (COLT). PMLR, 284–309

  3. [3]

    Jeremiah Blocki, Avrim Blum, Anupam Datta, and Or Sheffet. 2012. The Johnson-Lindenstrauss transform itself preserves differential privacy. In IEEE Annu. Symp. Found. Comput. Sci. (FOCS). IEEE, 410–419

  4. [4]

    Cynthia Dwork and Aaron Roth. 2014. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9, 3–4 (Aug. 2014), 211–487

  5. [5]

    Jonas Geiping, Hartmut Bauermeister, Hannah Dröge, and Michael Moeller. 2020. Inverting gradients – How easy is it to break privacy in federated learning?. In Adv. Neural Inf. Process. Syst. (NeurIPS). Curran Associates, Inc., 16937–16947

  6. [6]

    Moritz Hardt and Eric Price. 2014. The noisy power method: A meta algorithm with applications. In Adv. Neural Inf. Process. Syst. (NeurIPS), Vol. 27. Curran Associates, Inc.

  7. [7]

    Moritz Hardt and Aaron Roth. 2012. Beating randomized response on incoherent matrices. In ACM Symp. Theory Comput. (STOC). ACM, 1255–1268

  8. [8]

    Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2021. LoRA: Low-rank adaptation of large language models. arXiv:2106.09685

  9. [9]

    Peter Kairouz, Monica Ribero Diaz, Keith Rush, and Abhradeep Thakurta. 2021. (Nearly) dimension independent private ERM with AdaGrad rates via publicly estimated subspaces. In Conf. Learn. Theory (COLT). PMLR, 2717–2746

  10. [10]

    Tianqu Kang, Zixin Wang, Hengtao He, Jun Zhang, Shenghui Song, and Khaled B Letaief. 2025. Federated low-rank adaptation with differential privacy for wireless networks. In IEEE Int. Mediterr. Conf. Commun. Netw. (MeditCom). IEEE, 1–6

  11. [11]

    Michael Kapralov and Kunal Talwar. 2013. On differentially private low rank approximation. In ACM-SIAM Symp. Discrete Algorithms (SODA). SIAM, 1395–1414

  12. [12]

    Xuechen Li, Florian Tramer, Percy Liang, and Tatsunori Hashimoto. 2022. Large language models can be strong differentially private learners. In Int. Conf. Learn. Represent. (ICLR). OpenReview.net

  13. [13]

    Xiao-Yang Liu, Rongyi Zhu, Daochen Zha, Jiechao Gao, Shan Zhong, Matt White, and Meikang Qiu. 2025. Differentially private low-rank adaptation of large language models

  14. [14]

    Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv:1907.11692

  15. [15]

    Aravindh Mahendran and Andrea Vedaldi. 2015. Understanding deep image representations by inverting them. In IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR). IEEE, 5188–5196

  16. [16]

    Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. 2017. Communication-efficient learning of deep networks from decentralized data. In Int. Conf. Artif. Intell. Statist. (AISTATS). PMLR, 1273–1282

  17. [17]

    H Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang. 2018. Learning differentially private recurrent language models. In Int. Conf. Learn. Represent. (ICLR). OpenReview.net

  18. [18]

    Lakshay Sharma, Laura Graesser, Nikita Nangia, and Utku Evci. 2019. Natural language understanding with the Quora question pairs dataset. arXiv:1907.01041

  19. [19]

    Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. 2017. Membership inference attacks against machine learning models. In IEEE Symp. Secur. Privacy (S&P). IEEE, 3–18

  20. [20]

    Raghav Singhal, Kaustubh Ponkshe, and Praneeth Vepakomma. 2025. FedEx-LoRA: Exact aggregation for federated and efficient fine-tuning of large language models. In Annu. Meet. Assoc. Comput. Linguist. (ACL). Association for Computational Linguistics, 1316–1336

  21. [21]

    Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D Manning, Andrew Y Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Conf. Empir. Methods Nat. Lang. Process. (EMNLP). Association for Computational Linguistics, 1631–1642

  22. [22]

    William J Stewart and Alan Jennings. 1981. A simultaneous iteration algorithm for real matrices. ACM Trans. Math. Softw. 7, 2 (Jun. 1981), 184–198

  23. [23]

    Youbang Sun, Zitao Li, Yaliang Li, and Bolin Ding. 2024. Improving LoRA in privacy-preserving federated learning. In Int. Conf. Learn. Represent. (ICLR). OpenReview.net

  24. [24]

    Linh Tran, Wei Sun, Stacy Patterson, and Ana Milanova. 2025. Privacy-preserving personalized federated prompt learning for multimodal large language models. In Int. Conf. Learn. Represent. (ICLR). OpenReview.net

  25. [25]

    Thijs Vogels, Sai Praneeth Karimireddy, and Martin Jaggi. 2019. PowerSGD: Practical low-rank gradient compression for distributed optimization. In Adv. Neural Inf. Process. Syst. (NeurIPS). Curran Associates, Inc., 14259–14268

  26. [26]

    Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel Bowman. 2018. GLUE: A multi-task benchmark and analysis platform for natural language understanding. In EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. 353–355

  27. [27]

    Ziyao Wang, Zheyu Shen, Yexiao He, Guoheng Sun, Hongyi Wang, Lingjuan Lyu, and Ang Li. 2024. FLoRA: Federated fine-tuning large language models with heterogeneous low-rank adaptations. In Adv. Neural Inf. Process. Syst. (NeurIPS), Vol. 37. Curran Associates, Inc., 22513–22533

  28. [28]

    Lauren Watson, Chuan Guo, Graham Cormode, and Alexandre Sablayrolles. 2021. On the importance of difficulty calibration in membership inference attacks. In Int. Conf. Learn. Represent. (ICLR). OpenReview.net

  29. [29]

    Adina Williams, Nikita Nangia, and Samuel Bowman. 2018. A broad-coverage challenge corpus for sentence understanding through inference. In Conf. N. Am. Chapter Assoc. Comput. Linguist.: Hum. Lang. Technol. (NAACL-HLT). Association for Computational Linguistics, 1112–1122

  30. [30]

    Jie Xu, Karthikeyan Saravanan, Rogier van Dalen, Haaris Mehmood, David Tuckey, and Mete Ozay. 2024. DP-DyLoRA: Fine-tuning transformer-based models on-device under differentially private federated learning using dynamic low-rank adaptation. arXiv:2405.06368

  31. [31]

    Qianren Yang, Yong Li, and Tao Zhao. 2025. FL-DPLoRA: An integrated and efficient privacy-preserving training framework for large language models in privacy-critical applications. In Int. Conf. Softw. Qual., Reliab., Secur. Companion (QRS-C). IEEE, 53–62

  32. [32]

    Samuel Yeom, Irene Giacomelli, Matt Fredrikson, and Somesh Jha. 2018. Privacy risk in machine learning: Analyzing the connection to overfitting. In IEEE Comput. Secur. Found. Symp. (CSF). IEEE, 268–282

  33. [33]

    Da Yu, Huishuai Zhang, Wei Chen, and Tie-Yan Liu. 2021. Do not let privacy overbill utility: Gradient embedding perturbation for private learning. In Int. Conf. Learn. Represent. (ICLR). OpenReview.net

  34. [34]

    Da Yu, Huishuai Zhang, Wei Chen, Jian Yin, and Tie-Yan Liu. 2021. Large scale private learning via low-rank reparametrization. In Int. Conf. Mach. Learn. (ICML). PMLR, 12208–12218

  35. [35]

    Jianyi Zhang, Saeed Vahidian, Martin Kuo, Chunyuan Li, Ruiyi Zhang, Tong Yu, Guoyin Wang, and Yiran Chen. 2024. Towards building the FederatedGPT: Federated instruction tuning. In IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP). IEEE, 6915–6919

  36. [36]

    Yingxue Zhou, Steven Wu, and Arindam Banerjee. 2021. Bypassing the ambient dimension: Private SGD with gradient subspace identification. In Int. Conf. Learn. Represent. (ICLR). OpenReview.net