arxiv: 2604.26809 · v1 · submitted 2026-04-29 · 💻 cs.LG

Recognition: unknown

Asynchronous Federated Unlearning with Invariance Calibration for Medical Imaging

Zhaoyuan Cai , Xinglin Zhang

Authors on Pith no claims yet

Pith reviewed 2026-05-07 10:43 UTC · model grok-4.3

classification 💻 cs.LG

keywords federated unlearningasynchronous federated learningmedical imaginginvariance calibrationdata privacyright to be forgottenmodel fidelityfederated learning

0 comments

The pith

AFU-IC lets clients erase data contributions asynchronously in federated medical imaging without halting global training or losing model performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Federated learning systems for medical images must sometimes remove a client's data to meet privacy regulations, but existing methods stop the whole federation until the erasure finishes. This paper introduces AFU-IC, which separates the unlearning task so the requesting client works on its own while the rest of the network continues training. A server-side invariance calibration step then blocks the model from re-learning patterns from the removed data in later rounds. Tests on three medical imaging benchmarks show the final model matches the accuracy of one trained from scratch without the erased data, yet finishes the process in far less total time than synchronous alternatives.

Core claim

The paper claims that decoupling the erasure process to run asynchronously for the target client and applying server-side invariance calibration prevents the model from relearning erased data, so the resulting global model achieves unlearning efficacy and fidelity comparable to gold-standard retraining from scratch while reducing wall-clock latency versus synchronous federated unlearning baselines.

What carries the argument

Asynchronous client unlearning decoupled from the global workflow, paired with server-side invariance calibration that blocks relearning of erased data patterns.

If this is right

Unlearning requests no longer require pausing the entire federation for stragglers.
The influence of removed data stays absent even as training continues afterward.
Final model accuracy on medical imaging tasks stays equivalent to a model never exposed to the erased data.
Total elapsed time drops sharply in environments where client devices run at different speeds.
The approach supports regulatory compliance for data removal in cross-silo medical federations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same decoupling and calibration pattern could be tested in non-medical federated settings that also face device heterogeneity and erasure requests.
Real-world deployments might reveal new consistency or privacy issues arising from the mix of asynchronous updates and calibration.
The method implies that server corrections can reduce the need for every client to participate fully in every erasure operation.
Extending the calibration to handle multiple simultaneous unlearning requests would be a direct next test of the mechanism.

Load-bearing premise

The server-side invariance calibration must permanently eliminate the influence of the erased data rather than merely suppressing it for a while.

What would settle it

After the unlearning step and several additional global training rounds, measure whether the model's outputs or internal representations on the erased data return to their pre-unlearning levels; recovery of that influence would show the removal is not permanent.

Figures

Figures reproduced from arXiv: 2604.26809 by Xinglin Zhang, Zhaoyuan Cai.

**Figure 1.** Figure 1: Overview of the AFU-IC framework. task aimed at approximating the retraining oracle. As illustrated in view at source ↗

**Figure 2.** Figure 2: Clean accuracy and backdoor accuracy of the AFU-IC and fully retrained model with respect to the number of FL view at source ↗

**Figure 3.** Figure 3: Evolution of CA throughout the federated training view at source ↗

read the original abstract

Federated Unlearning (FU) is an emerging paradigm in Federated Learning (FL) that enables participating clients to fully remove their contributions from a trained global model, driven by data protection regulations that mandate the right to be forgotten. However, existing FU methods mostly rely on synchronous coordination. This requirement forces the entire federation to halt and wait for stragglers to complete erasure, creating significant delays due to device heterogeneity. Furthermore, these methods often face the problem that the influence of erased data is merely suppressed temporarily and resurfaces during subsequent training, rather than being genuinely removed. To overcome these limitations, this paper proposes Asynchronous Federated Unlearning with Invariance Calibration (AFU-IC), a novel framework for medical imaging that decouples the erasure process from the global training workflow. This enables the target client to perform unlearning asynchronously without interrupting global training. Meanwhile, a server-side invariance calibration mechanism prevents the model from relearning the erased data. Extensive experiments on three medical benchmarks demonstrate that AFU-IC achieves unlearning efficacy and model fidelity comparable to gold-standard retraining while significantly reducing wall-clock latency compared to synchronous baselines. AFU-IC ensures efficient, compliant and reliable FL in cross-silo medical environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

AFU-IC decouples unlearning from global training and adds server calibration to limit relearning, but the permanence of removal under continued async updates is not clearly shown.

read the letter

The paper's main contribution is a practical way to handle client data erasure in federated medical imaging without forcing the whole system to stop. It runs the target client's unlearning asynchronously and uses a server-side invariance calibration step to keep the erased data from reappearing in later rounds. This directly tackles the latency problem that comes with synchronous federated unlearning methods, which is a real issue in cross-silo hospital settings where devices vary widely in speed and availability. The experiments on three medical benchmarks report unlearning performance close to full retraining from scratch while cutting wall-clock time compared to synchronous baselines. That combination of async execution and calibration is the part that feels new relative to prior work on either async FL or basic unlearning. The authors give credit to the limitations of existing synchronous approaches and focus on a deployment constraint that matters for regulated environments. The calibration mechanism is presented as the fix for temporary suppression of influence, and the results suggest it works at the checkpoint they tested. The soft spots are in the verification of lasting removal. The central claim requires that calibration keeps the model behavior statistically close to a retrained model even after non-target clients continue sending updates. The paper does not appear to include repeated calibration rounds or ablations that track whether new gradients reintroduce correlations with the erased distribution over time. Without those checks or a formal argument that the invariance holds across parameter shifts, the durability claim rests on the post-unlearning snapshot rather than ongoing behavior. The experimental section also lacks visible statistical tests, exact baseline descriptions, or numbers that would let a reader judge how close the results actually are. This is the kind of paper that belongs in a reading group for people working on privacy-preserving FL in medicine. Readers who need concrete latency numbers and a workable async protocol will find usable ideas here, even if the guarantees need tightening. It is solid enough on the practical side to deserve peer review rather than a desk reject, provided the authors can supply more on calibration stability and the measurement of relearning prevention.

Referee Report

2 major / 1 minor

Summary. The paper proposes Asynchronous Federated Unlearning with Invariance Calibration (AFU-IC) to enable clients to remove their data contributions from a federated model without halting global training. It decouples the erasure process using asynchronous client unlearning and introduces a server-side invariance calibration mechanism to prevent relearning of erased data. Experiments on three medical imaging benchmarks are claimed to demonstrate unlearning efficacy and model fidelity comparable to gold-standard retraining, with significantly reduced wall-clock latency versus synchronous baselines.

Significance. If the invariance calibration is robust to ongoing asynchronous updates, the work would offer a practical advance for federated unlearning in regulated medical imaging settings by addressing both device heterogeneity and the risk of temporary suppression of erased data influence. The decoupling of unlearning from the global workflow is a useful engineering contribution for cross-silo deployments.

major comments (2)

[Method (invariance calibration subsection)] The central claim that server-side invariance calibration permanently removes (rather than temporarily suppresses) the influence of erased data is load-bearing, yet the method description provides no formal invariance guarantee, convergence analysis, or ablation showing that calibration remains effective after the global model receives subsequent asynchronous updates from non-target clients (which can shift parameters and potentially reintroduce correlations with the erased distribution).
[Experiments] The experimental claims of 'comparable efficacy' to retraining and 'significantly reducing' latency rest on three medical benchmarks, but without reported quantitative values, confidence intervals, statistical tests, or explicit metrics for unlearning efficacy (e.g., post-unlearning accuracy on erased data, membership inference attack success rates), it is not possible to verify that differences are negligible or that the calibration achieves permanent removal.

minor comments (1)

[Abstract] The abstract would benefit from including at least one or two key quantitative results (e.g., latency reduction factor or accuracy deltas) to substantiate the comparability claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address the two major comments point-by-point below, clarifying our approach and committing to revisions that strengthen the empirical support for the claims.

read point-by-point responses

Referee: [Method (invariance calibration subsection)] The central claim that server-side invariance calibration permanently removes (rather than temporarily suppresses) the influence of erased data is load-bearing, yet the method description provides no formal invariance guarantee, convergence analysis, or ablation showing that calibration remains effective after the global model receives subsequent asynchronous updates from non-target clients (which can shift parameters and potentially reintroduce correlations with the erased distribution).

Authors: We acknowledge that a formal invariance guarantee or convergence analysis is absent from the current manuscript. Deriving such guarantees is challenging given the non-convex loss landscapes of medical imaging models and the arbitrary timing of asynchronous client updates. The calibration mechanism works by applying a server-side loss term that penalizes deviations from an invariance constraint computed on a small held-out calibration set; this is intended to anchor the model against reintroduction of erased correlations. To address the concern empirically, the revised manuscript will include a new ablation that continues asynchronous training for 50+ additional rounds after unlearning and reports unlearning metrics at each stage, demonstrating that performance on the erased distribution remains stable and does not rebound. revision: partial
Referee: [Experiments] The experimental claims of 'comparable efficacy' to retraining and 'significantly reducing' latency rest on three medical benchmarks, but without reported quantitative values, confidence intervals, statistical tests, or explicit metrics for unlearning efficacy (e.g., post-unlearning accuracy on erased data, membership inference attack success rates), it is not possible to verify that differences are negligible or that the calibration achieves permanent removal.

Authors: The full manuscript already reports the requested metrics in Tables 2–4, including post-unlearning accuracy on the target client’s data, membership inference attack success rates (both before and after unlearning), and wall-clock latency comparisons against synchronous baselines and retraining. However, we agree that confidence intervals and formal statistical tests were not presented. In the revision we will add 95% confidence intervals computed over five independent runs, together with paired t-test p-values comparing AFU-IC against retraining and the synchronous baseline, to make the “comparable efficacy” and latency claims statistically verifiable. revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on empirical benchmarks without self-referential derivations

full rationale

The abstract and method sketch introduce AFU-IC with a server-side invariance calibration to prevent relearning of erased data, but contain no equations, derivations, or parameter-fitting steps that reduce predictions to inputs by construction. No self-citations, uniqueness theorems, or ansatzes are invoked in the provided text. The central efficacy claims are positioned as validated by experiments on three medical benchmarks rather than forced by definitional equivalence or fitted inputs renamed as predictions. Any concern that calibration merely suppresses influence is a question of empirical robustness, not circularity in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on abstract only; no free parameters, axioms, or invented entities are explicitly listed or derivable from the given text.

pith-pipeline@v0.9.0 · 5508 in / 1153 out tokens · 22736 ms · 2026-05-07T10:43:05.692290+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

28 extracted references · 9 canonical work pages · 2 internal anchors

[1]

Communication-efficient learning of deep networks from decentralized data,

B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” inArtificial intelligence and statistics. PMLR, 2017, pp. 1273– 1282

2017
[2]

Efficient Med- ical Image Segmentation via Reinforcement Learning-Driven K-Space Sampling,

Y . Li, H. Zeng, F. Zhang, C. Yang, Y . Li, and W. Ding, “Efficient Med- ical Image Segmentation via Reinforcement Learning-Driven K-Space Sampling,”IEEE Transactions on Emerging Topics in Computational Intelligence, 2025

2025
[3]

Achieving fair medical image segmentation in foundation models with adversarial visual prompt tuning,

Y . Li, Y . Li, K. Zhang, F. Zhang, C. Yang, Z. Guo, W. Ding, and T. Huang, “Achieving fair medical image segmentation in foundation models with adversarial visual prompt tuning,”Information Sciences, p. 122501, 2025

2025
[4]

V oigt and A

P. V oigt and A. V on dem Bussche,The EU general data protection regulation (GDPR). Springer, 2017

2017
[5]

The california consumer privacy act: Towards a european- style privacy regime in the united states,

S. L. Pardau, “The california consumer privacy act: Towards a european- style privacy regime in the united states,”J. Tech. L. & Pol’y, vol. 23, p. 68, 2018

2018
[6]

A survey of machine unlearning,

T. T. Nguyen, T. T. Huynh, Z. Ren, P. L. Nguyen, A. W.-C. Liew, H. Yin, and Q. V . H. Nguyen, “A survey of machine unlearning,”ACM Transactions on Intelligent Systems and Technology, vol. 16, no. 5, pp. 1–46, 2025

2025
[7]

Federaser: Enabling efficient client-level data removal from federated learning models,

G. Liu, X. Ma, Y . Yang, C. Wang, and J. Liu, “Federaser: Enabling efficient client-level data removal from federated learning models,” in 2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS). IEEE, 2021, pp. 1–10

2021
[8]

arXiv preprint arXiv:2201.09441 (2022)

C. Wu, S. Zhu, and P. Mitra, “Federated unlearning with knowledge distillation,”arXiv preprint arXiv:2201.09441, 2022

work page arXiv 2022
[9]

Federated unlearning via class- discriminative pruning,

J. Wang, S. Guo, X. Xie, and H. Qi, “Federated unlearning via class- discriminative pruning,” inProceedings of the ACM web conference 2022, 2022, pp. 622–632

2022
[10]

Federated un- learning: How to efficiently erase a client in fl?

A. Halimi, S. Kadhe, A. Rawat, and N. Baracaldo, “Federated un- learning: How to efficiently erase a client in fl?”arXiv preprint arXiv:2207.05521, 2022

work page arXiv 2022
[11]

Unlearning during learning: An efficient federated machine unlearning method,

H. Gu, G. Zhu, J. Zhang, X. Zhao, Y . Han, L. Fan, and Q. Yang, “Unlearning during learning: An efficient federated machine unlearning method,”arXiv preprint arXiv:2405.15474, 2024

work page arXiv 2024
[12]

Asynchronous federated unlearning,

N. Su and B. Li, “Asynchronous federated unlearning,” inIEEE IN- FOCOM 2023-IEEE conference on computer communications. IEEE, 2023, pp. 1–10

2023
[13]

Federated unlearning with gradient descent and conflict mitigation,

Z. Pan, Z. Wang, C. Li, K. Zheng, B. Wang, X. Tang, and J. Zhao, “Federated unlearning with gradient descent and conflict mitigation,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 19, 2025, pp. 19 804–19 812

2025
[14]

Oasis alzheimer’s detection,

N. Daithal, “Oasis alzheimer’s detection,” Kaggle, 2023, https://www.kaggle.com/datasets/ninadaithal/imagesoasis

2023
[15]

Medmnist v2-a large-scale lightweight benchmark for 2d and 3d biomedical image classification,

J. Yang, R. Shi, D. Wei, Z. Liu, L. Zhao, B. Ke, H. Pfister, and B. Ni, “Medmnist v2-a large-scale lightweight benchmark for 2d and 3d biomedical image classification,”SCIENTIFIC DATA, 2023

2023
[16]

Federated learning: Challenges, methods, and future directions,

T. Li, A. K. Sahu, A. Talwalkar, and V . Smith, “Federated learning: Challenges, methods, and future directions,”IEEE signal processing magazine, vol. 37, no. 3, pp. 50–60, 2020

2020
[17]

Federated unlearning: A survey on methods, design guidelines, and evaluation metrics,

N. Romandini, A. Mora, C. Mazzocca, R. Montanari, and P. Bellav- ista, “Federated unlearning: A survey on methods, design guidelines, and evaluation metrics,”IEEE Transactions on Neural Networks and Learning Systems, 2024

2024
[18]

A survey on federated unlearning: Challenges, methods, and future directions,

Z. Liu, Y . Jiang, J. Shen, M. Peng, K.-Y . Lam, X. Yuan, and X. Liu, “A survey on federated unlearning: Challenges, methods, and future directions,”ACM Computing Surveys, vol. 57, no. 1, pp. 1–38, 2024

2024
[19]

Frequency-aligned knowledge distillation for lightweight spa- tiotemporal forecasting,

Y . Li, C. Yang, H. Zeng, Z. Dong, Z. An, Y . Xu, Y . Tian, and H. Wu, “Frequency-aligned knowledge distillation for lightweight spa- tiotemporal forecasting,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 7262–7272

2025
[20]

Srkd: Towards efficient 3d point cloud segmentation via structure-and relation-aware knowledge distillation,

Y . Li, J. Dong, Z. Dong, C. Yang, Z. An, and Y . Xu, “Srkd: Towards efficient 3d point cloud segmentation via structure-and relation-aware knowledge distillation,”arXiv preprint arXiv:2506.17290, 2025

work page arXiv 2025
[21]

Mmt-ard: Multimodal multi-teacher adversarial distillation for robust vision-language models,

Y . Li, J. Dong, C. Yang, S. Wen, P. Koniusz, T. Huang, Y . Tian, and Y .-S. Ong, “Mmt-ard: Multimodal multi-teacher adversarial distillation for robust vision-language models,”arXiv preprint arXiv:2511.17448, 2025

work page arXiv 2025
[22]

Asynchronous federated optimization,

C. Xie, S. Koyejo, and I. Gupta, “Asynchronous federated optimization,” inarXiv preprint arXiv:1903.03934, 2019

work page arXiv 1903
[23]

Asynchronous online federated learning for edge devices with non-iid data,

Y . Chen, Y . Ning, M. Slawski, and H. Rangwala, “Asynchronous online federated learning for edge devices with non-iid data,” in2020 IEEE International Conference on Big Data (Big Data). IEEE, 2020, pp. 15–24

2020
[24]

Overcoming catastrophic forgetting in neural networks,

J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska et al., “Overcoming catastrophic forgetting in neural networks,”Pro- ceedings of the national academy of sciences, vol. 114, no. 13, pp. 3521–3526, 2017

2017
[25]

Fedrecov- ery: Differentially private machine unlearning for federated learning frameworks,

L. Zhang, T. Zhu, H. Zhang, P. Xiong, and W. Zhou, “Fedrecov- ery: Differentially private machine unlearning for federated learning frameworks,”IEEE Transactions on Information Forensics and Security, vol. 18, pp. 4732–4746, 2023

2023
[26]

FedQUIT: On-Device Federated Unlearning via a Quasi-Competent Virtual Teacher

A. Mora, L. Valerio, P. Bellavista, and A. Passarella, “Fedquit: On- device federated unlearning via a quasi-competent virtual teacher,”arXiv preprint arXiv:2408.07587, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[27]

Invariant Risk Minimization

M. Arjovsky, L. Bottou, I. Gulrajani, and D. Lopez-Paz, “Invariant risk minimization,”arXiv preprint arXiv:1907.02893, 2019

work page internal anchor Pith review arXiv 1907
[28]

Measuring the

T.-M. H. Hsu, H. Qi, and M. Brown, “Measuring the effects of non- identical data distribution for federated visual classification,”arXiv preprint arXiv:1909.06335, 2019

work page arXiv 1909