arxiv: 2604.02356 · v1 · submitted 2026-03-14 · 💻 cs.NI · cs.LG

MLFCIL: A Multi-Level Forgetting Mitigation Framework for Federated Class-Incremental Learning in LEO Satellites

Pith reviewed 2026-05-15 11:50 UTC · model grok-4.3

classification 💻 cs.NI cs.LG
keywords federated class-incremental learning · LEO satellites · catastrophic forgetting · knowledge distillation · class-aware aggregation · stability-plasticity trade-off · remote sensing · non-IID data

The pith

MLFCIL mitigates catastrophic forgetting in LEO satellite federated class-incremental learning by decomposing forgetting into three sources and coordinating stability-plasticity at dual granularities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MLFCIL for collaborative satellite training in which new classes keep arriving and raw data cannot be shared. It breaks forgetting down into local bias, cross-task knowledge loss, and aggregation-induced drift, then counters each in turn: a class-reweighted loss for the first, distillation with feature replay and prototype-guided drift compensation for the second, and class-aware aggregation for the third. A dual-granularity strategy adds round-level loss balancing and step-level gradient projection to hold the stability-plasticity balance under tight memory and bandwidth limits. On the NWPU-RESISC45 remote-sensing dataset the method is reported to beat the compared baselines in accuracy while reducing forgetting and adding only small overhead. This matters for orbital networks that must update models incrementally as new sensor classes appear.
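To make the first level concrete, here is a minimal sketch of a class-reweighted cross-entropy. The paper's exact weighting rule is not shown on this page, so inverse class frequency is an assumption, and the names `class_weights` and `reweighted_cross_entropy` are hypothetical.

```python
import math
from collections import Counter


def class_weights(labels, num_classes):
    """Inverse-frequency weights (an assumed rule, not the paper's formula).

    Weights of the classes present in `labels` are scaled to average 1;
    classes with no local samples get weight 0.
    """
    counts = Counter(labels)
    present = [c for c in range(num_classes) if counts[c] > 0]
    raw = {c: 1.0 / counts[c] for c in present}
    scale = len(present) / sum(raw.values())
    return [raw.get(c, 0.0) * scale for c in range(num_classes)]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reweighted_cross_entropy(batch_logits, labels, weights):
    """Weighted mean of the per-sample cross-entropy."""
    num = 0.0
    den = 0.0
    for logits, y in zip(batch_logits, labels):
        num += weights[y] * -math.log(softmax(logits)[y])
        den += weights[y]
    return num / den


# Toy local batch: class 0 dominates, class 1 is rare, class 2 is absent.
labels = [0, 0, 0, 1]
weights = class_weights(labels, num_classes=3)
```

With these toy labels the rare class receives three times the weight of the dominant one, which is the effect a reweighted loss relies on to reduce local bias.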

Core claim

MLFCIL decomposes catastrophic forgetting into three sources and addresses them at different levels: class-reweighted loss to reduce local bias, knowledge distillation with feature replay and prototype-guided drift compensation to preserve cross-task knowledge, and class-aware aggregation to mitigate forgetting during federation, further supported by round-level adaptive loss balancing and step-level gradient projection for the stability-plasticity trade-off.
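The step-level gradient projection can be illustrated with a short sketch. The paper's projection rule is not reproduced on this page; the version below uses the well-known A-GEM/PCGrad-style rule (remove the conflicting component when two gradients have a negative inner product) as a stand-in, and the function name is hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project_if_conflicting(g_new, g_old):
    """Return g_new with its component along g_old removed when they conflict.

    g_new: gradient of the new-task (plasticity) loss.
    g_old: gradient of the old-knowledge (stability) loss.
    A negative inner product means a step along g_new would increase the
    old-knowledge loss to first order, so that component is projected out.
    """
    inner = dot(g_new, g_old)
    norm_sq = dot(g_old, g_old)
    if inner >= 0 or norm_sq == 0.0:
        return list(g_new)  # no conflict: leave the gradient untouched
    coeff = inner / norm_sq
    return [a - coeff * b for a, b in zip(g_new, g_old)]


# Toy 2-D example: the gradients conflict along the second axis.
projected = project_if_conflicting([1.0, -1.0], [0.0, 1.0])
```

After projection the result is orthogonal to `g_old`, so the update no longer works against the stability objective to first order while keeping the non-conflicting part of the new-task gradient.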

What carries the argument

The three-level forgetting decomposition (local reweighting, distillation-based preservation, class-aware aggregation) coordinated by dual-granularity (round and step) strategies that together reduce bias, drift, and aggregation forgetting under non-IID orbital data.
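One plausible reading of class-aware aggregation is sketched below. It is an assumption, not the paper's formula: each class row of the classifier head is averaged over clients in proportion to how many samples of that class they trained on, and rows for classes no client saw this round are carried over from the previous global model. All names are hypothetical.

```python
def class_aware_head_aggregate(client_heads, client_class_counts, prev_global_head):
    """Aggregate classifier-head rows class by class.

    client_heads[k][c]        : client k's weight row for class c
    client_class_counts[k][c] : number of class-c samples client k trained on
    prev_global_head[c]       : previous global row for class c
    """
    new_head = []
    for c in range(len(prev_global_head)):
        counts = [cc[c] for cc in client_class_counts]
        total = sum(counts)
        if total == 0:
            # No client observed class c this round: keep the old row so that
            # uninformed clients cannot overwrite it.
            new_head.append(list(prev_global_head[c]))
            continue
        dim = len(prev_global_head[c])
        row = [
            sum(head[c][i] * n for head, n in zip(client_heads, counts)) / total
            for i in range(dim)
        ]
        new_head.append(row)
    return new_head


# Toy round: two clients, three classes, two-dimensional rows.
heads = [
    [[1.0, 1.0], [2.0, 2.0], [9.0, 9.0]],
    [[3.0, 3.0], [4.0, 4.0], [7.0, 7.0]],
]
counts = [[10, 0, 0], [30, 20, 0]]
previous = [[0.0, 0.0], [0.0, 0.0], [5.0, 5.0]]
aggregated = class_aware_head_aggregate(heads, counts, previous)
```

Compared with plain sample-count averaging, a client that never saw a class contributes nothing to that class's row, which is one way aggregation-induced forgetting could be limited under non-IID data.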

Load-bearing premise

The three-level decomposition of forgetting sources together with the dual-granularity coordination strategy will reliably reduce catastrophic forgetting under the non-IID data distributions and resource constraints specific to LEO satellite federated learning.

What would settle it

An experiment on the NWPU-RESISC45 dataset under simulated LEO orbital non-IID partitions where MLFCIL produces no statistically significant reduction in forgetting rate or accuracy gain over the strongest baseline would falsify the central claim.
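A sketch of how such a check could be run is given below. It uses the common average-forgetting measure (best earlier accuracy on a task minus its final accuracy) and an exact paired sign-flip permutation test across seeds. The paper's own metric definition and test are not shown on this page, so both choices are assumptions, and every number in the example is a made-up placeholder rather than a reported result.

```python
from itertools import product


def average_forgetting(acc):
    """acc[t][j] is the accuracy on task j after training through task t (j <= t)."""
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0
    drops = []
    for j in range(num_tasks - 1):
        best_earlier = max(acc[t][j] for t in range(j, num_tasks - 1))
        drops.append(best_earlier - acc[num_tasks - 1][j])
    return sum(drops) / len(drops)


def paired_sign_flip_p_value(diffs):
    """Exact two-sided sign-flip permutation test on paired differences.

    diffs[i] = baseline forgetting minus method forgetting for seed i.
    Under the null hypothesis each difference is equally likely to have
    either sign, so all 2**n sign patterns are enumerated.
    """
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12:
            hits += 1
    return hits / total


# Placeholder accuracy matrix for three tasks (not from the paper).
toy_acc = [[0.9], [0.8, 0.85], [0.7, 0.75, 0.8]]
toy_forgetting = average_forgetting(toy_acc)

# Placeholder per-seed differences (not from the paper).
toy_p = paired_sign_flip_p_value([0.03, 0.02, 0.04, 0.01, 0.05])
```

With five seeds the smallest attainable two-sided p-value is 2/32, so a claim of significance at the 0.05 level would need at least six paired runs under this test.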

Figures

Figures reproduced from arXiv: 2604.02356 by Heng Zhang, KM Mahfujul, Sijing Duan, Wu Ouyang, Xiaohong Deng, Yiqin Deng, Zhigang Chen.

Figure 1: FCIL scenario in LEO: edge satellites perform on-board class …
Figure 2: System scenario of MLFCIL.
    Adjacent paper text: In each communication round, the aggregation satellite broadcasts the current global model parameters to all edge nodes. Each edge node initializes its local model from the received parameters, trains on its current task data, and returns the updated local model parameters to the aggregation satellite. The aggregation satellite then combines the collected updates t…
Figure 3: Overview of the MLFCIL.
Figure 4: Prototype-guided feature drift compensation. (a) Feature drift problem. …
Figure 5: Stability-plasticity gradient projection. (a) Before projection (conflict). …
Figure 6: Cumulative accuracy across three incremental tasks for all compared …
Figure 8: Forgetting-aware adaptive balancing dynamics. (a) Forgetting score …
Figure 9: Gradient conflict analysis. (a) Frequency of gradient conflicts …
Figure 10: Old-class vs. new-class accuracy across tasks for MLFCIL and …
Figure 12: Per-class accuracy for six representative classes (two from each of the …
Figure 13: Impact of memory buffer size on final accuracy …
Original abstract

Low-Earth-orbit (LEO) satellite constellations are increasingly performing on-board computing. However, the continuous emergence of new classes under strict memory and communication constraints poses major challenges for collaborative training. Federated class-incremental learning (FCIL) enables distributed incremental learning without sharing raw data, but faces three LEO-specific challenges: non-independent and identically distributed data heterogeneity caused by orbital dynamics, amplified catastrophic forgetting during aggregation, and the need to balance stability and plasticity under limited resources. To tackle these challenges, we propose MLFCIL, a multi-level forgetting mitigation framework that decomposes catastrophic forgetting into three sources and addresses them at different levels: class-reweighted loss to reduce local bias, knowledge distillation with feature replay and prototype-guided drift compensation to preserve cross-task knowledge, and class-aware aggregation to mitigate forgetting during federation. In addition, we design a dual-granularity coordination strategy that combines round-level adaptive loss balancing with step-level gradient projection to further enhance the stability-plasticity trade-off. Experiments on the NWPU-RESISC45 dataset show that MLFCIL significantly outperforms baselines in both accuracy and forgetting mitigation, while introducing minimal resource overhead.
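The distillation term mentioned in the abstract can be sketched with the standard temperature-softened formulation. This is an illustration only: the abstract does not give the loss, so the temperature, the restriction to old-class logits, and the convex combination with a balance weight `lam` are assumptions, and the function names are hypothetical.

```python
import math


def softmax_t(logits, temperature):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over the old classes, scaled by temperature**2.

    The teacher is the frozen model from the previous task; only the first
    len(teacher_logits) student outputs (the old classes) are compared.
    """
    n_old = len(teacher_logits)
    p = softmax_t(teacher_logits, temperature)
    q = softmax_t(student_logits[:n_old], temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature * temperature * kl


def total_loss(new_task_loss, kd_loss, lam):
    """Convex combination; how lam is adapted per round is not specified here."""
    return (1.0 - lam) * new_task_loss + lam * kd_loss


# The student has one extra output for a newly arrived class.
kd = distillation_loss([3.0, 2.0, 1.0, 0.5], [1.0, 2.0, 3.0])
```

A larger `lam` favours stability (staying close to the old model) and a smaller one favours plasticity, which is the knob a round-level balancing scheme would adjust.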

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes MLFCIL, a multi-level forgetting mitigation framework for federated class-incremental learning in LEO satellite networks. It decomposes catastrophic forgetting into three sources—local bias, cross-task knowledge loss, and aggregation-induced forgetting—and addresses them respectively with a class-reweighted loss, knowledge distillation combined with feature replay and prototype-guided drift compensation, and class-aware aggregation. A dual-granularity coordination strategy is introduced for stability-plasticity balance. The paper claims that experiments on the NWPU-RESISC45 dataset demonstrate significant outperformance over baselines in accuracy and forgetting mitigation with minimal resource overhead.
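For readers unfamiliar with prototype drift, the sketch below shows one common way to compensate it, in the style of semantic drift compensation: the shift of a stored old-class prototype is estimated from how current-task features move between the old and the new feature extractor, weighted by their closeness to the prototype. Whether MLFCIL uses this exact estimator is not stated on this page, so treat it as an assumption; the function name and the kernel width `sigma` are hypothetical.

```python
import math


def compensate_prototype(prototype, old_feats, new_feats, sigma=1.0):
    """Shift an old-class prototype by a locally weighted feature drift.

    prototype : stored mean feature of an old class (old extractor's space)
    old_feats : features of current-task samples under the old extractor
    new_feats : features of the same samples under the updated extractor
    """
    weights = []
    for f in old_feats:
        dist_sq = sum((a - b) ** 2 for a, b in zip(f, prototype))
        weights.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        return list(prototype)  # nothing nearby to estimate the drift from
    dim = len(prototype)
    drift = [
        sum(w * (n[i] - o[i]) for w, o, n in zip(weights, old_feats, new_feats)) / total
        for i in range(dim)
    ]
    return [p + d for p, d in zip(prototype, drift)]


# Toy case: every feature moved by the same offset (+0.5, -0.5).
old = [[1.0, 1.0], [2.0, 0.0]]
new = [[1.5, 0.5], [2.5, -0.5]]
shifted = compensate_prototype([1.0, 1.0], old, new)
```

No old-class images are needed for this estimate, which fits the tight on-board memory budget the manuscript emphasises.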

Significance. If the empirical claims hold under LEO-appropriate non-IID conditions, the work would provide a structured approach to mitigating forgetting in resource-constrained federated settings for satellite on-board computing, extending FCIL methods to orbital dynamics with low overhead.

major comments (2)
  1. [Experiments] Experiments section: The evaluation on NWPU-RESISC45 relies on generic class splits rather than partitions modeling LEO orbital dynamics (e.g., latitude-band coverage per pass or temporal class drift from satellite motion). This is load-bearing for the central claim, as the reported gains do not directly substantiate robustness to the non-IID heterogeneity and amplified forgetting caused by orbital mechanics asserted in the abstract and introduction.
  2. [Abstract] Abstract: The outperformance claim supplies no quantitative metrics, baseline names, forgetting rates, ablation results, or statistical tests, preventing assessment of whether the three-level decomposition and dual-granularity strategy deliver the asserted improvements.
minor comments (1)
  1. [Method] Clarify notation for the three invented components (class-reweighted loss, prototype-guided drift compensation, class-aware aggregation) with explicit formulations or pseudocode in the method section.
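The orbit-aware partition requested in major comment 1 could look roughly like the sketch below. It is a toy model under loud assumptions: per-image latitudes are hypothetical inputs supplied by the experimenter, each satellite is assumed to observe only latitudes up to its orbital inclination (valid for inclinations of at most 90 degrees, with sensor swath ignored), and the function name is invented for illustration.

```python
import random
from collections import defaultdict


def partition_by_latitude(samples, inclinations_deg, seed=0):
    """Assign each sample to one satellite whose ground track can reach it.

    samples          : list of (sample_id, class_label, latitude_deg)
    inclinations_deg : one orbital inclination per satellite, each <= 90
    Returns a dict mapping satellite index -> list of (sample_id, class_label).
    """
    rng = random.Random(seed)  # fixed seed keeps the partition reproducible
    shards = defaultdict(list)
    for sample_id, label, lat in samples:
        visible = [k for k, inc in enumerate(inclinations_deg) if abs(lat) <= inc]
        if not visible:
            continue  # no satellite in this toy constellation reaches this latitude
        shards[rng.choice(visible)].append((sample_id, label))
    return shards


# Toy data: labels and latitudes are illustrative placeholders.
toy = [
    (0, "desert", 25.0),
    (1, "desert", 28.0),
    (2, "forest", 45.0),
    (3, "sea_ice", 75.0),
]
shards = partition_by_latitude(toy, [30.0, 53.0, 90.0], seed=0)
```

In this toy constellation the low-inclination satellite can only ever receive low-latitude classes and high-latitude classes reach only the polar one, which yields the kind of structured label skew that generic class splits do not capture.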

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments. We address each major comment point by point below and agree that revisions will strengthen the manuscript's claims regarding LEO-specific challenges.

  1. Referee: [Experiments] Experiments section: The evaluation on NWPU-RESISC45 relies on generic class splits rather than partitions modeling LEO orbital dynamics (e.g., latitude-band coverage per pass or temporal class drift from satellite motion). This is load-bearing for the central claim, as the reported gains do not directly substantiate robustness to the non-IID heterogeneity and amplified forgetting caused by orbital mechanics asserted in the abstract and introduction.

    Authors: We acknowledge that the current evaluation uses standard class-incremental splits on NWPU-RESISC45, which serves as a widely adopted remote-sensing benchmark but does not explicitly simulate LEO orbital mechanics such as latitude-band coverage or temporal drift. This is a valid observation. In the revised manuscript, we will add new experiments that generate non-IID partitions explicitly modeling satellite pass coverage and class drift over time, and we will report how MLFCIL performs under these conditions. We will also clarify in the text how the three-level mitigation and dual-granularity coordination are designed to address these dynamics. revision: yes

  2. Referee: [Abstract] Abstract: The outperformance claim supplies no quantitative metrics, baseline names, forgetting rates, ablation results, or statistical tests, preventing assessment of whether the three-level decomposition and dual-granularity strategy deliver the asserted improvements.

    Authors: We agree that the abstract would benefit from concrete quantitative support. In the revision, we will expand the abstract to include specific accuracy improvements, forgetting rates, baseline names, key ablation findings, and reference to statistical significance from the experimental results, while remaining within length constraints. revision: yes

Circularity Check

0 steps flagged

No significant circularity: MLFCIL is an algorithmic proposal validated by experiments

full rationale

The paper introduces MLFCIL as a new multi-level framework that decomposes forgetting into three sources (local bias, cross-task knowledge loss, federation aggregation) and mitigates them via class-reweighted loss, KD+feature replay+prototype compensation, class-aware aggregation, plus dual-granularity coordination. No equations, derivations, or fitted parameters are shown that reduce by construction to the inputs or to self-citations. The central claims rest on empirical outperformance on NWPU-RESISC45 rather than self-referential definitions or load-bearing uniqueness theorems from the same authors. The derivation chain is therefore self-contained as a proposed method whose value is asserted through external benchmark results.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 3 invented entities

Abstract-only review; ledger populated from stated challenges and proposed components. No explicit free parameters or invented physical entities appear; domain assumptions about LEO data are taken as given.

axioms (2)
  • domain assumption: Data across LEO satellites is non-IID due to orbital dynamics
    Explicitly listed as the first LEO-specific challenge in the abstract.
  • domain assumption: Catastrophic forgetting is amplified during model aggregation in federated settings
    Stated as the second LEO-specific challenge.
invented entities (3)
  • class-reweighted loss (no independent evidence)
    purpose: Reduce local bias from class imbalance
    Introduced as the first-level mitigation component.
  • prototype-guided drift compensation (no independent evidence)
    purpose: Preserve cross-task knowledge during distillation
    Part of the knowledge-distillation module.
  • class-aware aggregation (no independent evidence)
    purpose: Mitigate forgetting at federation level
    Third-level mitigation component.


