
arxiv: 2606.18003 · v1 · pith:ZY6M7TL3new · submitted 2026-06-16 · 💻 cs.LG · cs.AI

C2FL: Clustered Continual Federated Learning under Spatial and Temporal Drift

Pith reviewed 2026-06-27 01:45 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords federated learning · continual learning · spatial clustering · temporal drift · mobile networks · experience replay · distributed adaptation · collective systems

The pith

Nodes in mobile sensing networks self-organize into spatial clusters and apply dwell-time-aware averaging to maintain performance under shifting data distributions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that conventional federated learning breaks down when nodes move through regions with different sensed conditions while data distributions change over time and privacy bars central collection. It shows that a fully distributed alternative lets each node form learning groups by spatial clustering and then blend its local experience replay buffer with a regional model whose influence grows the longer the node stays in one area. This setup is evaluated on synthetic scenarios that recreate the mobility patterns of vehicles, drones, and crowdsensing devices. A sympathetic reader would care because the approach keeps models accurate without moving raw data off the devices, which matters for any collective system that must keep learning while its members roam.

Core claim

C2FL is a fully distributed federated learning approach where nodes self-organize into learning groups through spatial clustering that reflects the geographic structure of the environment. To counteract temporal drift, each node combines experience replay with a dwell-time-aware adaptive averaging step that progressively incorporates the regional consensus as it remains longer within the same area while preserving previously acquired knowledge under evolving distributions. Synthetic experiments that reproduce spatial and temporal shifts demonstrate that standard federated strategies degrade significantly whereas C2FL restores robust collective adaptation.
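
The grouping step can be pictured with a small sketch. The source does not specify the clustering rule, so the version below is an assumption: nodes are linked when they lie within a hypothetical `radius` of each other, and each connected group becomes one learning group. It is also written as a centralized simulation over known positions for readability, whereas the paper describes a fully distributed process.

```python
from math import hypot


def proximity_clusters(positions, radius):
    """Group node indices into clusters.

    Assumption (not from the paper): two nodes share a cluster when a chain
    of nodes, each within `radius` of the next, connects them.
    """
    n = len(positions)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            (xi, yi), (xj, yj) = positions[i], positions[j]
            if hypot(xi - xj, yi - yj) <= radius:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Three nodes close together and two nodes far away from them.
positions = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8), (10.0, 10.0), (10.5, 9.5)]
groups = proximity_clusters(positions, radius=1.5)
print(groups)
```

The `radius` value stands in for whatever notion of proximity a deployment uses; choosing it too large merges regions that sense different conditions, and choosing it too small fragments a region into groups with little data each.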

What carries the argument

The dwell-time-aware adaptive averaging step, which scales the weight given to the regional consensus according to how long a node has stayed in one location, working together with spatial clustering for group formation and experience replay for retaining earlier knowledge.
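
A minimal sketch of such an averaging step follows. The page gives no formula, so the saturating exponential weight and the names `tau` and `max_weight` are illustrative assumptions, not the authors' rule; the only property taken from the source is that the regional consensus counts for more the longer a node stays in one area.

```python
from math import exp


def regional_weight(dwell_rounds, tau=5.0, max_weight=0.9):
    """Weight given to the regional consensus.

    Hypothetical schedule: starts at 0 on arrival, grows with dwell time,
    and saturates below `max_weight` so some local knowledge always remains.
    """
    return max_weight * (1.0 - exp(-dwell_rounds / tau))


def blend(local_params, regional_params, dwell_rounds, tau=5.0, max_weight=0.9):
    """Convex combination of local and regional parameters."""
    w = regional_weight(dwell_rounds, tau, max_weight)
    return [(1.0 - w) * l + w * r for l, r in zip(local_params, regional_params)]


local = [1.0, 0.0]
regional = [0.0, 1.0]

just_arrived = blend(local, regional, dwell_rounds=0)   # purely local model
settled = blend(local, regional, dwell_rounds=50)       # close to the regional model
print(just_arrived, settled)
```

Capping the weight below 1 is one way to keep a node from discarding its own model entirely; whether the paper uses a cap, and what schedule it uses, is not stated on this page.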

If this is right

  • Standard federated averaging, without clustering or dwell-time weighting, degrades significantly when nodes cross region boundaries or distributions shift.
  • Experience replay combined with increasing regional weight over dwell time lets each node retain old knowledge while absorbing the current area's consensus.
  • The method operates without any central server or raw data exchange, satisfying privacy constraints in vehicular, drone, and smartphone sensing.
  • Collective adaptation remains stable across the tested range of spatial cluster sizes and temporal drift speeds.
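
The replay component mentioned in the list above can be sketched as a small fixed-size buffer. The source does not say how the buffer is filled or sampled; reservoir sampling is swapped in here as a common choice, and the class name, capacity, and batch-mixing ratio are all hypothetical.

```python
import random


class ReplayBuffer:
    """Fixed-capacity memory filled by reservoir sampling (an assumption).

    Every sample seen so far has the same chance of being in the buffer,
    so data from earlier regions is not pushed out just because it is old.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(sample)
        else:
            # Keep the new sample with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = sample

    def mixed_batch(self, fresh, replay_count):
        """Return the fresh samples plus up to `replay_count` stored ones."""
        k = min(replay_count, len(self.items))
        return list(fresh) + self.rng.sample(self.items, k)


buffer = ReplayBuffer(capacity=4)
for i in range(20):
    buffer.add(("region_A", i))          # data sensed in the previous area

fresh = [("region_B", 0), ("region_B", 1)]  # data sensed after moving
batch = buffer.mixed_batch(fresh, replay_count=3)
print(batch)
```

Training on `batch` instead of `fresh` alone is what lets a node keep rehearsing the previous area while it adapts to the new one.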

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same dwell-time mechanism could be tested on non-spatial forms of drift such as seasonal or event-driven changes to see whether the clustering component is essential or optional.
  • If clustering overhead grows with network size, lightweight local density estimation might replace explicit group formation while keeping the averaging rule intact.
  • Real deployments could measure the trade-off between longer dwell times (stronger regional pull) and the risk of over-fitting to one area before the node moves again.

Load-bearing premise

The argument rests on two assumptions: that nodes can reliably detect and join clusters matching the true geographic boundaries of the sensed phenomena, and that the synthetic mobility patterns capture the shifts found in actual deployments.

What would settle it

Running the same nodes on real geographic traces from vehicles or phones and measuring whether the clustered models retain accuracy longer than non-clustered federated baselines once the actual spatial boundaries and drift rates are encountered.

Figures

Figures reproduced from arXiv: 2606.18003 by Davide Domini, Gianluca Aguzzi, Lorenzo Pellegrini, Lukas Esterle, Mirko Viroli.

Figure 1: Proximity-based heterogeneous data distribution. Cir…

Figure 2: Per-area accuracy ($\mathrm{Acc}^{(t)}_{d}(k)$) of a mobile device over the learning process. Each subplot reports the accuracy on the test data associated with one spatial area. The vertical dashed red lines mark mobility-induced region transitions.
… scenario described in the previous section using the FBFL baseline without any continual-learning mechanism. Figure 2a reports the test accuracy of a mobile node over time o…

Figure 3: Cumulative accuracy CAcc of the evaluated methods over global rounds. The proposed C²FL approach improves performance with respect to all considered baselines.
… tive learning process within each region (local/global knowledge integration). To this end, we compare C²FL against three baselines, which also serve as an ablation study of its main components, namely: (i) Local, where each mobile device trains on…

Original abstract

Collective Adaptive Systems (CAS) increasingly rely on machine learning to let each node learn from locally sensed data, aligning its behavior with the surrounding environment. Scaling this intelligence, however, raises fundamental challenges: sensed data is often privacy-sensitive, preventing centralized collection; nodes are mobile, traversing regions where nearby nodes perceive similar phenomena while distant ones observe radically different conditions, creating natural spatial clusters; and these distributions evolve over time due to mobility, introducing temporal drift that makes local models progressively stale. These dynamics arise across domains - vehicular sensing, drone-based monitoring, smartphone crowdsensing - yet the interplay of privacy, spatial heterogeneity, and temporal drift severely undermines conventional learning strategies. Therefore, we propose C2FL, a fully distributed Federated Learning (FL) approach where nodes self-organize into learning groups through spatial clustering, reflecting the geographic structure of the environment. To counteract temporal drift, each node combines experience replay with a dwell-time-aware adaptive averaging step, progressively incorporating the regional consensus as it remains longer within the same area, while preserving previously acquired knowledge under evolving distributions. We evaluate our approach on synthetic experiments that systematically reproduce spatial and temporal shifts, showing that standard federated strategies degrade significantly under these conditions and that our method restores robust collective adaptation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes C2FL, a fully distributed federated learning approach for collective adaptive systems in which mobile nodes self-organize into spatial clusters reflecting geographic structure and employ experience replay combined with a dwell-time-aware adaptive averaging step to handle temporal drift while preserving prior knowledge; it claims that standard FL degrades under these conditions while C2FL restores robust collective adaptation, as demonstrated on synthetic experiments that reproduce spatial and temporal shifts.

Significance. If the synthetic experiments faithfully capture the non-stationary, multi-scale phenomena of real CAS deployments, the method would offer a practical, privacy-preserving solution for continual learning under spatial heterogeneity and mobility-induced drift, with relevance to vehicular sensing, drone monitoring, and crowdsensing.

major comments (1)
  1. [Experiments / Evaluation] The evaluation (described in the abstract and presumably detailed in the experiments section) provides no information on the mobility model, sensing correlation length, drift process, or definition of 'true geographic structure' used to validate clusters. This is load-bearing for the central claim, as the reported gains over standard federated strategies depend entirely on whether the synthetic setup systematically reproduces the spatial/temporal shifts of real deployments rather than introducing artifacts.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on the experimental evaluation. The concern about missing details in the synthetic setup is valid and directly impacts the interpretability of our results; we will revise the manuscript accordingly to provide the requested information.

  1. Referee: [Experiments / Evaluation] The evaluation (described in the abstract and presumably detailed in the experiments section) provides no information on the mobility model, sensing correlation length, drift process, or definition of 'true geographic structure' used to validate clusters. This is load-bearing for the central claim, as the reported gains over standard federated strategies depend entirely on whether the synthetic setup systematically reproduces the spatial/temporal shifts of real deployments rather than introducing artifacts.

    Authors: We agree that the manuscript as currently written does not provide sufficient detail on these aspects of the synthetic data generation, which is necessary to allow readers to evaluate how well the experiments capture the target phenomena. In the revised version we will add a new subsection (or expand the existing Experiments section) that explicitly describes: the mobility model (including parameters such as speed ranges and movement patterns), the sensing correlation length and its role in defining spatial clusters, the formulation of the temporal drift process (e.g., how distributions evolve over time steps), and the precise definition of 'true geographic structure' used as ground truth for validating the learned clusters. These additions will make the experimental design transparent and reproducible. revision: yes
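
To make concrete what an explicit description of the synthetic setup might look like, here is a hedged sketch of a parameterized scenario. Nothing in it comes from the paper: the grid of square regions as ground-truth geography, the random-direction walk, and every field of `ScenarioConfig` are assumptions chosen only to show which quantities the referee asks to see written down. It covers the mobility model and the region layout, and does not model the drift process or the sensing correlation length.

```python
import random
from dataclasses import dataclass
from math import cos, pi, sin


@dataclass(frozen=True)
class ScenarioConfig:
    """Hypothetical parameters a reproducible setup would need to state."""
    area_size: float = 100.0          # side length of the square world
    regions_per_side: int = 2         # ground-truth regions form a grid
    speed_range: tuple = (1.0, 3.0)   # distance travelled per round
    rounds: int = 50
    seed: int = 0


def region_of(x, y, cfg):
    """Index of the ground-truth grid cell containing point (x, y)."""
    cell = cfg.area_size / cfg.regions_per_side
    col = min(int(x // cell), cfg.regions_per_side - 1)
    row = min(int(y // cell), cfg.regions_per_side - 1)
    return row * cfg.regions_per_side + col


def simulate_node(cfg):
    """Random-direction walk clamped to the world.

    Returns one (region, dwell) pair per round, where dwell counts the
    rounds spent in the current region since the last transition.
    """
    rng = random.Random(cfg.seed)
    x = rng.uniform(0, cfg.area_size)
    y = rng.uniform(0, cfg.area_size)
    trace, dwell, last = [], 0, None
    for _ in range(cfg.rounds):
        speed = rng.uniform(*cfg.speed_range)
        angle = rng.uniform(0, 2 * pi)
        x = min(max(x + speed * cos(angle), 0.0), cfg.area_size)
        y = min(max(y + speed * sin(angle), 0.0), cfg.area_size)
        region = region_of(x, y, cfg)
        dwell = dwell + 1 if region == last else 0
        trace.append((region, dwell))
        last = region
    return trace


trace = simulate_node(ScenarioConfig())
print(trace[:5])
```

Fixing the seed and listing the parameters in one place is what makes a synthetic result checkable: a reader can rerun the same trace and vary one parameter at a time.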

Circularity Check

0 steps flagged

No circularity: method proposal and synthetic evaluation are self-contained


The paper presents C2FL as a distributed FL algorithm combining spatial clustering, experience replay, and dwell-time-aware averaging to handle mobility-induced drift. No equations, fitted parameters, or first-principles derivations appear in the abstract or description; the central claims rest on empirical results from synthetic experiments that reproduce spatial/temporal shifts. These experiments are external to any internal derivation chain, and no self-citation load-bearing steps, self-definitional loops, or renamed known results are present. The derivation is therefore self-contained with independent content.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no equations, parameters, or explicit assumptions; therefore no free parameters, axioms, or invented entities can be identified.


