
arxiv: 2512.09827 · v2 · pith:FFCLBY55new · submitted 2025-12-10 · 📡 eess.SP

Energy-Efficient Federated Learning with Relay-Assisted Aggregation in IIoT Networks

Pith reviewed 2026-05-25 07:35 UTC · model grok-4.3

classification 📡 eess.SP
keywords federated learning · IIoT · relay-assisted aggregation · energy efficiency · decode-and-forward · SPCA optimization · latency constraints

The pith

Relay-assisted partial aggregation in federated learning reduces energy use by up to 6x and outage probability to 10^-6 in IIoT networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a transmission framework for federated learning where subnetworks upload local model updates to an edge server either directly or through neighboring subnetworks that act as relays and perform partial aggregation. This setup is analyzed for convergence, then optimized by decomposing the non-convex energy-efficiency problem into separate computation and communication sub-problems, grouping devices by latency, selecting relays, and applying sequential parametric convex approximation to set power levels. The framework also handles imperfect channel state information and maximizes the number of devices meeting round-wise delay constraints. Simulations indicate faster convergence, lower outage, and large energy reductions relative to single-hop and unaggregated baselines.

Core claim

By allowing decode-and-forward relays to aggregate model updates before forwarding them, the scheme reduces transmission overhead and training latency in federated learning. The decomposed optimization combined with latency-based grouping and SPCA power allocation then maximizes energy efficiency while satisfying strict IIoT latency limits, yielding outage probability down to 10^-6 and energy savings of at least 2x versus unaggregated cooperation and up to 6x versus single-hop transmission.

What carries the argument

Relay-assisted partial aggregation: neighboring subnetworks decode, aggregate, and forward model updates to the edge server, and this step is what reduces the communication overhead.
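A minimal sketch of why partial aggregation at a relay need not change the global model, assuming the relay forwards a sample-weighted sum together with its sample count and the edge server applies FedAvg-style weighted averaging. The helper names, dataset sizes, and relay assignment below are hypothetical; the summary above does not specify the paper's exact aggregation rule.

```python
import numpy as np

def weighted_sum(updates, sizes):
    """Return (sum of n_i * w_i, sum of n_i) for a set of local updates."""
    total = sum(n * w for w, n in zip(updates, sizes))
    return total, sum(sizes)

def fedavg(partials):
    """Combine (weighted_sum, sample_count) pairs into one global average."""
    num = sum(s for s, _ in partials)
    den = sum(n for _, n in partials)
    return num / den

rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(5)]  # local models of 5 SNs
sizes = [100, 50, 80, 120, 60]                    # hypothetical local dataset sizes

# Baseline: every SN reaches the ES on its own (5 packets arrive at the ES).
direct = fedavg([weighted_sum([w], [n]) for w, n in zip(updates, sizes)])

# Relay case: SNs 1 and 2 send to relay SN 0, which folds in its own update;
# SNs 3 and 4 transmit directly (only 3 packets arrive at the ES).
relay_partial = weighted_sum(updates[:3], sizes[:3])
relayed = fedavg([relay_partial,
                  weighted_sum([updates[3]], [sizes[3]]),
                  weighted_sum([updates[4]], [sizes[4]])])

same_model = bool(np.allclose(direct, relayed))
```

Under these assumptions the relay replaces three model-sized uploads to the edge server with one, while the averaged model is unchanged up to floating-point rounding.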

If this is right

  • Maximizing the number of subnetworks meeting each round's delay constraint increases participation and improves convergence stability under non-IID data.
  • The SPCA solver produces feasible power allocations that jointly satisfy latency, energy, and reliability targets after grouping and relay selection.
  • The same grouping-plus-relay mechanism extends to the imperfect channel state information case without changing the decomposition structure.
  • Outage probability drops from order 10^-2 (single-hop) to 10^-6 because two-hop paths with aggregation avoid direct-link failures.
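To make the order-of-magnitude gap in the last bullet concrete, here is an illustrative calculation of how path diversity can push outage from 10^-2 toward 10^-6. It assumes independent link failures, a decode-and-forward path that fails if either hop fails, and an update that is lost only when every available path fails. The per-hop outage values and the number of candidate relays are hypothetical and are not taken from the paper.

```python
def df_path_outage(p_first_hop, p_second_hop):
    """Outage of a decode-and-forward path: it fails if either hop fails."""
    return 1.0 - (1.0 - p_first_hop) * (1.0 - p_second_hop)

def best_path_outage(path_outages):
    """Outage when any one of several independent paths suffices."""
    result = 1.0
    for p in path_outages:
        result *= p
    return result

p_direct = 1e-2                             # single-hop order of magnitude quoted above
p_relay_path = df_path_outage(5e-3, 5e-3)   # hypothetical per-hop outages
# One direct link plus two candidate relay paths (assumed independent).
p_total = best_path_outage([p_direct, p_relay_path, p_relay_path])
```

With these placeholder numbers `p_total` comes out just under 10^-6. Correlated fading or a shared bottleneck hop would weaken the effect.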

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The partial-aggregation step at relays could be generalized to other wireless distributed training settings where uplink bandwidth is the bottleneck.
  • Latency-minimizing grouping may need periodic re-execution in mobile IIoT scenarios, suggesting an online variant of the algorithm.
  • Energy savings are reported under fixed traffic models; variable packet sizes or model compression would alter the reported 2x-6x gains.
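The sensitivity to packet size noted in the last bullet can be illustrated with a textbook Shannon-capacity model. This is a sketch under an assumed AWGN link, not the paper's energy model; the function name and all link parameters are hypothetical.

```python
def min_tx_energy(bits, latency_s, bandwidth_hz, channel_gain, noise_psd):
    """Smallest energy (J) that delivers `bits` within `latency_s` on an AWGN link.

    Uses the Shannon rate R = W * log2(1 + P * g / (N0 * W)) and solves
    R * T = bits for the transmit power P.
    """
    spectral_eff = bits / (bandwidth_hz * latency_s)  # required bit/s/Hz
    power = (2.0 ** spectral_eff - 1.0) * noise_psd * bandwidth_hz / channel_gain
    return power * latency_s

# Hypothetical link parameters, chosen only for illustration.
W, g, N0, T = 1e6, 1e-9, 4e-21, 1e-3
e_5k = min_tx_energy(5_000, T, W, g, N0)
e_10k = min_tx_energy(10_000, T, W, g, N0)
growth = e_10k / e_5k
```

In this toy setting, doubling the payload at a fixed latency budget raises the required energy by a factor of 33 rather than 2, which is why fixed-traffic energy ratios may not carry over to other packet sizes or compressed models.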

Load-bearing premise

The decomposition of the non-convex energy efficiency optimization into separate computation and communication sub-problems remains valid and near-optimal.
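One way to see what this premise requires is to write a generic form of the problem. The symbols below (CPU frequencies f, transmit powers p, per-round deadline T_max, budget split τ) are illustrative assumptions, not the paper's notation.

```latex
% Joint problem: energies add, but a shared deadline couples f and p.
\min_{f,\,p}\; E_{\mathrm{cmp}}(f) + E_{\mathrm{com}}(p)
\quad \text{s.t.} \quad T_{\mathrm{cmp}}(f) + T_{\mathrm{com}}(p) \le T_{\max}

% For a fixed split \tau \in [0, T_{\max}] the problem separates:
E^{\star}_{\mathrm{cmp}}(\tau) = \min_{f}\; E_{\mathrm{cmp}}(f)
\;\; \text{s.t.} \;\; T_{\mathrm{cmp}}(f) \le \tau,
\qquad
E^{\star}_{\mathrm{com}}(\tau) = \min_{p}\; E_{\mathrm{com}}(p)
\;\; \text{s.t.} \;\; T_{\mathrm{com}}(p) \le T_{\max} - \tau

% The joint optimum is recovered only by also optimising the split:
E^{\star} = \min_{0 \le \tau \le T_{\max}}
\bigl[ E^{\star}_{\mathrm{cmp}}(\tau) + E^{\star}_{\mathrm{com}}(\tau) \bigr]
```

In this generic form, additive separability of the objective is not sufficient on its own: solving the two sub-problems for a heuristically fixed τ yields an upper bound on E*, and the gap depends on how the latency budget is divided.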

What would settle it

A measurement campaign in an IIoT testbed in which the proposed scheme fails to deliver at least 2x lower energy consumption than unaggregated cooperation while keeping the same delay and outage targets would falsify the central performance claims.

Figures

Figures reproduced from arXiv: 2512.09827 by Gilberto Berardinelli, Hamid Reza Hashempour, Hien Quoc Ngo, Jie Zhang, Mostafa Nozari, Shashi Raj Pandey, Yanjiao Li.

Figure 1
Figure 1: System model for FL with multiple SNs in an IIoT network.
… it adopts the relay path over h^r_{n,k}, where R_k decodes, aggregates, and then transmits the result to the ES over h^a_k. Upon receiving both direct and relayed updates, the ES applies a global aggregation (e.g., FedAvg) to produce an updated global model w, which it then broadcasts back to all SNs. This scheme allows FL to iteratively refine local …
Figure 2
Figure 2: Proposed implementation framework for an FL algorithm using the cooperative TDMA protocol.
… they are assumed to remain quasi-static within each FL training round. Thus, the channel training procedure can be performed periodically with a relaxed update frequency to accommodate practical deployment conditions. The ES scheduler utilizes the complete CSI from all SN links to optimize communication parameters. S…
Figure 3
Figure 3: Training loss convergence of FL for different methods over 500 rounds.
Figure 4
Figure 4: Test accuracy of FL for different methods over 500 rounds.
… Algorithm 1. 2) Alg. 1, Fixed th: Algorithm 1 with a fixed th instead of searching for the optimal th. The average channel gain in each round is used as the threshold. 3) Only 2-hop, Fixed th: Assumes all SNs use 2-hop transmission, with relays selected based on the average channel gain. 4) Only 1-hop: Direct single-hop transmission. 5) Random Sel…
Figure 5
Figure 5: Outage probability versus the maximum transmit power per SN for different methods with |B| = 1 kbit.
Figure 6
Figure 6: Empirical CDF of the number of SNs participating in each round of FL for different values of Pmax using the proposed algorithm.
To examine Algorithm 2, we plot the empirical cumulative distribution function (CDF) of the number of SNs participating in each round of FL for different values of Pmax in …
Figure 8
Figure 8: Comparison of average transmission energy for 50 SNs, each transmitting |B| = 5 kbit per FL round, as a function of the latency threshold under different schemes.
… all latency thresholds, the proposed scheme consistently outperforms the baselines, achieving up to six-fold energy reduction compared with the 1-Hop scheme and at least a two-fold improvement over the 2-Hop-wo-PA case. While the results are show…
Figure 10
Figure 10: Comparison of total transmission energy and average computation energy per SN as a function of the number of SNs.
… challenging environments with non-line-of-sight (NLOS) conditions and larger packet sizes, the advantages of our approach become even more pronounced, underscoring its effectiveness in energy-efficient FL.
Figure 11
Figure 11: CDF of uplink transmit energy for 50 SNs with a data size of |B| = 5 kbit per FL round under different CSI conditions.
… reduced to roughly 10 µJ, which is negligible compared to the Lp = 1 case. These results demonstrate that the proposed approach maintains strong energy performance even under practical channel-estimation uncertainty. In this paper, we proposed an EE-FL framework where SNs …
Original abstract

This paper presents an energy-efficient transmission framework for federated learning (FL) in industrial Internet of Things (IIoT) environments with strict latency and energy constraints. Machinery subnetworks (SNs) collaboratively train a global model by uploading local updates to an edge server (ES), either directly or via neighboring SNs acting as decode-and-forward relays. To enhance communication efficiency, relays perform partial aggregation before forwarding the models to the ES, significantly reducing overhead and training latency. We analyze the convergence behavior of this relay-assisted FL scheme. To address the inherent energy efficiency (EE) challenges, we decompose the original non-convex optimization problem into sub-problems addressing computation and communication energy separately. An SN grouping algorithm categorizes devices into single-hop and two-hop transmitters based on latency minimization, followed by a relay selection mechanism. To improve FL reliability, we further maximize the number of SNs that meet the roundwise delay constraint, promoting broader participation and improved convergence stability under practical IIoT data distributions. Transmit power levels are then optimized to maximize EE, and a sequential parametric convex approximation (SPCA) method is proposed for joint configuration of system parameters. We further extend the EE formulation to the imperfect channel state information (ICSI). Simulation results demonstrate that the proposed framework significantly enhances convergence speed, reduces outage probability from 10^-2 in single-hop to 10^-6, and achieves substantial energy savings, with the SPCA approach reducing energy consumption by at least 2x compared to unaggregated cooperation and up to 6x over single-hop transmission.
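The grouping step described in the abstract can be sketched as a per-SN latency comparison between the direct link and the best two-hop path. This is a simplified illustration, not the paper's algorithm: it assumes equal transmit power on every hop, a Shannon rate with placeholder bandwidth and noise values, sequential hops, and no change in payload size from partial aggregation. All names and numbers are hypothetical.

```python
import math

def rate(power, gain, bandwidth=1e6, noise=4e-15):
    """Shannon rate in bit/s; bandwidth and noise power are placeholder values."""
    return bandwidth * math.log2(1.0 + power * gain / noise)

def group_by_latency(bits, power, direct_gain, relay_gains, relay_to_es_gain):
    """Assign each SN to the path with the lowest upload latency.

    direct_gain[n]      : gain of SN n -> ES
    relay_gains[n][k]   : gain of SN n -> candidate relay k
    relay_to_es_gain[k] : gain of relay k -> ES
    Returns one ("1-hop", None) or ("2-hop", k) entry per SN.
    """
    plan = []
    for n, g_direct in enumerate(direct_gain):
        t_direct = bits / rate(power, g_direct)
        best_k, t_best = None, t_direct
        for k, g_rk in enumerate(relay_to_es_gain):
            t_two_hop = (bits / rate(power, relay_gains[n][k])
                         + bits / rate(power, g_rk))
            if t_two_hop < t_best:
                best_k, t_best = k, t_two_hop
        plan.append(("1-hop", None) if best_k is None else ("2-hop", best_k))
    return plan

plan = group_by_latency(
    bits=5_000, power=0.1,
    direct_gain=[1e-9, 1e-13],       # SN 1 has a weak direct link
    relay_gains=[[1e-10], [1e-9]],
    relay_to_es_gain=[1e-9],
)
```

Here SN 0 keeps its strong direct link while SN 1 is routed through relay 0. The figure excerpts suggest the paper's Algorithm 1 instead searches over a channel-gain threshold, which this sketch does not attempt to reproduce.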

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a relay-assisted federated learning framework for IIoT networks in which machinery subnetworks (SNs) upload local updates to an edge server either directly or via decode-and-forward relays that perform partial aggregation. The non-convex energy-efficiency optimization is decomposed into separate computation and communication sub-problems; an SN grouping algorithm based on latency minimization and a relay-selection step are introduced, followed by SPCA-based power optimization. An extension to imperfect CSI is also presented. Simulations are claimed to demonstrate faster convergence, outage reduction from 10^{-2} to 10^{-6}, and energy savings of 2x–6x relative to unaggregated and single-hop baselines.

Significance. If the decomposition and SPCA approximations can be shown to be near-optimal, the partial-aggregation relay scheme would constitute a practically relevant contribution for latency- and energy-constrained IIoT FL deployments. The reported simulation gains, if reproducible, would be noteworthy for the field.

major comments (3)
  1. [Abstract] Abstract (optimization decomposition paragraph): the central claim of 2x–6x energy savings rests on the decomposition of the joint non-convex EE problem into independent computation and communication sub-problems whose solutions are recombined; no proof of near-optimality, bound on the optimality gap, or comparison against the joint formulation is supplied.
  2. [Abstract] Abstract (SPCA paragraph): the SPCA method is asserted to solve the joint configuration, yet no a-posteriori error bound relative to the original non-convex problem or convergence guarantee for the approximated EE objective is provided; this directly underpins the reported performance numbers.
  3. [Abstract] Abstract (SN grouping and relay selection): grouping and relay choice are performed exclusively on latency minimization; because the EE objective is not used in this ordering, the claimed energy and outage improvements do not necessarily follow, and no analysis quantifies the resulting sub-optimality.
minor comments (2)
  1. The convergence analysis mentioned in the abstract is not summarized with explicit assumptions or rate expressions; adding a brief statement would improve readability.
  2. Simulation parameters (channel models, device counts, exact baseline implementations) should be stated more explicitly when the numerical results are presented.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments highlighting the need for stronger justification of the optimization claims. We respond point-by-point below.

  1. Referee: [Abstract] Abstract (optimization decomposition paragraph): the central claim of 2x–6x energy savings rests on the decomposition of the joint non-convex EE problem into independent computation and communication sub-problems whose solutions are recombined; no proof of near-optimality, bound on the optimality gap, or comparison against the joint formulation is supplied.

    Authors: The decomposition exploits the additive separability of computation and communication energy terms, permitting independent sub-problem solutions that are recombined. We acknowledge that no formal optimality-gap bound or comparison to a joint formulation is supplied. In revision we will add numerical comparisons against a joint benchmark (where computationally feasible) to quantify the gap and support the reported savings. revision: partial

  2. Referee: [Abstract] Abstract (SPCA paragraph): the SPCA method is asserted to solve the joint configuration, yet no a-posteriori error bound relative to the original non-convex problem or convergence guarantee for the approximated EE objective is provided; this directly underpins the reported performance numbers.

    Authors: SPCA is applied to the non-convex power-allocation sub-problem after grouping. We agree that explicit a-posteriori error bounds and EE-specific convergence guarantees are absent. The revision will include a discussion of standard SPCA convergence results together with numerical checks on approximation accuracy relative to the original objective. revision: partial

  3. Referee: [Abstract] Abstract (SN grouping and relay selection): grouping and relay choice are performed exclusively on latency minimization; because the EE objective is not used in this ordering, the claimed energy and outage improvements do not necessarily follow, and no analysis quantifies the resulting sub-optimality.

    Authors: Latency-based grouping is chosen to maximize the number of SNs meeting the round-wise delay constraint, which directly supports FL convergence stability. Subsequent power optimization then maximizes EE under the fixed grouping. We recognize that this ordering does not jointly optimize EE and that sub-optimality is not quantified. The revision will add analysis or simulations comparing latency-based versus EE-aware grouping to bound the resulting performance difference. revision: partial
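The convergence behavior that the second response refers to can be illustrated on a toy problem. The sketch below is a generic successive convex approximation (the convex-concave procedure) on a one-dimensional non-convex function, not the paper's SPCA formulation or its energy-efficiency objective. At each step the subtracted convex term is linearized, which gives a convex surrogate that upper-bounds the objective and is tight at the current point, so the objective cannot increase.

```python
def objective(p):
    """Non-convex toy objective: convex p**4 minus convex 2*p**2."""
    return p**4 - 2.0 * p**2

def sca_step(p_k):
    """Minimise the surrogate p**4 - 4*p_k*p (2*p**2 linearised at p_k).

    Setting the derivative 4*p**3 - 4*p_k to zero gives p = p_k ** (1/3).
    """
    return p_k ** (1.0 / 3.0)  # valid for p_k > 0

p = 0.5                      # arbitrary positive starting point
history = [objective(p)]
for _ in range(30):
    p = sca_step(p)
    history.append(objective(p))

# The objective should never increase from one iterate to the next.
monotone = all(b <= a + 1e-12 for a, b in zip(history, history[1:]))
```

The iterates approach the stationary point p = 1. A guarantee of this kind establishes monotone descent to a stationary point only; it does not bound the distance to the global optimum, which is the gap the referee's comment targets.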

Circularity Check

0 steps flagged

No circularity detected in derivation chain

full rationale

The provided abstract and description outline a standard non-convex optimization decomposition into computation and communication subproblems, followed by latency-based grouping, relay selection, and SPCA approximation for EE maximization. No equations, fitted parameters renamed as predictions, or self-citation chains are quoted that would reduce the claimed energy savings or outage reductions to inputs by construction. The approach relies on external optimization techniques and simulations without load-bearing self-referential definitions or uniqueness theorems imported from the authors' prior work. The derivation chain is therefore self-contained against the stated benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

Limited to abstract; paper appears to rely on standard assumptions in wireless optimization and FL convergence analysis. No explicit free parameters or invented entities mentioned beyond typical system variables such as power levels and grouping thresholds.

free parameters (2)
  • Transmit power levels
    Optimized via SPCA to maximize EE under delay constraints
  • Grouping and relay selection thresholds
    Chosen based on latency minimization criteria
axioms (2)
  • domain assumption: Convergence behavior of the relay-assisted FL scheme can be analyzed under the stated network conditions
    Abstract states analysis of convergence behavior
  • ad hoc to paper: Decomposition of the non-convex optimization into sub-problems addressing computation and communication energy separately preserves solution quality
    Used to address EE challenges

pith-pipeline@v0.9.0 · 5840 in / 1551 out tokens · 33186 ms · 2026-05-25T07:35:15.576871+00:00 · methodology

