pith. machine review for the scientific record.

arxiv: 2601.15127 · v3 · submitted 2026-01-21 · 💻 cs.LG · cs.CV · cs.DC

Recognition: 2 Lean theorem links

DeepFedNAS: Efficient Hardware-Aware Architecture Adaptation for Heterogeneous IoT Federations via Pareto-Guided Supernet Training

Authors on Pith: no claims yet

Pith reviewed 2026-05-16 12:01 UTC · model grok-4.3

classification 💻 cs.LG · cs.CV · cs.DC
keywords Federated Learning · Neural Architecture Search · IoT Devices · Heterogeneous Hardware · Supernet Training · Pareto Optimization · Zero-Cost Proxy

The pith

A multi-objective fitness score lets federated supernet training and subnet search run without predictors or long validation loops.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to tailor neural networks to different IoT hardware classes inside a single federated training round. It first builds a supernet by sampling only from a pre-ranked cache of high-fitness architectures instead of random choices. It then uses the same fitness score, built from network metrics and hardware rules, as a direct stand-in for accuracy, so that suitable subnets for each device class can be picked in seconds. The reported results are higher final accuracy, smaller models sent each round, and stable behavior even when data across devices is highly skewed.
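The guided sampling in the training phase can be caricatured in a few lines: instead of drawing a random subnet per client each round, the server draws only from the pre-ranked elite cache. A rough sketch with made-up names, not the paper's API:

```python
import random

def assign_subnets(cache, client_ids, seed=0):
    """Map each client to an elite subnet for this round.

    cache is assumed to be pre-ranked by fitness; the point is that every
    draw comes from the elite set rather than the full search space.
    """
    rng = random.Random(seed)
    return {cid: rng.choice(cache) for cid in client_ids}

elite_cache = [f"subnet_{i:02d}" for i in range(60)]  # e.g. ~60 cached subnets
assignments = assign_subnets(elite_cache, ["client_a", "client_b", "client_c"])
print(assignments)
```

Everything else about the federated round (local training, aggregation) is unchanged; only the sampling distribution moves from uniform to cache-restricted.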

Core claim

DeepFedNAS shows that a multi-objective fitness function combining information-theoretic metrics with architectural heuristics can both steer supernet training toward better weight sharing and serve as a zero-cost proxy that ranks candidate subnets correctly, removing the need for separate accuracy predictors or GPU-heavy post-search validation while still producing hardware-aware models that meet per-device constraints.

What carries the argument

The multi-objective fitness function that merges information-theoretic network metrics with hardware heuristics, applied first to select elite architectures for supernet training and second as an immediate ranking score for subnet selection.
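As a concrete caricature of such a score, here is a weighted combination of an expressivity proxy, a compute-efficiency term, and a hardware-budget penalty. The metric names, weights, and budget below are hypothetical illustrations, not the paper's actual F(A):

```python
# Hypothetical multi-objective fitness sketch: weighted sum of an
# information-theoretic proxy, an efficiency term, and a hardware penalty.
def fitness(arch, w_expr=0.5, w_eff=0.3, w_hw=0.2, param_budget=5.0):
    """Score an architecture dict without training it.

    arch: {'expressivity': float, 'macs': float (GMACs), 'params': float (M)}
    Higher is better; the hardware term penalizes exceeding a parameter budget.
    """
    expressivity = arch["expressivity"]              # information-theoretic proxy
    efficiency = 1.0 / (1.0 + arch["macs"])          # cheaper compute scores higher
    hw_penalty = max(0.0, arch["params"] - param_budget)  # heuristic constraint
    return w_expr * expressivity + w_eff * efficiency - w_hw * hw_penalty

small = {"expressivity": 0.6, "macs": 0.4, "params": 2.1}
large = {"expressivity": 0.9, "macs": 3.2, "params": 7.5}
print(fitness(small), fitness(large))  # the over-budget net scores lower
```

Note that the relative weights are exactly the free parameters flagged in the ledger below: nothing in the score itself pins them down.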

If this is right

  • Subnets suited to each hardware class appear in roughly 20 seconds instead of over 20 GPU-hours.
  • Models sent each federated round shrink by a factor of 2.8, while accuracy rises by up to 1.21 percentage points on CIFAR-100.
  • Performance holds when data distributions across devices are extremely unbalanced.
  • The same training pipeline works across CIFAR-10, CIFAR-100, and CINIC-10 without separate per-target retraining.
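If the proxy holds, the per-device selection step reduces to a constrained argmax over the cached candidates. A minimal sketch, with illustrative field names and budgets rather than the paper's interface:

```python
# Predictor-free search sketch: pick the highest-fitness cached subnet
# that fits a device class's budget. Field names and limits are illustrative.
def select_subnet(cache, max_params_m, max_macs_g):
    """Return the best cached subnet meeting the device's budget, or None."""
    feasible = [s for s in cache
                if s["params"] <= max_params_m and s["macs"] <= max_macs_g]
    return max(feasible, key=lambda s: s["fitness"], default=None)

cache = [
    {"name": "tiny",  "params": 1.2, "macs": 0.2, "fitness": 0.41},
    {"name": "small", "params": 2.8, "macs": 0.6, "fitness": 0.55},
    {"name": "big",   "params": 9.0, "macs": 2.4, "fitness": 0.78},
]
# A constrained device class gets "small"; a roomier one gets "big".
print(select_subnet(cache, max_params_m=3.0, max_macs_g=1.0)["name"])   # small
print(select_subnet(cache, max_params_m=16.0, max_macs_g=4.0)["name"])  # big
```

This is why the search costs seconds rather than GPU-hours: no candidate is ever trained or validated during selection.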

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could support on-device architecture updates when new hardware joins the federation mid-deployment.
  • Replacing the current fitness components with direct energy or latency measurements would tighten the hardware match for battery-limited sensors.
  • The zero-cost ranking step might transfer to other federated domains such as time-series forecasting on industrial equipment.
  • If the fitness proxy continues to hold, deployment pipelines for heterogeneous fleets could move from cloud GPU clusters to edge servers.

Load-bearing premise

The fitness score ranks which subnet will actually perform best on a given device without ever training or validating that subnet.

What would settle it

Train the subnets returned by the 20-second predictor-free search on the target devices and measure whether their final test accuracies preserve the same ordering that the fitness scores predicted.
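That ordering check amounts to a rank-correlation computation between fitness scores and trained accuracies. A minimal pure-Python Spearman sketch on made-up numbers (not the paper's data), assuming no ties:

```python
# Spearman rank correlation between zero-cost fitness and trained accuracy.
def spearman(xs, ys):
    """Spearman rank correlation, assuming no ties in either list."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

fitness_scores = [0.41, 0.55, 0.62, 0.78, 0.83]
trained_acc    = [61.2, 64.0, 65.1, 67.9, 68.3]  # hypothetical CIFAR-100 %
print(spearman(fitness_scores, trained_acc))     # 1.0: ordering preserved
```

A coefficient near 1 across many subnets and non-IID settings would settle the load-bearing premise; a weak or unstable one would undercut both phases at once.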

Figures

Figures reproduced from arXiv: 2601.15127 by Bostan Khan, Masoud Daneshtalab.

Figure 1: DeepFedNAS Pipeline. This diagram illustrates the three core phases of our framework. First, the Offline Pareto Optimal Subnet Search generates a “Pareto Path Subnet Cache” of high-fitness architectures (e.g., 60 subnets in ∼20 minutes) using a multi-objective fitness function F(A). Second, the Federated Supernet Training leverages this cache for “Pareto Path Guided Subnet Sampling,” where clients are assi…

Figure 2: Comparison of architectures discovered by our …

Figure 3: Accuracy vs. Parameters (Millions). Pareto frontiers for test accuracy percentage against model parameters (Millions) on (a) CIFAR-100 and (b) CINIC-10. DeepFedNAS subnets are blue squares; SuperFedNAS subnets are red circles. Numerical annotations denote MACs in billions.

Figure 4: Test Accuracy vs. True Latency (CPU and GPU).

Figure 5: Analysis of cache size effects: (a) average fitness …

Figure 6: Correlation analysis validating the predictor-free …
Original abstract

Deploying federated learning across heterogeneous IoT device fleets requires tailored neural network architectures for each device class, yet existing Federated Neural Architecture Search (FedNAS) methods suffer from unguided supernet training and prohibitively costly post-training search pipelines that demand over 20 GPU-hours per deployment target. We introduce DeepFedNAS, a two-phase framework built on a multi-objective fitness function that synthesizes information-theoretic network metrics with architectural heuristics. In the first phase, Federated Pareto Optimal Supernet Training replaces random subnet sampling with a pre-computed cache of elite, high-fitness architectures, yielding a superior supernet. In the second phase, a Predictor-Free Search uses this fitness function as a zero-cost accuracy proxy, discovering hardware-optimized subnets in ~20 seconds, a ~61x speedup over the baseline pipeline. Experiments on CIFAR-10, CIFAR-100, and CINIC-10 demonstrate state-of-the-art accuracy (up to +1.21% on CIFAR-100), a 2.8x reduction in per-round transmission size, and robust performance under extreme non-IID conditions (α = 0.1), making DeepFedNAS practical for scalable, communication-constrained IoT federations. Source code: https://github.com/bostankhan6/DeepFedNAS

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper introduces DeepFedNAS, a two-phase framework for Federated Neural Architecture Search in heterogeneous IoT settings. Phase 1 performs Federated Pareto Optimal Supernet Training by caching elite subnets selected via a multi-objective fitness function that combines information-theoretic metrics with architectural heuristics. Phase 2 uses the same fitness function as a zero-cost proxy for predictor-free search to discover hardware-optimized subnets. Experiments on CIFAR-10, CIFAR-100, and CINIC-10 report up to +1.21% accuracy gains, 2.8x reduction in per-round transmission size, robustness under extreme non-IID partitions (α=0.1), and ~61x search speedup (~20 seconds per target).

Significance. If the fitness function reliably ranks subnets without training, the work would advance practical FedNAS for communication-constrained IoT federations by enabling fast, hardware-aware adaptation with reduced overhead.

major comments (1)
  1. [Abstract / Phase 1 and Phase 2 descriptions] The central claims of SOTA accuracy, transmission reduction, and 61x speedup all rest on the multi-objective fitness function serving as a reliable zero-cost accuracy proxy that correctly ranks subnets. No evidence is provided of its rank correlation with trained accuracy, ablation against actual validation performance, or sensitivity analysis under non-IID conditions (α=0.1) on CIFAR-100/CINIC-10; this is load-bearing for the reported gains in both phases.
minor comments (1)
  1. [Experiments] The abstract reports concrete accuracy and speedup numbers but provides no details on baselines, statistical significance, error bars, or exact data splits used.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. The primary concern regarding validation of the multi-objective fitness function as a zero-cost proxy is well-taken, and we address it directly below with a commitment to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Abstract / Phase 1 and Phase 2 descriptions] The central claims of SOTA accuracy, transmission reduction, and 61x speedup all rest on the multi-objective fitness function serving as a reliable zero-cost accuracy proxy that correctly ranks subnets. No evidence is provided of its rank correlation with trained accuracy, ablation against actual validation performance, or sensitivity analysis under non-IID conditions (α=0.1) on CIFAR-100/CINIC-10; this is load-bearing for the reported gains in both phases.

    Authors: We agree that explicit validation of the fitness function's ranking reliability is essential to support the central claims. The current manuscript demonstrates end-to-end gains from using the function in both phases but does not report rank correlations, ablations against trained validation accuracy, or targeted sensitivity analysis under extreme non-IID partitions. In the revised version we will add: (1) Spearman rank correlation coefficients between fitness scores and post-training accuracies for a large sample of subnets across CIFAR-10, CIFAR-100, and CINIC-10; (2) an ablation comparing predictor-free search guided by the fitness function versus random sampling and versus an oracle using actual validation accuracy; and (3) sensitivity plots showing correlation stability under non-IID degrees including α=0.1. These results will appear in a new subsection of Section 4 (Experiments) or as Appendix C. The core methodology and reported performance numbers remain unchanged; the additions will only strengthen the empirical grounding of the proxy.

    revision: yes

Circularity Check

0 steps flagged

Fitness function defined independently; no reduction to inputs by construction

full rationale

The multi-objective fitness function is introduced as an independent construct from information-theoretic metrics plus architectural heuristics and is used only to guide elite caching and zero-cost ranking during search. Reported SOTA accuracy, 2.8x transmission reduction, and 61x speedup are obtained from actual federated training runs on CIFAR-10/100 and CINIC-10 under non-IID partitions, not from the proxy values themselves. No equations equate final performance to the fitness scores by definition, no self-citation chains carry the central claims, and the derivation remains self-contained against external dataset benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claims rest on the effectiveness of a custom multi-objective fitness function and the assumption that Pareto selection of elite architectures produces a superior supernet; these are introduced without independent external validation in the abstract.

free parameters (1)
  • weights in multi-objective fitness function
    The synthesis of information-theoretic metrics and architectural heuristics requires relative weighting that is not specified as fixed or derived from first principles.
axioms (2)
  • domain assumption Pre-computed cache of elite high-fitness architectures yields a superior supernet compared with random sampling
    Invoked in the first phase description as the basis for improved supernet quality.
  • domain assumption The fitness function correlates strongly enough with true accuracy to serve as a zero-cost proxy
    Central premise enabling the ~20-second search phase without training.

pith-pipeline@v0.9.0 · 5552 in / 1522 out tokens · 40967 ms · 2026-05-16T12:01:49.962101+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages · 4 internal anchors
