pith. machine review for the scientific record.

arxiv: 2605.09813 · v1 · submitted 2026-05-10 · 💻 cs.NI · cs.DC · cs.LG · cs.SY · eess.SY

Recognition: 2 theorem links


Optimizing Server Placement for Vertical Federated Learning in Dynamic Edge/Fog Networks

H. Vincent Poor, Mung Chiang, Su Wang


Pith reviewed 2026-05-12 02:38 UTC · model grok-4.3

classification 💻 cs.NI · cs.DC · cs.LG · cs.SY · eess.SY
keywords vertical federated learning · dynamic edge networks · server placement · resource optimization · mixed-integer signomial program · edge computing · federated learning · machine learning

The pith

Server-controlled vertical federated learning establishes stationary points each round to jointly optimize placement, power, frequency, and iterations in dynamic edge networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops SC-DN for vertical federated learning in networks where devices hold separate data features that can permanently enter or exit. It proves a global first-order stationary point exists for every training round and uses the result to optimize server placement, device transmit power, processor frequency, and local iterations together. This joint control reduces resource use while preserving model quality. A reader would care because heterogeneous and changing edge devices make standard vertical federated learning costly, and the method claims measurable gains in performance and efficiency on image and multi-modal data over greedy baselines.

Core claim

In dynamic edge/fog networks with heterogeneous data features, the SC-DN methodology establishes the existence of a global first-order stationary point for every global round of vertical federated learning, then formulates a joint optimization over server placement, device-to-server transmit power, local device processor frequency, and local training iterations per round as a mixed-integer signomial program, for which a general solver is developed, yielding superior classification and regression performance with lower resource consumption than greedy approaches.
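The full program is not reproduced on this page, but the machinery it leans on (geometric/signomial programming; the paper cites CVXPY and disciplined geometric programming) is easy to sketch. Below is a minimal, hypothetical CVXPY example of the continuous flavor of such a subproblem, with placement and iteration counts held fixed. Every constant and the monomial rate stand-in are invented here; this is not the paper's solver.

```python
# A minimal sketch (not SC-DN's solver): with server placement fixed, a
# power/frequency subproblem with monomial/posynomial energy and deadline
# terms can be posed as a disciplined geometric program in CVXPY.
import cvxpy as cp

N = 3                                    # toy number of devices (assumed)
p = cp.Variable(N, pos=True)             # device transmit power (W)
f = cp.Variable(N, pos=True)             # processor frequency (GHz)

kappa = 0.1                              # effective switched capacitance (assumed)
c = [2.0, 3.0, 1.5]                      # compute load per round, per device (assumed)
b = [1.0, 0.8, 1.2]                      # upload load per round, per device (assumed)

# Computation energy ~ kappa * c_n * f_n^2; communication energy uses a
# monomial stand-in b_n * p_n^0.5 for the true log-based rate terms, which
# the paper instead bounds to stay within signomial form.
energy = sum(kappa * c[n] * f[n] ** 2 + b[n] * p[n] ** 0.5 for n in range(N))

constraints = [c[n] / f[n] + b[n] / p[n] <= 5.0 for n in range(N)]  # deadline
constraints += [p <= 2.0, f <= 3.0]      # hardware caps (assumed)

prob = cp.Problem(cp.Minimize(energy), constraints)
prob.solve(gp=True)                      # log-log convex (DGP) mode
print("power:", p.value, "frequency:", f.value)
```

The integer placement variables are what push the full problem out of this tractable class and into the NP-hard mixed-integer signomial territory the paper addresses.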

What carries the argument

The existence proof for a global first-order stationary point per round, which enables the mixed-integer signomial program that couples server placement with transmit power, processor frequency, and local iterations for joint optimization.

Load-bearing premise

The dynamic network model of permanent feature entry and exit, together with the guaranteed global first-order stationary point each round, continues to hold under realistic hardware heterogeneity and channel variations.

What would settle it

A controlled experiment in which devices change features unpredictably while the SC-DN solver runs, followed by a check of whether the stationary-point condition fails or the reported performance and resource gains disappear relative to baselines.
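A minimal simulation of that protocol, assuming a stand-in quadratic loss and an invented churn model (none of this is the paper's setup), might look like:

```python
# Hedged sketch of the falsification experiment: features permanently enter
# or exit between rounds, and each round we test a first-order stationarity
# proxy (per-round gradient norm below tolerance) on a toy quadratic loss.
import numpy as np

rng = np.random.default_rng(0)
features = set(range(10))                # active feature indices (assumed)
tol = 1e-6

for rnd in range(20):
    # Permanent entry/exit between rounds (invented churn probabilities).
    if rng.random() < 0.3 and len(features) > 2:
        features.discard(int(rng.choice(sorted(features))))
    if rng.random() < 0.2:
        features.add(max(features) + 1)

    d = len(features)
    A = rng.normal(size=(50, d))         # this round's feature matrix
    y = rng.normal(size=50)

    # Per-round model: dimension is rebuilt for the current feature set.
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)

    grad = A.T @ (A @ theta - y) / 50    # gradient of 0.5*||A@theta - y||^2 / n
    print(f"round {rnd:2d}: dim={d:2d}, "
          f"||grad||={np.linalg.norm(grad):.2e}, "
          f"stationary={np.linalg.norm(grad) < tol}")
```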

Figures

Figures reproduced from arXiv: 2605.09813 by H. Vincent Poor, Mung Chiang, Su Wang.

Figure 8
Figure 8: Figs. 8(a) and 8(b) primarily focus on how the scaling of ψ_G and ψ_P influences the ML convergence aspects of (P_r), while Figs. 8(c) and 8(d) investigate the sensitivity of device-to-server transmission and server movement energies to changes …
Figure 31
Figure 31: Wall-clock runtime of the per-round optimization solver (i.e., Algorithm 3) as a function of the number of network devices, N. Points denote the mean ± one standard deviation over 10 independent runs; the dashed line shows a linear fit of 2.47N − 1.80. (Axes: Network Devices (N) vs. Average Total Runtime (s).)
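The dashed line in that figure is an ordinary linear fit of runtime against N; as a sketch, numpy.polyfit recovers such a slope and intercept from measured means (the runtime values below are invented placeholders, not the paper's data):

```python
# Reproducing the shape of the figure's linear runtime fit with numpy.
# The measured means below are illustrative placeholders only.
import numpy as np

N = np.array([10, 20, 30, 40, 50])                      # device counts
runtime_s = np.array([23.1, 47.9, 72.4, 97.0, 121.8])   # assumed mean runtimes

slope, intercept = np.polyfit(N, runtime_s, deg=1)
print(f"linear fit: {slope:.2f}*N {intercept:+.2f}")    # cf. 2.47N - 1.80
```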
Original abstract

We investigate the control and optimization of vertical federated learning (VFL), a class of distributed machine learning (ML) methods in which edge/fog devices contain separate data features, in dynamic edge/fog networks. Owing to heterogeneous data features and hardware across edge/fog networks, devices' contributions to VFL vary substantially, and, moreover, dynamic edge/fog networks can lead to the permanent exit or entry of select data features. In this setting, our proposed methodology, server controlled VFL in dynamic networks (SC-DN), first establishes the existence of a global first-order stationary point for every global round, and then leverages this result to jointly optimize ML model training and resource consumption based on four key control variables: (i) server placement, (ii) device-to-server transmit power, (iii) local device processor frequency, and (iv) local training iterations per global round. The resulting optimization formulation contains coupled variables as well as numerous forms of logarithmic constraints which we show is a mixed-integer signomial program, an NP-hard problem, and for which we develop a general solver. Finally, via experiments on both image and multi-modal datasets, we show that our methodology demonstrates superior classification/regression performance and resource consumption savings than even greedy methodologies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it: the pith above is the substance; this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes SC-DN for vertical federated learning in dynamic edge/fog networks with heterogeneous devices and permanent feature entry/exit. It claims to establish existence of a global first-order stationary point for every global round, then jointly optimizes server placement, device-to-server transmit power, local processor frequency, and local training iterations per round. The resulting problem is cast as a mixed-integer signomial program (NP-hard) for which a general solver is developed. Experiments on image and multi-modal datasets report superior classification/regression accuracy and resource savings versus greedy baselines.

Significance. If the stationary-point result is valid under feature dynamics and the solver yields reliable solutions, the work could enable more practical and efficient VFL deployments in resource-constrained, time-varying edge environments by explicitly trading off model performance against communication and computation costs.

major comments (1)
  1. [Theoretical Analysis section] The existence proof for a global first-order stationary point per round must explicitly accommodate permanent feature entry/exit, which alters the loss landscape, gradient structure, input dimension, and any Lipschitz/smoothness constants. If the argument treats the feature set or model dimension as fixed across rounds, the lemma does not carry over to the claimed dynamic setting and the subsequent optimization rests on an invalid premise.
minor comments (2)
  1. [Abstract and Experiments] Dataset sizes, number of runs, and error bars (or statistical significance) for the reported performance and resource gains are not provided, preventing verification of the claimed superiority.
  2. [Experiments section] Comparisons are restricted to greedy methodologies; additional baselines (static placement, other heuristics, or non-optimized VFL) would better substantiate the gains from the joint optimization.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thorough review and constructive comments on our manuscript. We address the major comment regarding the theoretical analysis below, and we believe the concerns can be resolved with clarifications and minor revisions.

Point-by-point responses
  1. Referee: [Theoretical Analysis section] The existence proof for a global first-order stationary point per round must explicitly accommodate permanent feature entry/exit, which alters the loss landscape, gradient structure, input dimension, and any Lipschitz/smoothness constants. If the argument treats the feature set or model dimension as fixed across rounds, the lemma does not carry over to the claimed dynamic setting and the subsequent optimization rests on an invalid premise.

    Authors: The proof establishes the existence of a global first-order stationary point for each global round separately, with the feature set, model dimension, and associated constants (such as Lipschitz and smoothness) fixed within that round. Permanent feature entry/exit occurs between rounds, changing the landscape for the next round, but the per-round analysis holds for the current configuration. The subsequent optimization is performed per round using the current variables. To make this explicit, we will revise the Theoretical Analysis section to include a statement clarifying the per-round independence and the handling of dynamic feature sets. This addresses the concern without invalidating the premise. (revision: partial)
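A sketch of the rebuttal's per-round reading, on an assumed toy least-squares loss (not the paper's model): constants such as the smoothness bound L^(r) are recomputed for whatever feature set is active in round r, so one round's analysis never presumes the next round's dimension.

```python
# Per-round constants under feature churn: the smoothness constant L^(r) of
# a toy least-squares loss is recomputed for the round's active dimension.
import numpy as np

rng = np.random.default_rng(1)

def smoothness_constant(dim: int, samples: int = 100) -> float:
    """L^(r) for 0.5*||A@theta - y||^2 / n is lambda_max(A.T@A)/n."""
    A = rng.normal(size=(samples, dim))
    return float(np.linalg.eigvalsh(A.T @ A / samples)[-1])

for r, dim in enumerate([8, 8, 6, 9, 9]):   # feature count changes between rounds
    print(f"round {r}: dim={dim}, L^(r)={smoothness_constant(dim):.3f}")
```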

Circularity Check

0 steps flagged

No significant circularity detected

Full rationale

The paper claims to first establish existence of a global first-order stationary point per round as an independent lemma, then leverage it for joint optimization of server placement, power, frequency, and iterations via the mixed-integer signomial program. No quoted equations or self-citations reduce the stationary-point result to a definition in terms of the optimized variables, fitted inputs renamed as predictions, or a closed self-citation loop. The dynamic feature entry/exit is modeled explicitly in the problem statement without the proof assuming fixed dimensions that would make the result tautological. The derivation chain is therefore self-contained and does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review prevents enumeration of specific free parameters or axioms; no invented entities are mentioned.

pith-pipeline@v0.9.0 · 5537 in / 1211 out tokens · 57099 ms · 2026-05-12T02:38:00.684298+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear

    Relation between the paper passage and the cited Recognition theorem.

    first establishes the existence of a global first-order stationary point for every global round, and then leverages this result to jointly optimize ML model training and resource consumption based on four key control variables: (i) server placement, (ii) device-to-server transmit power, (iii) local device processor frequency, and (iv) local training iterations per global round

  • IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear

    Relation between the paper passage and the cited Recognition theorem.

Assumption 1 (Smoothness). The gradients of the loss functions ℓ(·) are Lipschitz continuous … $\|\nabla \ell(\Theta^{(r)}_1) - \nabla \ell(\Theta^{(r)}_2)\| \le L^{(r)} \|\Theta^{(r)}_1 - \Theta^{(r)}_2\|$
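For readers who want to probe an assumption of this shape numerically, here is a hedged check on a stand-in logistic loss (illustrative only; the paper's ℓ(·) and constants are its own): sampled gradient-difference ratios should stay below the spectral smoothness bound.

```python
# Numerical probe of a Lipschitz-gradient assumption on a stand-in logistic
# loss: sample parameter pairs, compare the ratio to lambda_max(X.T@X)/(4n).
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
y = rng.integers(0, 2, size=200)

def grad(theta):
    p = 1.0 / (1.0 + np.exp(-X @ theta))   # sigmoid predictions
    return X.T @ (p - y) / len(y)          # gradient of mean logistic loss

ratios = []
for _ in range(1000):
    t1, t2 = rng.normal(size=5), rng.normal(size=5)
    ratios.append(np.linalg.norm(grad(t1) - grad(t2)) / np.linalg.norm(t1 - t2))

bound = np.linalg.eigvalsh(X.T @ X / (4 * len(y)))[-1]
print(f"max sampled ratio: {max(ratios):.4f}  <=  bound: {bound:.4f}")
```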

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

83 extracted references · 83 canonical work pages

  1. [1]

    Communication-efficient learning of deep networks from decentralized data,

    B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. PMLR, 2017, pp. 1273–1282

  2. [2]

    Federated machine learning: Concept and applications,

    Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning: Concept and applications,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 10, no. 2, pp. 1–19, 2019

  3. [3]

    Adaptive federated learning in resource constrained edge computing systems,

    S. Wang, T. Tuor, T. Salonidis, K. K. Leung, C. Makaya, T. He, and K. Chan, “Adaptive federated learning in resource constrained edge computing systems,” IEEE Journal on Selected Areas in Communications, vol. 37, no. 6, pp. 1205–1221, 2019

  4. [4]

    Vertical federated learning: Concepts, advances, and challenges,

    Y. Liu, Y. Kang, T. Zou, Y. Pu, Y. He, X. Ye, Y. Ouyang, Y.-Q. Zhang, and Q. Yang, “Vertical federated learning: Concepts, advances, and challenges,” IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 7, pp. 3615–3634, 2024

  5. [5]

    Fault-tolerant vertical federated learning on dynamic networks,

    S. Ganguli, Z. Zhou, C. G. Brinton, and D. I. Inouye, “Fault-tolerant vertical federated learning on dynamic networks,” arXiv:2312.16638, 2023

  6. [6]

    Flexible vertical federated learning with heterogeneous parties,

    T. Castiglia, S. Wang, and S. Patterson, “Flexible vertical federated learning with heterogeneous parties,” IEEE Transactions on Neural Networks and Learning Systems, 2023, to appear

  7. [7]

    Attribute-distributed learning: Models, limits, and algorithms,

    H. Zheng, S. R. Kulkarni, and H. V. Poor, “Attribute-distributed learning: Models, limits, and algorithms,” IEEE Transactions on Signal Processing, vol. 59, no. 1, pp. 386–398, 2010

  8. [8]

    Toward cooperative federated learning over heterogeneous edge/fog networks,

    S. Wang, S. Hosseinalipour, V. Aggarwal, C. G. Brinton, D. J. Love, W. Su, and M. Chiang, “Toward cooperative federated learning over heterogeneous edge/fog networks,” IEEE Communications Magazine, vol. 61, no. 12, pp. 54–60, 2023

  9. [9]

    Asynchronous multi-model dynamic federated learning over wireless networks: Theory, modeling, and optimization,

    Z.-L. Chang, S. Hosseinalipour, M. Chiang, and C. G. Brinton, “Asynchronous multi-model dynamic federated learning over wireless networks: Theory, modeling, and optimization,” IEEE Transactions on Cognitive Communications and Networking, 2024, to appear

  10. [10]

    Communication-efficient multimodal federated learning: Joint modality and client selection,

    L. Yuan, D.-J. Han, S. Wang, D. Upadhyay, and C. G. Brinton, “Communication-efficient multimodal federated learning: Joint modality and client selection,” arXiv:2401.16685, 2024

  11. [11]

    Adaptive vertical federated learning on unbalanced features,

    J. Zhang, S. Guo, Z. Qu, D. Zeng, H. Wang, Q. Liu, and A. Y. Zomaya, “Adaptive vertical federated learning on unbalanced features,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 12, pp. 4006–4018, 2022

  12. [12]

    A unified solution for privacy and communication efficiency in vertical federated learning,

    G. Wang, B. Gu, Q. Zhang, X. Li, B. Wang, and C. X. Ling, “A unified solution for privacy and communication efficiency in vertical federated learning,” Advances in Neural Information Processing Systems, vol. 36, 2024

  13. [13]

    Compressed-VFL: Communication-efficient learning with vertically partitioned data,

    T. J. Castiglia, A. Das, S. Wang, and S. Patterson, “Compressed-VFL: Communication-efficient learning with vertically partitioned data,” in Proceedings of the 39th International Conference on Machine Learning. PMLR, 2022, pp. 2738–2766

  14. [14]

    6G internet of things: A comprehensive survey,

    D. C. Nguyen, M. Ding, P. N. Pathirana, A. Seneviratne, J. Li, D. Niyato, O. Dobre, and H. V. Poor, “6G internet of things: A comprehensive survey,” IEEE Internet of Things Journal, vol. 9, no. 1, pp. 359–383, 2021

  15. [15]

    UAV-assisted online machine learning over multi-tiered networks: A hierarchical nested personalized federated learning approach,

    S. Wang, S. Hosseinalipour, M. Gorlatova, C. G. Brinton, and M. Chiang, “UAV-assisted online machine learning over multi-tiered networks: A hierarchical nested personalized federated learning approach,” IEEE Transactions on Network and Service Management, vol. 20, no. 2, pp. 1847–1865, 2022

  16. [16]

    Towards flexible device participation in federated learning,

    Y. Ruan, X. Zhang, S.-C. Liang, and C. Joe-Wong, “Towards flexible device participation in federated learning,” in Proceedings of The 24th International Conference on Artificial Intelligence and Statistics. PMLR, 2021, pp. 3403–3411

  17. [17]

    Forest fire detection system using wireless sensor networks and machine learning,

    U. Dampage, L. Bandaranayake, R. Wanasinghe, K. Kottahachchi, and B. Jayasanka, “Forest fire detection system using wireless sensor networks and machine learning,” Scientific Reports, vol. 12, no. 1, p. 46, 2022

  18. [18]

    Wireless sensing networks for environmental monitoring: Two case studies from tropical forests,

    C. Rankine, M. M. do Espirito Santo, R. Fatland, M. Garcia et al., “Wireless sensing networks for environmental monitoring: Two case studies from tropical forests,” in Proceedings of the Seventh IEEE International Conference on eScience. IEEE, 2011, pp. 70–76

  19. [19]

    Opportunities and challenges of wireless sensor networks in smart grid,

    V. C. Gungor, B. Lu, and G. P. Hancke, “Opportunities and challenges of wireless sensor networks in smart grid,” IEEE Transactions on Industrial Electronics, vol. 57, no. 10, pp. 3557–3564, 2010

  20. [20]

    Toward resilient modern power systems: From single-domain to cross-domain resilience enhancement,

    H. Huang, H. V. Poor, K. R. Davis, T. J. Overbye, A. Layton, A. E. Goulart, and S. Zonouz, “Toward resilient modern power systems: From single-domain to cross-domain resilience enhancement,” Proceedings of the IEEE, vol. 112, no. 4, pp. 365–398, 2024

  21. [21]

    Multi-source to multi-target decentralized federated domain adaptation,

    S. Wang, S. Hosseinalipour, and C. G. Brinton, “Multi-source to multi-target decentralized federated domain adaptation,” IEEE Transactions on Cognitive Communications and Networking, no. 3, pp. 1011–1025, 2024

  22. [22]

    Taming subnet-drift in D2D-enabled fog learning: A hierarchical gradient tracking approach,

    E. Chen, S. Wang, and C. G. Brinton, “Taming subnet-drift in D2D-enabled fog learning: A hierarchical gradient tracking approach,” in Proceedings of the 2024 IEEE Conference on Computer Communications. IEEE, 2024, pp. 2438–2447

  23. [23]

    Efficient coordination of federated learning and inference offloading at the edge: A proactive optimization paradigm,

    K. Luo, K. Zhao, T. Ouyang, X. Zhang, Z. Zhou, H. Wang, and X. Chen, “Efficient coordination of federated learning and inference offloading at the edge: A proactive optimization paradigm,” IEEE Transactions on Mobile Computing, 2024, to appear

  24. [24]

    Federated learning over wireless networks: Optimization model design and analysis,

    N. H. Tran, W. Bao, A. Zomaya, M. N. Nguyen, and C. S. Hong, “Federated learning over wireless networks: Optimization model design and analysis,” in Proceedings of the 2019 IEEE Conference on Computer Communications. IEEE, 2019, pp. 1387–1395

  25. [25]

    Federated learning meets multi-objective optimization,

    Z. Hu, K. Shaloudegi, G. Zhang, and Y. Yu, “Federated learning meets multi-objective optimization,” IEEE Transactions on Network Science and Engineering, vol. 9, no. 4, pp. 2039–2051, 2022

  26. [26]

    Min-max cost optimization for efficient hierarchical federated learning in wireless edge networks,

    J. Feng, L. Liu, Q. Pei, and K. Li, “Min-max cost optimization for efficient hierarchical federated learning in wireless edge networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 11, pp. 2687–2700, 2021

  27. [27]

    Incentives in federated learning: Equilibria, dynamics, and mechanisms for welfare maximization,

    A. Murhekar, Z. Yuan, B. Ray Chaudhury, B. Li, and R. Mehta, “Incentives in federated learning: Equilibria, dynamics, and mechanisms for welfare maximization,” Advances in Neural Information Processing Systems, vol. 36, 2024

  28. [28]

    Network-aware optimization of distributed learning for fog computing,

    S. Wang, Y. Ruan, Y. Tu, S. Wagle, C. G. Brinton, and C. Joe-Wong, “Network-aware optimization of distributed learning for fog computing,” IEEE/ACM Transactions on Networking, vol. 29, no. 5, pp. 2019–2032, 2021

  29. [29]

    Anarchic federated learning,

    H. Yang, X. Zhang, P. Khanduri, and J. Liu, “Anarchic federated learning,” in Proceedings of the 39th International Conference on Machine Learning. PMLR, 2022, pp. 25331–25363

  30. [30]

    Stochastic client selection for federated learning with volatile clients,

    T. Huang, W. Lin, L. Shen, K. Li, and A. Y. Zomaya, “Stochastic client selection for federated learning with volatile clients,” IEEE Internet of Things Journal, vol. 9, no. 20, pp. 20055–20070, 2022

  31. [31]

    Fast federated learning in the presence of arbitrary device unavailability,

    X. Gu, K. Huang, J. Zhang, and L. Huang, “Fast federated learning in the presence of arbitrary device unavailability,” Advances in Neural Information Processing Systems, vol. 34, pp. 12052–12064, 2021

  32. [32]

    Robust federated learning with connectivity failures: A semi-decentralized framework with collaborative relaying,

    M. Yemini, R. Saha, E. Ozfatura, D. Gündüz, and A. J. Goldsmith, “Robust federated learning with connectivity failures: A semi-decentralized framework with collaborative relaying,” arXiv:2202.11850, 2022

  33. [33]

    Communication efficient distributed learning with feature partitioned data,

    B. Zhang, J. Geng, W. Xu, and L. Lai, “Communication efficient distributed learning with feature partitioned data,” in Proceedings of the 52nd Annual Conference on Information Sciences and Systems (CISS). IEEE, 2018, pp. 1–6

  34. [34]

    VF-PS: How to select important participants in vertical federated learning, efficiently and securely?

    J. Jiang, L. Burkhalter, F. Fu, B. Ding, B. Du, A. Hithnawi, B. Li, and C. Zhang, “VF-PS: How to select important participants in vertical federated learning, efficiently and securely?” Advances in Neural Information Processing Systems, vol. 35, pp. 2088–2101, 2022

  35. [35]

    LESS-VFL: Communication-efficient feature selection for vertical federated learning,

    T. Castiglia, Y. Zhou, S. Wang, S. Kadhe, N. Baracaldo, and S. Patterson, “LESS-VFL: Communication-efficient feature selection for vertical federated learning,” in Proceedings of the 40th International Conference on Machine Learning. PMLR, 2023, pp. 3757–3781

  36. [36]

    FedSDG-FS: Efficient and secure feature selection for vertical federated learning,

    A. Li, H. Peng, L. Zhang, J. Huang, Q. Guo, H. Yu, and Y. Liu, “FedSDG-FS: Efficient and secure feature selection for vertical federated learning,” in Proceedings of the 2023 IEEE Conference on Computer Communications. IEEE, 2023, pp. 1–10

  37. [37]

    VAFL: A method of vertical asynchronous federated learning,

    T. Chen, X. Jin, Y. Sun, and W. Yin, “VAFL: A method of vertical asynchronous federated learning,” in Proceedings of the 2020 ICML Workshop on Federated Learning for User Privacy and Data Confidentiality, July 2020

  38. [38]

    Learning multiple layers of features from tiny images,

    A. Krizhevsky, G. Hinton et al., “Learning multiple layers of features from tiny images,” 2009

  39. [39]

    Distributed lifetime optimization in wireless sensor networks,

    J. M. Bahi, M. Haddad, M. Hakem, and H. Kheddouci, “Distributed lifetime optimization in wireless sensor networks,” in Proceedings of the 2011 IEEE International Conference on High Performance Computing and Communications. IEEE, 2011, pp. 432–439

  40. [40]

    Failure data analysis with extended weibull distribution,

    T. Zhang and M. Xie, “Failure data analysis with extended weibull distribution,” Communications in Statistics – Simulation and Computation, vol. 36, no. 3, pp. 579–592, 2007

  41. [41]

    A primer on spatial modeling and analysis in wireless networks,

    J. G. Andrews, R. K. Ganti, M. Haenggi, N. Jindal, and S. Weber, “A primer on spatial modeling and analysis in wireless networks,” IEEE Communications Magazine, vol. 48, no. 11, pp. 156–163, 2010

  42. [42]

    A survey of air-to-ground propagation channel modeling for unmanned aerial vehicles,

    W. Khawaja, I. Guvenc, D. W. Matolak, U.-C. Fiebig, and N. Schneckenburger, “A survey of air-to-ground propagation channel modeling for unmanned aerial vehicles,” IEEE Communications Surveys & Tutorials, vol. 21, no. 3, pp. 2361–2391, 2019

  43. [43]

    UWB air-to-ground propagation channel measurements and modeling using UAVs,

    W. Khawaja, O. Ozdemir, F. Erden, I. Guvenc, and D. W. Matolak, “UWB air-to-ground propagation channel measurements and modeling using UAVs,” in Proceedings of the 2019 IEEE Aerospace Conference. IEEE, 2019, pp. 1–10

  44. [44]

    Cellular UAV-to-X communications: Design and optimization for multi-UAV networks,

    S. Zhang, H. Zhang, B. Di, and L. Song, “Cellular UAV-to-X communications: Design and optimization for multi-UAV networks,” IEEE Transactions on Wireless Communications, vol. 18, no. 2, pp. 1346–1359, 2019

  45. [45]

    Modeling air-to-ground path loss for low altitude platforms in urban environments,

    A. Al-Hourani, S. Kandeepan, and A. Jamalipour, “Modeling air-to-ground path loss for low altitude platforms in urban environments,” in Proceedings of the 2014 IEEE Global Communications Conference. IEEE, 2014, pp. 2898–2904

  46. [46]

    Mobile unmanned aerial vehicles (UAVs) for energy-efficient internet of things communications,

    M. Mozaffari, W. Saad, M. Bennis, and M. Debbah, “Mobile unmanned aerial vehicles (UAVs) for energy-efficient internet of things communications,” IEEE Transactions on Wireless Communications, vol. 16, no. 11, pp. 7574–7589, 2017

  47. [47]

    Parallel coordinate descent methods for big data optimization,

    P. Richtárik and M. Takáč, “Parallel coordinate descent methods for big data optimization,” Mathematical Programming, vol. 156, pp. 433–484, 2016

  48. [48]

    Efficiency of coordinate descent methods on huge-scale optimization problems,

    Y. Nesterov, “Efficiency of coordinate descent methods on huge-scale optimization problems,” SIAM Journal on Optimization, vol. 22, no. 2, pp. 341–362, 2012

  49. [49]

    Tackling the objective inconsistency problem in heterogeneous federated optimization,

    J. Wang, Q. Liu, H. Liang, G. Joshi, and H. V. Poor, “Tackling the objective inconsistency problem in heterogeneous federated optimization,” Advances in Neural Information Processing Systems, vol. 33, pp. 7611–7623, 2020

  50. [50]

    A proximal stochastic gradient method with progressive variance reduction,

    L. Xiao and T. Zhang, “A proximal stochastic gradient method with progressive variance reduction,” SIAM Journal on Optimization, vol. 24, no. 4, pp. 2057–2075, 2014

  51. [51]

    On the importance of initialization and momentum in deep learning,

    I. Sutskever, J. Martens, G. Dahl, and G. Hinton, “On the importance of initialization and momentum in deep learning,” in Proceedings of the 30th International Conference on Machine Learning. PMLR, 2013, pp. 1139–1147

  52. [52]

    On the Lambert W function,

    R. M. Corless, G. H. Gonnet, D. E. Hare, D. J. Jeffrey, and D. E. Knuth, “On the Lambert W function,” Advances in Computational Mathematics, vol. 5, pp. 329–359, 1996

  53. [53]

    Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process,

    G. Blanc, N. Gupta, G. Valiant, and P. Valiant, “Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process,” in Proceedings of the Thirty Third Conference on Learning Theory. PMLR, 2020, pp. 483–513

  54. [54]

    How to escape sharp minima with random perturbations,

    K. Ahn, A. Jadbabaie, and S. Sra, “How to escape sharp minima with random perturbations,” in Proceedings of the 41st International Conference on Machine Learning. PMLR, 2024, pp. 597–618

  55. [55]

    Spectral normalization for generative adversarial networks,

    T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, “Spectral normalization for generative adversarial networks,” in Proceedings of the Sixth International Conference on Learning Representations, 2018

  56. [56]

    Improving Lipschitz-constrained neural networks by learning activation functions,

    S. Ducotterd, A. Goujon, P. Bohra, D. Perdios, S. Neumayer, and M. Unser, “Improving Lipschitz-constrained neural networks by learning activation functions,” Journal of Machine Learning Research, vol. 25, no. 65, pp. 1–30, 2024

  57. [57]

    On the properties of the softmax function with application in game theory and reinforcement learning,

    B. Gao and L. Pavel, “On the properties of the softmax function with application in game theory and reinforcement learning,” arXiv:1704.00805, 2017

  58. [58]

    On Lipschitz bounds of general convolutional neural networks,

    D. Zou, R. Balan, and M. Singh, “On Lipschitz bounds of general convolutional neural networks,” IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1738–1759, 2019

  59. [59]

    Federated learning over wireless networks: Convergence analysis and resource allocation,

    C. T. Dinh, N. H. Tran, M. N. Nguyen, C. S. Hong, W. Bao, A. Y. Zomaya, and V. Gramoli, “Federated learning over wireless networks: Convergence analysis and resource allocation,” IEEE/ACM Transactions on Networking, vol. 29, no. 1, pp. 398–409, 2020

  60. [60]

    Estimating training compute of deep learning models,

    J. Sevilla, L. Heim, M. Hobbhahn, T. Besiroglu, A. Ho, and P. Villalobos, “Estimating training compute of deep learning models,” 2022. [Online]. Available: https://epochai.org/blog/estimating-training-compute

  61. [61]

    AI and compute,

    D. Amodei and D. Hernandez, “AI and compute,” 2018. [Online]. Available: https://openai.com/research/ai-and-compute

  62. [62]

    Fundamentals of Atmospheric Physics,

    M. L. Salby, Fundamentals of Atmospheric Physics. Elsevier, 1996

  63. [63]

    Aviation Weather Services,

    Federal Aviation Administration, Aviation Weather Services. Aviation Supplies & Academics, 2001

  64. [64]

    Non-linear model predictive control for UAVs with slung/swung load,

    F. Gonzalez, A. Heckmann, S. Notter, M. Zurn, J. Trachte, and A. Mcfadyen, “Non-linear model predictive control for UAVs with slung/swung load,” in Proceedings of the 2015 IEEE International Conference on Robotics and Automation, 2015, pp. 1–1

  65. [65]

    Energy-efficient UAV communication with trajectory optimization,

    Y. Zeng and R. Zhang, “Energy-efficient UAV communication with trajectory optimization,” IEEE Transactions on Wireless Communications, vol. 16, no. 6, pp. 3747–3760, 2017

  66. [66]

    Energy minimization for wireless communication with rotary-wing UAV,

    Y. Zeng, J. Xu, and R. Zhang, “Energy minimization for wireless communication with rotary-wing UAV,” IEEE Transactions on Wireless Communications, vol. 18, no. 4, pp. 2329–2345, 2019

  67. [67]

    CVXPY: A Python-embedded modeling language for convex optimization,

    S. Diamond and S. Boyd, “CVXPY: A Python-embedded modeling language for convex optimization,” Journal of Machine Learning Research, vol. 17, no. 83, pp. 1–5, 2016

  68. [68]

    Geometric programming for communication systems,

    M. Chiang, “Geometric programming for communication systems,” Foundations and Trends in Communications and Information Theory, vol. 2, no. 1–2, pp. 1–154, 2005

  69. [69]

    Disciplined geometric programming,

    A. Agrawal, S. Diamond, and S. Boyd, “Disciplined geometric programming,” Optimization Letters, vol. 13, no. 5, pp. 961–976, 2019

  70. [70]

    Global optimization of signomial geometric programming problems,

    G. Xu, “Global optimization of signomial geometric programming problems,” European Journal of Operational Research, vol. 233, no. 3, pp. 500–510, 2014

  71. [71]

    Reversed geometric programs treated by harmonic means,

    R. J. Duffin and E. L. Peterson, “Reversed geometric programs treated by harmonic means,” Indiana University Mathematics Journal, vol. 22, no. 6, pp. 531–550, 1972

  72. [72]

    Some bounds for the logarithmic function,

    F. Topsøe, “Some bounds for the logarithmic function,” Inequality Theory and Applications, vol. 4, p. 137, 2007

  73. [73]

    Convex Optimization,

    S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004

  74. [74]

    The MNIST database of handwritten digit images for machine learning research,

    L. Deng, “The MNIST database of handwritten digit images for machine learning research,” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 141–142, 2012

  75. [75]

    Petfinder.my - Pawpularity contest,

    A. Howard, M. Jedi, and R. Holbrook, “Petfinder.my - Pawpularity contest,” https://kaggle.com/competitions/petfinder-pawpularity-score, 2021, Kaggle

  76. [76]

    Imagenet classification with deep convolutional neural networks,

    A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” Advances in Neural Information Processing Systems, vol. 25, 2012

  77. [77]

    Loss functions for top-k error: Analysis and insights,

    M. Lapin, M. Hein, and B. Schiele, “Loss functions for top-k error: Analysis and insights,” in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016, pp. 1468–1477

  78. [78]

    Efficient algorithms for capacitated cloudlet placements,

    Z. Xu, W. Liang, W. Xu, M. Jia, and S. Guo, “Efficient algorithms for capacitated cloudlet placements,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 10, pp. 2866–2880, 2015

  79. [79]

    Probability: Theory and Examples,

    R. Durrett, Probability: Theory and Examples. Cambridge University Press, 2019, vol. 49

  80. [80]

    Calculus,

    J. Stewart, Calculus, 8th ed. Cengage Learning, 2015

Showing first 80 references.