Resource-Element Energy Difference for Noncoherent Over-the-Air Federated Learning

Hao Chen; Zavareh Bozorgasl

arXiv: 2605.07263 · v2 · pith:RQKZDCSKnew · submitted 2026-05-08 · 📡 eess.SP · cs.AI · cs.DC · cs.LG · stat.ML

Pith reviewed 2026-05-20 23:27 UTC · model grok-4.3

keywords: over-the-air federated learning · noncoherent aggregation · resource element energy difference · signed model updates · Rayleigh fading · energy detection · chip diversity

The pith

Resource-element energy difference lets noncoherent OTA federated learning aggregate signed updates by subtracting energies from paired resource elements.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces resource-element energy difference (REED) to enable signed aggregation in over-the-air federated learning without phase synchronization or instantaneous channel state information. Positive and negative parts of each real-valued client update are mapped to transmit energies on a pair of orthogonal resource elements; the receiver estimates the signed sum simply by subtracting the two received energies after slow-timescale calibration of average channel powers. Exact first- and second-moment expressions are derived for both a single-shot version and a chip-diverse version that spreads each coordinate across multiple independently faded pairs, separating the contributions of fading self-noise, signal-noise interaction, and receiver noise. This construction removes the need for channel inversion or coherent combining while making the variance scaling with the number of chips explicit.

Core claim

By transmitting the positive and negative parts of a real-valued update on two orthogonal resource elements and subtracting the corresponding received energies, REED recovers an unbiased estimate of the signed aggregate using only slow-timescale average channel-power calibration and no instantaneous CSI or phase alignment. For independent Rayleigh fading the paper supplies closed-form expressions for the mean and variance of both the single-shot estimator and the chip-diverse extension, showing how the three variance components trade off against the number of chips per coordinate.

What carries the argument

Resource-element energy difference (REED): the mapping of the positive and negative parts of each real-valued update onto transmit energies on a pair of orthogonal resource elements, followed by subtraction of the two received energies after average-power calibration.
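
A minimal NumPy sketch of the single-shot mapping and energy subtraction described above. The signal model is an assumption made for illustration and is not taken from the paper: each client k pre-scales its transmit energy by its calibrated average channel power `avg_gain[k]`, the instantaneous channels are independent zero-mean complex Gaussian (Rayleigh amplitude), and both resource elements see the same noise power. The function names `reed_encode` and `reed_single_shot` are hypothetical.

```python
import numpy as np

def reed_encode(x, avg_gain):
    """Map real updates to transmit amplitudes on the (+, -) resource-element pair."""
    x = np.asarray(x, dtype=float)
    pos = np.maximum(x, 0.0)    # positive part x^+
    neg = np.maximum(-x, 0.0)   # negative part x^-
    # Assumed calibration: transmit energy is x^+/g_k or x^-/g_k (amplitude = sqrt).
    return np.sqrt(pos / avg_gain), np.sqrt(neg / avg_gain)

def _cn(rng, var, size):
    """Circularly symmetric complex Gaussian samples with the given variance."""
    scale = np.sqrt(np.asarray(var) / 2.0)
    return scale * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

def reed_single_shot(x, avg_gain, noise_var, rng, trials=1):
    """Return `trials` independent energy-difference estimates of sum(x)."""
    a_pos, a_neg = reed_encode(x, avg_gain)
    k = len(a_pos)
    # Independent fading per client and per resource element; no phase alignment.
    h_pos = _cn(rng, avg_gain, (trials, k))
    h_neg = _cn(rng, avg_gain, (trials, k))
    y_pos = (h_pos * a_pos).sum(axis=1) + _cn(rng, noise_var, trials)
    y_neg = (h_neg * a_neg).sum(axis=1) + _cn(rng, noise_var, trials)
    # The receiver only measures energies and subtracts them.
    return np.abs(y_pos) ** 2 - np.abs(y_neg) ** 2

rng = np.random.default_rng(0)
x = np.array([0.8, -0.3, 0.5, -0.6])
avg_gain = np.array([1.0, 0.5, 2.0, 0.25])
est = reed_single_shot(x, avg_gain, noise_var=0.1, rng=rng, trials=200_000)
print("true signed sum:", x.sum())
print("Monte Carlo mean of estimate:", est.mean())
```

In this sketch the equal noise power on the two elements cancels in the subtraction, which is why no explicit noise-floor correction appears. A single estimate is still very noisy; only its average over many trials approaches the signed sum.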

If this is right

Noncoherent OTA-FL can now handle signed real-valued model updates without transmitter or receiver CSI.
Chip diversity provides an explicit resource-versus-variance trade-off by spreading each coordinate across independently faded pairs.
Variance laws isolate fading self-noise, signal-noise interaction, and receiver-noise terms, allowing separate optimization of each.
The same energy-difference primitive can be used for any real-valued linear aggregation task over a noncoherent multiple-access channel.
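
To see where three variance terms of this kind can come from, here is a worked derivation under a simplified model that is assumed for illustration and may differ from the paper's exact expressions. After average-power calibration, the received sample on each resource element is taken to be zero-mean complex Gaussian with power $S^{\pm} + \sigma^2$, where $S^{\pm} = \sum_k x_k^{\pm}$ and $\sigma^2$ is the receiver noise power per resource element. The two elements of a pair and the $M$ chips are taken to be independent. The symbols $S^{\pm}$, $\sigma^2$, $E_m^{\pm}$, and $\hat{s}_M$ are notation introduced here.

```latex
\begin{align}
E_m^{\pm} &= |y_m^{\pm}|^2 \ \text{is exponential with mean } S^{\pm} + \sigma^2, \qquad
\hat{s}_M = \frac{1}{M}\sum_{m=1}^{M}\big(E_m^{+} - E_m^{-}\big), \\
\mathbb{E}[\hat{s}_M] &= (S^{+} + \sigma^2) - (S^{-} + \sigma^2) = S^{+} - S^{-} = \sum_k x_k, \\
\operatorname{Var}[\hat{s}_M] &= \frac{1}{M}\Big[(S^{+} + \sigma^2)^2 + (S^{-} + \sigma^2)^2\Big] \\
&= \frac{1}{M}\Big[\underbrace{(S^{+})^2 + (S^{-})^2}_{\text{fading self-noise}}
 + \underbrace{2\sigma^2\,(S^{+} + S^{-})}_{\text{signal-noise interaction}}
 + \underbrace{2\sigma^4}_{\text{receiver noise}}\Big].
\end{align}
```

The variance line uses the fact that an exponential random variable has variance equal to its squared mean. In this assumed model all three terms shrink as $1/M$, and the self-noise term persists even as $\sigma^2 \to 0$, so chips remain useful at high SNR.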

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

REED could be combined with existing power-control schemes that operate only on average channel gains to further reduce total transmit energy.
The chip-diverse construction suggests a natural extension to frequency-selective or time-varying channels where each chip sees a different average power.
Testing REED in a real testbed with measured Rayleigh-like fading would reveal whether calibration drift or hardware non-idealities dominate the predicted variance terms.

Load-bearing premise

The exact moment derivations require independent Rayleigh fading across the paired resource elements and accurate slow-timescale calibration of average channel powers that does not itself introduce bias into the single-shot or chip-diverse estimators.
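
A short worked example of how calibration error could enter, under an assumed model that the source does not spell out: client $k$ divides its transmit energy by an estimate $\hat g_k$ of its true average channel power $g_k$, and the receiver subtracts the two energies without further correction. The symbols $g_k$, $\hat g_k$, $\sigma^2$, and $\hat s$ are notation introduced here.

```latex
\begin{align}
\mathbb{E}\big[|y^{\pm}|^2\big] &= \sum_k \frac{g_k}{\hat g_k}\, x_k^{\pm} + \sigma^2, \\
\mathbb{E}[\hat s] &= \sum_k \frac{g_k}{\hat g_k}\,\big(x_k^{+} - x_k^{-}\big)
 = \sum_k \frac{g_k}{\hat g_k}\, x_k, \\
\mathbb{E}[\hat s] - \sum_k x_k &= \sum_k \Big(\frac{g_k}{\hat g_k} - 1\Big)\, x_k .
\end{align}
```

Under these assumptions the estimator is unbiased only when $\hat g_k = g_k$ for every client. For example, a uniform 10% overestimate ($\hat g_k = 1.1\, g_k$) scales the expected aggregate by $1/1.1 \approx 0.909$, while client-dependent errors distort the relative weights of the clients.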

What would settle it

Measure the empirical bias and variance of the signed-sum estimator under controlled independent Rayleigh fading; if the observed statistics deviate systematically from the closed-form first- and second-moment expressions once average powers are calibrated, the central claim is falsified.
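
The check described above can be sketched as a Monte Carlo experiment. The closed form used for comparison is not the paper's; it is the variance implied by a simplified model assumed here: unit average channel power after calibration, independent complex Gaussian fading per client, chip, and resource element, and equal noise power on both elements. The function names `simulate_reed` and `predicted_variance` are hypothetical.

```python
import numpy as np

def simulate_reed(x, noise_var, chips, trials, rng):
    """Chip-diverse energy-difference estimates under the assumed model."""
    x = np.asarray(x, dtype=float)
    amp = np.stack([np.sqrt(np.maximum(x, 0.0)),    # amplitudes on the "+" element
                    np.sqrt(np.maximum(-x, 0.0))])  # amplitudes on the "-" element
    shape = (trials, chips, 2, x.size)
    # Independent Rayleigh fading per trial, chip, element, and client: h ~ CN(0, 1).
    h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)
    n_shape = (trials, chips, 2)
    n = np.sqrt(noise_var / 2.0) * (rng.standard_normal(n_shape)
                                    + 1j * rng.standard_normal(n_shape))
    y = (h * amp).sum(axis=-1) + n
    energy = np.abs(y) ** 2
    # Subtract the paired energies, then average over the chips.
    return (energy[..., 0] - energy[..., 1]).mean(axis=1)

def predicted_variance(x, noise_var, chips):
    """Assumed-model variance: each energy is exponential with mean S + noise_var."""
    x = np.asarray(x, dtype=float)
    s_pos, s_neg = np.maximum(x, 0.0).sum(), np.maximum(-x, 0.0).sum()
    return ((s_pos + noise_var) ** 2 + (s_neg + noise_var) ** 2) / chips

rng = np.random.default_rng(1)
x = np.array([0.8, -0.3, 0.5, -0.6])
results = {}
for chips in (1, 2, 8):
    est = simulate_reed(x, noise_var=0.1, chips=chips, trials=20_000, rng=rng)
    # Store (empirical bias, empirical variance, predicted variance).
    results[chips] = (est.mean() - x.sum(), est.var(), predicted_variance(x, 0.1, chips))
    print(chips, results[chips])
```

A systematic gap between the empirical and predicted columns that does not shrink with more trials would indicate that the assumed model, or the formula derived from it, does not describe the simulated channel.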

Figures

Figures reproduced from arXiv: 2605.07263 by Hao Chen, Zavareh Bozorgasl.

**Figure 1.** Overview of the proposed REED-enabled OTA-FL system. The server broadcasts the current global model to the clients, each client computes a …

**Figure 1.** REED-enabled OTA-FL workflow. Clients compute local FedAvg increments, transmit positive and negative coordinate parts over paired resource …

**Figure 2.** The CNN architecture used in the experiments for both MNIST and Fashion-MNIST. The model consists of two Conv–ReLU–MaxPool blocks …

**Figure 2.** REED with M = 1 on MNIST at −10 dB effective receive SNR. REED with M = 1 remains close to clean FedAvg, with a round-100 gap of 0.26 percentage points. Under Dirichlet α = 0.3, the M = 1 gap increases to 3.17 percentage points, creating a setting in which the variance-reduction effect of chip diversity can be observed. Therefore, we use Fashion-MNIST for the chip-diversity study below.

**Figure 3.** Test accuracy versus communication round for clean FedAvg, coherent CSIT aggregation, and REED at …

**Figure 4.** Chip-diverse REED on Fashion-MNIST at −10 dB effective receive SNR.

**Figure 4.** Late-round accuracy gap relative to clean FedAvg, …

**Figure 5.** Round-100 accuracy gap relative to clean FedAvg. Negative values indicate degradation relative to ideal aggregation.

Original abstract

Over-the-air federated learning (OTA-FL) reduces uplink latency by aggregating client updates directly over the wireless multiple-access channel. Coherent analog aggregation realizes this idea by aligning the phases and amplitudes of simultaneously transmitted waveforms, which typically requires synchronization, instantaneous channel-state information (CSI), phase compensation, and power control. Noncoherent energy detection removes the need for phase-coherent combining, but a single energy measurement is nonnegative and, therefore, cannot represent signed model updates. This paper introduces resource-element energy difference (REED), a noncoherent physical-layer primitive for continuous signed aggregation. REED maps the positive and negative parts of each real-valued update to transmit energies on paired orthogonal resource elements and estimates the signed sum by subtracting the corresponding received energies. The construction uses slow-timescale calibration of average channel powers, but does not require instantaneous transmitter- or receiver-side CSI or channel inversion. For independent Rayleigh fading, we derive exact first- and second-moment expressions for single-shot REED and for a chip-diverse extension that spreads each coordinate over multiple independently faded paired chips. The resulting variance laws separate fading-induced self-noise, signal-noise interaction, and receiver-noise fluctuation, giving an explicit diversity-resource tradeoff. More->The rest of abstract is in the paper.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

REED gives a workable noncoherent way to handle signed updates in OTA federated learning by subtracting energies on paired resource elements, backed by clean moment derivations under Rayleigh fading.

The core contribution is a new physical-layer primitive that lets the server recover a signed sum from energy measurements alone. Positive and negative parts of each update get mapped to transmit power on two orthogonal resource elements; the receiver subtracts the two received energies after slow-timescale power calibration. No instantaneous CSI or phase alignment is required, which removes a major practical hurdle in coherent OTA aggregation schemes.

Referee Report

1 major / 3 minor

Summary. The paper introduces the Resource-Element Energy Difference (REED) technique for noncoherent over-the-air federated learning. It enables signed aggregation of real-valued model updates by mapping positive and negative components to energies on paired orthogonal resource elements and estimating the difference from received energies. The approach relies on slow-timescale average channel power calibration without requiring instantaneous CSI or channel inversion. Exact first- and second-moment expressions are derived for both single-shot and chip-diverse REED estimators under independent Rayleigh fading, separating contributions from fading self-noise, signal-noise interaction, and receiver noise to illustrate the diversity-resource tradeoff.

Significance. Should the analytical derivations hold, this work offers a practical noncoherent alternative to coherent OTA-FL methods, potentially reducing synchronization and CSI overhead in wireless federated learning systems. The explicit variance expressions and diversity analysis provide a solid foundation for performance evaluation and system optimization in this area.

major comments (1)

[§IV] §IV (Performance Analysis), second-moment derivation for chip-diverse estimator: The variance expressions are obtained under the explicit assumption of independent Rayleigh fading on each paired resource element. Correlated fading (common for frequency-adjacent REs) would modify the cross terms E[|h_i|^2 |h_j|^2] and increase the effective variance, which directly affects the claimed diversity-resource tradeoff and should be quantified or bounded.

minor comments (3)

[Abstract] Abstract: the text ends abruptly with 'More->The rest of abstract is in the paper.'; the complete abstract should be provided.
[Notation] Notation section: define the positive/negative splitting operators (x^+ , x^-) at first appearance and ensure consistent use throughout the moment derivations.
[Figure 3] Figure 3 (or equivalent variance plot): add explicit labels or a legend entry clarifying which curves correspond to single-shot vs. chip-diverse REED.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment and positive overall assessment of the work. We address the single major comment below.

Referee: [§IV] §IV (Performance Analysis), second-moment derivation for chip-diverse estimator: The variance expressions are obtained under the explicit assumption of independent Rayleigh fading on each paired resource element. Correlated fading (common for frequency-adjacent REs) would modify the cross terms E[|h_i|^2 |h_j|^2] and increase the effective variance, which directly affects the claimed diversity-resource tradeoff and should be quantified or bounded.

Authors: We agree that the derivations in Section IV are obtained under the assumption of independent Rayleigh fading across the paired resource elements, which is stated explicitly in the manuscript. Correlated fading between frequency-adjacent REs would indeed alter the cross term E[|h_i|^2 |h_j|^2] from its independent value of 1 to 1 + ρ (where ρ is the power correlation coefficient for unit-mean exponentials), thereby increasing the effective variance and softening the diversity-resource tradeoff. To address this point, we will add a short remark in Section IV that (i) recalls the independence assumption, (ii) gives the modified cross term for correlated Rayleigh fading, and (iii) provides a simple upper bound on the variance inflation as a function of ρ together with a brief discussion of its impact on the tradeoff for typical urban/suburban correlation values. This addition will be included in the revised manuscript.

Revision: yes
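
The 1 + ρ value quoted in the rebuttal follows from E[XY] = Cov(X, Y) + E[X]E[Y] for unit-mean exponentials, whose variance is also 1. The sketch below checks it numerically under an additional assumption that the source does not state: the two complex gains are jointly Gaussian with a real correlation coefficient `r`, in which case the power correlation is ρ = r². The helper name `power_cross_moment` is hypothetical.

```python
import numpy as np

def power_cross_moment(r, trials, rng):
    """Monte Carlo estimate of E[|h_i|^2 |h_j|^2] for unit-power Rayleigh channels
    whose complex gains have correlation coefficient r (0 <= r <= 1)."""
    def cn(size):
        # Circularly symmetric complex Gaussian samples with unit variance.
        return (rng.standard_normal(size) + 1j * rng.standard_normal(size)) / np.sqrt(2.0)
    h_i = cn(trials)
    # Build h_j so that E[h_j * conj(h_i)] = r while keeping unit average power.
    h_j = r * h_i + np.sqrt(1.0 - r ** 2) * cn(trials)
    return np.mean(np.abs(h_i) ** 2 * np.abs(h_j) ** 2)

rng = np.random.default_rng(2)
moments = {}
for r in (0.0, 0.5, 0.9):
    rho = r ** 2   # power correlation implied by this Gaussian construction
    moments[r] = (power_cross_moment(r, 400_000, rng), 1.0 + rho)
    print(r, moments[r])
```

Each printed pair is the simulated cross moment next to 1 + ρ. The construction covers only jointly Gaussian gains, so it illustrates the quoted value rather than bounding the variance inflation for general correlation structures.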

Circularity Check

0 steps flagged

Derivation self-contained from Rayleigh model and REED definition

The paper performs direct probabilistic derivations of first- and second-moment expressions for the single-shot and chip-diverse REED estimators starting from the independent Rayleigh fading model on paired orthogonal resource elements together with the explicit positive/negative energy mapping. These calculations separate fading self-noise, signal-noise interaction, and receiver noise without invoking fitted parameters, self-citations, or uniqueness theorems that would reduce the claimed variance laws to the inputs by construction. The resulting diversity-resource tradeoff therefore constitutes an independent consequence of the stated assumptions rather than a tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on the Rayleigh fading model and the ability to perform slow-timescale average power calibration. No free parameters are introduced in the abstract description. The REED mapping itself is the core invented construction rather than a new physical entity.

axioms (2)

domain assumption: Channel coefficients follow independent Rayleigh fading across resource elements.
Invoked to derive exact first- and second-moment expressions for single-shot and chip-diverse REED.
domain assumption: Average channel powers can be calibrated accurately on a slow timescale.
Required for the noncoherent estimator to function without instantaneous CSI.

invented entities (1)

Resource-Element Energy Difference (REED) · no independent evidence
purpose: Noncoherent primitive that recovers signed sums from energy subtraction on paired orthogonal resource elements.
New construction introduced to enable continuous signed aggregation without phase coherence.
