Three Birds, One Stone: Solving the Communication-Memory-Privacy Trilemma in LLM Fine-tuning Over Wireless Networks with Zeroth-Order Optimization

Bin Wang; Dongzhu Liu; Guangxu Zhu; Haolong Chen; Yuhao Zheng; Zhijie Cai

arxiv: 2604.12401 · v1 · submitted 2026-04-14 · 💻 cs.DC

Zhijie Cai, Yuhao Zheng, Haolong Chen, Dongzhu Liu, Bin Wang, Guangxu Zhu

Pith reviewed 2026-05-10 14:41 UTC · model grok-4.3

keywords: fine-tuning, memory, communication, only, optimization, pairzero, privacy, conventional

The pith

pAirZero uses zeroth-order optimization and over-the-air computation to solve the communication-memory-privacy trilemma in wireless federated LLM fine-tuning with low overhead and consistent privacy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Large language models are usually fine-tuned by exchanging large gradient updates between devices, which uses too much bandwidth, requires high memory on each device, and can leak private training data. The proposed pAirZero framework replaces standard gradient exchange with two ingredients: zeroth-order methods, which estimate updates using only function evaluations, and over-the-air computation, which lets devices transmit simultaneously so that the wireless channel itself sums the signals. This reduces communication to bit-level transmissions and keeps memory use at the level needed for inference alone. An optimization step chooses transmit power and added noise to keep privacy protection stable even when wireless channels vary. Experiments on the OPT-125M model reportedly show a peak memory cost of only 25 percent and a communication load orders of magnitude lower than conventional methods.
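The zeroth-order idea can be made concrete with a minimal sketch. It assumes a two-point, random-direction estimator whose perturbation seed is shared between device and server, which is a common ZO formulation; the abstract does not say whether pAirZero uses exactly this estimator. The names `direction`, `zo_scalar`, `zo_step`, and `toy_loss` are hypothetical, and the quadratic loss merely stands in for a model's forward pass.

```python
import random

def direction(seed, dim):
    """Regenerate the Gaussian perturbation direction from a shared seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def zo_scalar(loss_fn, w, seed, mu=1e-3):
    """Two-point estimate: slope of the loss along one random direction.

    Uses two forward evaluations and no backpropagation.
    """
    z = direction(seed, len(w))
    plus = loss_fn([wi + mu * zi for wi, zi in zip(w, z)])
    minus = loss_fn([wi - mu * zi for wi, zi in zip(w, z)])
    return (plus - minus) / (2.0 * mu)

def zo_step(w, scalar, seed, lr):
    """Apply the update; z is rebuilt from the seed, so only `scalar` is shared."""
    z = direction(seed, len(w))
    return [wi - lr * scalar * zi for wi, zi in zip(w, z)]

def toy_loss(w):
    """Hypothetical stand-in for a model loss (assumption, illustration only)."""
    return 0.5 * sum(wi * wi for wi in w)

def run_demo(steps=200, dim=20, lr=0.02):
    """Run ZO descent on the toy loss; return (initial loss, final loss)."""
    w = [1.0] * dim
    initial = toy_loss(w)
    for t in range(steps):
        g = zo_scalar(toy_loss, w, seed=t)
        w = zo_step(w, g, seed=t, lr=lr)
    return initial, toy_loss(w)
```

Each step costs two forward evaluations and produces a single scalar per device, which is what makes inference-level memory and very small uploads plausible. Calling `run_demo()` returns the loss before and after training on the toy problem.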

Core claim

pAirZero enables resource-constrained devices to submit their local gradient with only bit-level communication loads while participating in federated fine-tuning of LLMs with inference-level memory costs. This approach not only eliminates the high memory requirements needed for LLM fine-tuning but also alleviates the strict synchronization requirements that plague conventional OTA methods.

Load-bearing premise

That zeroth-order optimization can achieve acceptable fine-tuning performance for LLMs without access to first-order gradients, and that the formulated optimization model for transmit power and noise can guarantee consistent privacy protection across varying channel conditions.

Original abstract

Federated Learning (FL) offers a promising pathway for collaboratively fine-tuning Large Language Models (LLMs) at the edge; however, this paradigm faces a critical bottleneck: the prohibitive communication and memory overheads incurred by exchanging high-dimensional gradients. Furthermore, recent studies reveal that user training data can still be recovered from these local gradients, undermining the core privacy promise of FL. In this paper, we address this trilemma of communication, memory, and privacy by proposing pAirZero, a novel framework that synergizes Zeroth-Order (ZO) optimization with Over-the-Air (OTA) computation. Uniquely, pAirZero enables resource-constrained devices to submit their local gradient with only bit-level communication loads while participating in federated fine-tuning of LLMs with inference-level memory costs. This approach not only eliminates the high memory requirements needed for LLM fine-tuning but also alleviates the strict synchronization requirements that plague conventional OTA methods. We further formulate a rigorous optimization model to adaptively determine the optimal transmit power and noise levels, ensuring consistent privacy protection regardless of channel conditions. Numerical experiments demonstrate the superiority of pAirZero in enabling secure, efficient LLM fine-tuning over wireless networks, with only 25% peak memory cost on OPT-125M and communication load orders of magnitude lower than conventional methods.
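Over-the-air aggregation, as the abstract describes it, relies on the channel adding up simultaneous transmissions. The sketch below is a simplified illustration under assumptions that are not taken from the paper: real-valued channel gains known to each device, channel-inversion pre-scaling, perfect synchronization, and additive Gaussian receiver noise. The function `ota_aggregate` and its parameters are hypothetical, and the sketch does not model the relaxed synchronization that pAirZero claims.

```python
import random

def ota_aggregate(scalars, gains, align=1.0, noise_std=0.05, seed=0):
    """Simulate one over-the-air round with real-valued channels.

    Each device k pre-scales its scalar by align / gains[k] (channel
    inversion), all devices transmit at once, and the channel adds the
    faded signals plus receiver noise.
    """
    rng = random.Random(seed)
    tx = [align * s / h for s, h in zip(scalars, gains)]  # transmitted symbols
    rx = sum(h * x for h, x in zip(gains, tx))            # superposition in the air
    rx += rng.gauss(0.0, noise_std)                       # receiver noise
    return rx / (align * len(scalars))                    # server's estimate of the mean
```

With `noise_std=0.0` the server recovers the exact mean of the device scalars from a single channel use, regardless of how many devices transmit. That is the property that lets communication cost stay flat as the number of devices grows.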

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

pAirZero pairs zeroth-order updates with over-the-air aggregation to cut memory and communication in wireless federated fine-tuning of LLMs while adding noise for privacy, but the training-quality side of the trade-off lacks direct evidence.

The paper introduces pAirZero as a way to handle the communication, memory, and privacy issues that arise when fine-tuning LLMs across wireless devices in a federated setup. Devices use only forward passes for zeroth-order gradient estimates, which keeps memory down to inference levels, and they send minimal bits over the air with added noise to protect data. The authors also optimize transmit power and noise to maintain privacy even as channels vary, and they claim the approach relaxes the usual tight synchronization needs of over-the-air methods. Tests on OPT-125M reportedly show a peak memory cost of 25 percent and far lower communication than standard gradient exchange.

This combination targets a genuine deployment bottleneck for edge LLMs and gives concrete resource numbers that practitioners can check against their constraints. The optimization model for power and noise is a clear step beyond just stating the trilemma.

The main soft spot is performance. Zeroth-order estimates carry variance that grows with model size, and the abstract gives no perplexity, accuracy, or iteration counts against first-order baselines. Without those, it is difficult to judge whether the resource wins survive once training quality is measured. The privacy guarantee also rests on the specific threat model and channel assumptions in the scheduler; real-world fading or partial channel knowledge could weaken it.

The paper is aimed at people working on wireless federated learning and efficient LLM training at the edge. Readers who need ideas for low-memory, low-bandwidth setups will find the framework and the reported metrics useful even if they later adjust the method. It deserves peer review so the derivations, full experimental protocols, and larger-model results can be examined.

Referee Report

2 major / 1 minor

Summary. The paper proposes pAirZero, a framework that combines zeroth-order (ZO) optimization with over-the-air (OTA) computation for federated fine-tuning of LLMs over wireless networks. It claims to resolve the communication-memory-privacy trilemma by enabling bit-level communication loads for gradient submission, inference-level memory costs on devices, and privacy that stays consistent across channel conditions, the last via an optimization model that adaptively sets transmit power and noise levels. Numerical experiments on OPT-125M reportedly achieve a peak memory cost of only 25% and a communication load orders of magnitude lower than conventional methods.

Significance. If the ZO-based updates deliver competitive fine-tuning performance and the privacy optimization holds under realistic conditions, this would enable practical federated LLM adaptation on resource-constrained edge devices, reducing both the memory barrier of backpropagation and the synchronization/privacy vulnerabilities of standard OTA FL. The explicit power/noise scheduler for channel-independent privacy is a potentially valuable technical contribution if the derivation is complete.

major comments (2)

[Numerical Experiments] Numerical Experiments section: the reported results emphasize memory (25% peak) and communication reductions but provide no perplexity, accuracy, or iteration-count comparisons against first-order baselines; without these, it is impossible to determine whether the linear growth in ZO gradient variance with model dimension negates the claimed resource gains for LLM fine-tuning.
[Optimization model] Optimization model (presumably §4 or equivalent): the claim that the formulated transmit-power and noise scheduler guarantees consistent privacy 'regardless of channel conditions' requires explicit statement of the threat model, channel statistics, and any assumptions on adversarial knowledge; without these, the privacy guarantee cannot be verified as load-bearing for the trilemma solution.
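The first major comment turns on how ZO estimator error scales with dimension, which a small Monte Carlo check can illustrate. The sketch assumes the single-direction Gaussian estimator (g·z)z with an exact directional derivative and a unit-norm true gradient g; under those assumptions the expected squared error is (d + 1)·||g||², so it grows linearly in the dimension d. The functions `zo_mse` and `mse_table` are hypothetical helpers and do not reproduce any experiment from the paper.

```python
import random

def zo_mse(dim, trials=4000, seed=0):
    """Monte Carlo estimate of E||(g.z) z - g||^2 for a unit-norm gradient g.

    For an exact directional derivative the expected value is
    (dim + 1) * ||g||^2, i.e. linear in the dimension.
    """
    rng = random.Random(seed)
    g = [1.0] + [0.0] * (dim - 1)  # unit-norm true gradient (assumption)
    total = 0.0
    for _ in range(trials):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        proj = sum(gi * zi for gi, zi in zip(g, z))  # directional derivative g.z
        total += sum((proj * zi - gi) ** 2 for zi, gi in zip(z, g))
    return total / trials

def mse_table(dims=(4, 16, 64)):
    """Estimated squared error of the ZO estimator for several dimensions."""
    return {d: zo_mse(d) for d in dims}
```

Averaging over q independent directions divides this error by q, which is why the number of local ZO queries per update matters for judging practicality at LLM scale.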

minor comments (1)

[Abstract] The abstract and introduction should explicitly state the largest model scale tested and the number of local ZO queries per update, as these directly affect the practicality claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the contributions and limitations of our work. We respond to each major comment below and indicate the revisions we will make to strengthen the manuscript.

Referee: Numerical Experiments section: the reported results emphasize memory (25% peak) and communication reductions but provide no perplexity, accuracy, or iteration-count comparisons against first-order baselines; without these, it is impossible to determine whether the linear growth in ZO gradient variance with model dimension negates the claimed resource gains for LLM fine-tuning.

Authors: We agree that direct comparisons of fine-tuning performance are necessary to evaluate whether ZO variance growth undermines the resource advantages. Our experiments prioritize demonstrating the memory and communication reductions achievable with inference-level costs and bit-level uploads. In the revised manuscript, we will add perplexity, accuracy, and iteration-count results for OPT-125M against first-order baselines under identical tasks and wireless settings. This will allow readers to assess whether the trilemma solution preserves competitive convergence despite the known dimension-dependent variance of ZO estimators. revision: yes

Referee: Optimization model (presumably §4 or equivalent): the claim that the formulated transmit-power and noise scheduler guarantees consistent privacy 'regardless of channel conditions' requires explicit statement of the threat model, channel statistics, and any assumptions on adversarial knowledge; without these, the privacy guarantee cannot be verified as load-bearing for the trilemma solution.

Authors: The optimization in Section 4 is formulated to achieve channel-independent privacy by solving for transmit power and artificial noise under a worst-case channel realization drawn from a known distribution. We will revise the manuscript to explicitly state the threat model (passive eavesdropper observing only the aggregated OTA signal), the channel statistics (i.i.d. Rayleigh fading with known distribution but unknown instantaneous realizations at the scheduler), and the assumption that the adversary knows the optimization parameters but not per-device channels. These clarifications will make the privacy guarantee verifiable while preserving the claim that privacy holds independently of instantaneous channel conditions. revision: yes
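One way to picture what "consistent privacy regardless of channel conditions" could mean is to hold the effective noise per unit of aggregated signal at a fixed target while the channel varies. The sketch below is an illustration under stated assumptions, not the paper's optimization model: real-valued gains, channel inversion under a per-device power limit, bounded scalars |s_k| ≤ `s_max`, and Gaussian channel noise. All function names and default values are hypothetical.

```python
import math

def alignment_factor(gains, power_limit, s_max):
    """Largest common amplitude every device can reach under channel
    inversion with |s_k| <= s_max and per-device power <= power_limit."""
    return math.sqrt(power_limit) * min(gains) / s_max

def artificial_noise_std(target_std, align, channel_noise_std):
    """Artificial noise needed so the effective noise reaches target_std.

    Received signal: y = align * sum_k s_k + n_art + n_ch, so the effective
    noise std after dividing by align is sqrt(art^2 + ch^2) / align.
    """
    needed_var = (align * target_std) ** 2 - channel_noise_std ** 2
    return math.sqrt(max(0.0, needed_var))  # no extra noise if the channel suffices

def effective_noise_std(align, art_std, channel_noise_std):
    """Noise std seen per unit of aggregated signal at the receiver."""
    return math.sqrt(art_std ** 2 + channel_noise_std ** 2) / align

def plan_round(gains, target_std, power_limit=1.0, s_max=1.0, channel_noise_std=0.1):
    """Return (alignment factor, artificial noise std, effective noise std)."""
    align = alignment_factor(gains, power_limit, s_max)
    art = artificial_noise_std(target_std, align, channel_noise_std)
    return align, art, effective_noise_std(align, art, channel_noise_std)
```

In this toy model, a good channel draw calls for more artificial noise and a poor draw for less, and the effective noise lands on the same target in both cases. When the channel is so weak that its own noise already exceeds the target, no artificial noise is added and the effective noise ends up above the target.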

Circularity Check

0 steps flagged

No circularity detected; claims rest on proposed ZO+OTA framework without self-referential reductions

The abstract and available claims introduce pAirZero as a novel combination of zeroth-order optimization and over-the-air computation to address the trilemma, with a formulated optimization model for transmit power and noise. No equations, derivations, or self-citations are exhibited that reduce any prediction or result to its own inputs by construction, such as fitting a parameter and renaming it a prediction or smuggling in an ansatz via prior self-work. The memory and communication reductions and the privacy guarantees are asserted as outcomes of the framework rather than tautologies. Absent specific quoted reductions in the provided text, the derivation chain is treated as self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The adaptive power/noise optimization and zeroth-order estimation are presented as core but without derivation details.

pith-pipeline@v0.9.0 · 5558 in / 1123 out tokens · 22534 ms · 2026-05-10T14:41:27.695976+00:00 · methodology
