
arxiv: 2604.26330 · v1 · submitted 2026-04-29 · 📡 eess.SP

Optimizing Tracking Accuracy in Energy-Constrained Multimodal ISAC via Lyapunov-Driven Heterogeneous Mixture-of-Experts

Pith reviewed 2026-05-07 12:46 UTC · model grok-4.3

classification 📡 eess.SP
keywords: multimodal ISAC, semantic age of information, Lyapunov optimization, mixture of experts, tracking accuracy, energy efficiency, V2I networks, reinforcement learning

The pith

A Lyapunov-driven heterogeneous mixture-of-experts method delivers event-triggered sensing that improves tracking accuracy and RF resilience while keeping edge queues stable and energy budgets intact.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a physics-aware multimodal ISAC framework that ties network queuing delays directly to physical-layer beam misalignment through semantic age of information (AoI). It then poses the problem as a stochastic mixed-integer nonlinear program (MINLP) that trades off the posterior Cramér-Rao bound (PCRB) against long-term energy use under constant-modulus constraints. To solve the resulting coupled scheduling and phase-mapping tasks, the authors introduce an RL architecture whose heterogeneous experts are steered by Lyapunov drift penalties, separating temporal and spatial decisions to avoid gradient conflicts. A sympathetic reader would care because continuous high-resolution sensing drains vehicle-edge batteries while delayed data ruins mmWave links in fast-moving V2I settings. If the method works, it yields practical event-driven policies that maintain queue stability and energy limits without sacrificing localization quality.
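As a reading aid, the trade-off described above can be written schematically as a time-average constrained program. The notation is assumed for illustration and is not taken from the paper: a(t) is a binary sensing trigger, w(t) an N-element beamforming vector, E(t) the per-slot energy, E-bar the long-term budget, and Q(t) the edge computing queue length.

```latex
\begin{aligned}
\min_{\{a(t),\,\mathbf{w}(t)\}} \quad
  & \limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}\big[\mathrm{PCRB}(t)\big] \\
\text{s.t.} \quad
  & \limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}\big[E(t)\big] \le \bar{E}, \\
  & \limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}\big[Q(t)\big] < \infty, \\
  & a(t)\in\{0,1\}, \qquad |w_n(t)| = \tfrac{1}{\sqrt{N}},\quad n=1,\dots,N .
\end{aligned}
```

In this schematic form the binary trigger is what makes the program mixed-integer, and the constant-modulus equality is what makes the feasible set non-convex.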

Core claim

The proposed LD-H-MoE achieves a highly effective event-triggered sensing policy, yielding superior tracking accuracy and radio-frequency (RF) resilience while guaranteeing edge computing queue stability and long-term energy budgets.

What carries the argument

The argument rests on the Lyapunov-driven heterogeneous mixture-of-experts (LD-H-MoE) architecture, which assigns temporal scheduling to one expert subnetwork and spatial phase mapping to another, with each guided by Lyapunov drift-plus-penalty terms.
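A minimal sketch of what "one expert per task" could look like, assuming a toy linear model for each expert. The function names, the state vector, and the tanh phase parameterization are all hypothetical and are not the paper's networks; the point is only that the two experts share no parameters and that the phase expert satisfies the constant-modulus constraint by construction.

```python
import cmath
import math
import random

def scheduling_expert(state, weights):
    """Temporal expert (hypothetical): logistic score -> binary sensing trigger."""
    z = sum(s * w for s, w in zip(state, weights))
    score = 1.0 / (1.0 + math.exp(-z))
    return (1 if score > 0.5 else 0), score

def phase_expert(state, weight_rows):
    """Spatial expert (hypothetical): one phase per antenna, constant-modulus output."""
    n = len(weight_rows)
    beam = []
    for row in weight_rows:
        # Squash the raw output into (-pi, pi), then map it onto the unit circle.
        phase = math.pi * math.tanh(sum(s * w for s, w in zip(state, row)))
        beam.append(cmath.exp(1j * phase) / math.sqrt(n))  # |w_n| = 1/sqrt(N)
    return beam

rng = random.Random(0)
state = [rng.gauss(0, 1) for _ in range(4)]  # assumed 4-dimensional state
trigger, score = scheduling_expert(state, [rng.gauss(0, 1) for _ in range(4)])
beam = phase_expert(state, [[rng.gauss(0, 1) for _ in range(4)] for _ in range(8)])
```

Because the experts here have disjoint parameters, a loss on the trigger cannot produce gradients in the phase mapper. This is the simplest form of decoupling and may be stricter than what the paper implements.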

If this is right

  • The policy triggers high-resolution visual sensing only when semantic AoI threatens beam alignment, reducing average computational energy.
  • Edge computing queues remain stable because the optimization explicitly penalizes Lyapunov drift.
  • Constant-modulus beamforming constraints are handled without violating the long-term energy budget.
  • Monolithic multi-task RL suffers gradient conflicts that the heterogeneous expert split avoids.
  • RF resilience improves because spatial uncertainty is explicitly folded into the state via semantic AoI.
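The event-triggered behaviour in the list above can be illustrated with a toy drift-plus-penalty rule. This is a sketch under assumed dynamics, not the paper's learned policy: the linear age penalty `g * age`, the workload per frame, the service rate, and all numeric values are hypothetical. `Q` is the edge computing queue and `Z` is a virtual queue that tracks energy spent above the budget.

```python
def simulate(T=2000, V=10.0, g=1.0, work=3.0, mu=1.0,
             e_idle=0.1, e_sense=1.0, budget=0.4):
    Q = Z = 0.0          # computing queue and virtual energy-deficit queue
    age = 0              # slots since the last high-resolution sensing
    triggers, energy_sum, max_q = 0, 0.0, 0.0
    for _ in range(T):
        # Compare only the action-dependent terms of drift-plus-penalty.
        cost_skip = V * g * (age + 1)          # staleness penalty if we skip
        cost_sense = Q * work + Z * e_sense    # queue and energy pressure if we sense
        a = 1 if cost_sense < cost_skip else 0
        energy = e_idle + a * e_sense
        Q = max(Q + a * work - mu, 0.0)
        Z = max(Z + energy - budget, 0.0)
        age = 0 if a else age + 1
        triggers += a
        energy_sum += energy
        max_q = max(max_q, Q)
    return triggers / T, energy_sum / T, max_q, Z

frac, avg_energy, max_q, z_final = simulate()
```

By construction of the virtual queue, the time-average energy can exceed the budget by at most `z_final / T`, so a bounded `Z` implies the long-term budget is met. Raising `V` makes the rule sense more eagerly at the cost of longer queues.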

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same decoupling of time and space experts could be tested in non-mmWave bands or in UAV-assisted ISAC scenarios.
  • If semantic AoI generalizes beyond V2I, the framework might apply to other latency-sensitive multimodal sensing tasks such as drone swarms.
  • Real hardware validation would require measuring both PCRB and actual beam misalignment under the event-triggered schedule.
  • The approach leaves open whether simpler Lyapunov-only baselines without mixture-of-experts could achieve comparable queue stability at lower training cost.

Load-bearing premise

The semantic age of information accurately and causally bridges network-layer queuing delays with physical-layer spatial uncertainty in real V2I environments.
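To make the premise concrete, the sketch below shows one way age could translate into spatial uncertainty and then into beam gain. Everything in it is an illustrative assumption rather than the paper's model: a white-noise-acceleration motion model (variance growing with the cube of elapsed time), a Gaussian beam pattern, and arbitrary parameter values.

```python
import math

def aoi_trace(update_slots, horizon):
    """Age per slot: resets to 0 when a fresh sample arrives, else grows by 1."""
    ages, age = [], 0
    for t in range(horizon):
        age = 0 if t in update_slots else age + 1
        ages.append(age)
    return ages

def angle_variance(age, dt=0.01, sigma0_sq=1e-4, q=1.0):
    """Assumed model: prediction variance (rad^2) grows as q * (age*dt)^3 / 3."""
    return sigma0_sq + q * (age * dt) ** 3 / 3.0

def expected_beam_gain(var, beamwidth=0.05):
    """Mean of exp(-theta^2 / (2 b^2)) for theta ~ N(0, var), normalized to 1."""
    return 1.0 / math.sqrt(1.0 + var / beamwidth ** 2)

ages = aoi_trace({0, 5}, 10)  # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
gains = [expected_beam_gain(angle_variance(a)) for a in ages]
```

Under these assumptions the expected gain falls monotonically as age grows and recovers at each update. The premise is load-bearing precisely because a real channel may not follow such a clean relationship.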

What would settle it

A field experiment in which the measured tracking error under the learned policy exceeds the simulated PCRB by more than a factor of two while energy and queue constraints remain satisfied would falsify the performance claims.

Figures

Figures reproduced from arXiv: 2604.26330 by Ahmad Bazzi, Chadi Assi, Ning Wei, Rongyan Xi, Wenqi Fan, You Li, Yue Xiu, Zhihan Zeng, Zhixian Song.

Figure 1. Illustration of the multimodal ISAC system in a V2I …
Figure 2. Hierarchical execution architecture of the proposed LD …
Figure 3. Time-averaged PCRB tracking performance of different …
Figure 4. Time-averaged system energy consumption against the …
Figure 5. Evolution of the average edge computing queue length …
Figure 6. Steady-state PCRB performance versus varying RF …
Original abstract

The integration of multimodal sensing and millimeter-wave (mmWave) communications is a key enabler for highly mobile vehicle-to-infrastructure (V2I) networks. However, continuous high-resolution visual sensing incurs prohibitive computational energy, while delayed sensing information causes severe beam misalignment. This paper establishes a physics-aware multimodal integrated sensing and communication (M-ISAC) framework that mathematically bridges network-layer queuing delays with physical-layer spatial uncertainty via the semantic age of information (AoI). Guided by this relationship, and aiming to strike an optimal trade-off between the tracking posterior Cramér-Rao bound (PCRB) and system energy budgets, we formulate a stochastic mixed-integer non-linear programming (MINLP) problem. Addressing the coupled challenges of temporal computing congestion and non-convex constant modulus constraints, we propose a reinforcement learning (RL) framework empowered by a Lyapunov-driven heterogeneous mixture-of-experts (LD-H-MoE) architecture. By strictly decoupling temporal scheduling and spatial phase mapping into specialized subnetworks, the LD-H-MoE circumvents gradient conflicts prevalent in monolithic multi-task learning. Simulations demonstrate that the proposed LD-H-MoE achieves a highly effective event-triggered sensing policy, yielding superior tracking accuracy and radio-frequency (RF) resilience while guaranteeing edge computing queue stability and long-term energy budgets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a physics-aware multimodal ISAC framework for V2I networks that introduces semantic age of information (AoI) to mathematically bridge network-layer queuing delays with physical-layer spatial uncertainty. It formulates a stochastic mixed-integer nonlinear program (MINLP) to optimize the tracking posterior Cramér-Rao bound (PCRB) subject to long-term energy budgets and queue stability, and solves the problem via a Lyapunov-driven heterogeneous mixture-of-experts (LD-H-MoE) reinforcement learning architecture that decouples temporal scheduling from spatial phase mapping. Simulations are reported to demonstrate an effective event-triggered sensing policy with improved tracking accuracy and RF resilience.

Significance. If the semantic AoI mapping is rigorously justified from the mmWave channel model, the work offers a principled way to jointly manage sensing-communication trade-offs in energy-constrained mobile scenarios while providing Lyapunov-based stability guarantees. The heterogeneous MoE design addresses a known challenge in multi-task RL by avoiding gradient conflicts, which is a constructive contribution.

major comments (2)
  1. [Abstract and model formulation] The central claim that semantic AoI 'mathematically bridges' queuing delays to PCRB degradation (abstract) is load-bearing for the MINLP formulation and the subsequent LD-H-MoE policy. No derivation is supplied that starts from the mmWave channel model and explicitly incorporates beamwidth-dependent misalignment variance, Doppler-induced phase drift, or multimodal fusion latency; without this step the objective function risks being a mis-specified surrogate, undermining the claimed superiority in tracking accuracy and RF resilience.
  2. [Simulation results] Simulation results are invoked to support superiority of LD-H-MoE over baselines, yet the abstract (and presumably the results section) provides no information on the choice of baselines, statistical significance testing, parameter settings, or safeguards against post-hoc selection. This prevents verification that the reported gains in tracking accuracy and queue stability are robust rather than scenario-specific.
minor comments (2)
  1. [Notation and problem formulation] Clarify the precise definition of semantic AoI and its relationship to the constant-modulus constraint in the phase-mapping subnetwork.
  2. [Lyapunov-driven optimization] Add a short discussion of how the Lyapunov drift-plus-penalty parameters are tuned and whether the reported stability holds under channel estimation errors.
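
For context on the tuning question in minor comment 2, the textbook drift-plus-penalty construction is sketched below. The symbols are assumed notation (Q for the computing queue, Z for a virtual energy-deficit queue, p for the per-slot penalty, p* for its optimal time average, B for a finite constant), and the paper's own bounds may take a different form.

```latex
\begin{aligned}
L(t) &= \tfrac{1}{2}\big(Q(t)^2 + Z(t)^2\big), \qquad
\Delta(t) = \mathbb{E}\big[L(t+1) - L(t)\,\big|\,Q(t), Z(t)\big], \\
\text{each slot:}\quad
  & \text{choose the action minimizing an upper bound on }
    \Delta(t) + V\,\mathbb{E}\big[p(t)\,\big|\,Q(t), Z(t)\big], \\
\text{standard guarantee:}\quad
  & \bar{p} \le p^{\star} + \frac{B}{V}, \qquad \bar{Q} + \bar{Z} = O(V).
\end{aligned}
```

Under the usual assumptions (stationary randomness and a Slater-type feasibility condition), a larger V tightens the penalty gap while lengthening the queues, so V is the natural tuning knob. These guarantees hold for exact per-slot minimization; whether they carry over to a learned policy, or to a setting with channel estimation errors, is the open point the comment raises.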

Simulated Author's Rebuttal

2 responses · 0 unresolved

We sincerely thank the referee for the thorough and constructive review. The comments highlight important areas for improving the rigor and reproducibility of the manuscript. We address each major comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract and model formulation] The central claim that semantic AoI 'mathematically bridges' queuing delays to PCRB degradation (abstract) is load-bearing for the MINLP formulation and the subsequent LD-H-MoE policy. No derivation is supplied that starts from the mmWave channel model and explicitly incorporates beamwidth-dependent misalignment variance, Doppler-induced phase drift, or multimodal fusion latency; without this step the objective function risks being a mis-specified surrogate, undermining the claimed superiority in tracking accuracy and RF resilience.

    Authors: We appreciate the referee's point on the need for explicit derivation. The manuscript introduces semantic AoI in Section II as the bridge linking network-layer delays to physical-layer PCRB via the mmWave channel, but we acknowledge that the step-by-step mapping from the channel model (including beamwidth-dependent misalignment, Doppler phase drift, and fusion latency) is not presented with sufficient detail. In the revised version, we will insert a new subsection (e.g., II-C) that starts directly from the mmWave channel model, derives the misalignment variance and phase drift terms, incorporates multimodal latency, and arrives at the semantic AoI expression used in the MINLP objective. This will rigorously justify the formulation without changing the core results.
    Revision: yes

  2. Referee: [Simulation results] Simulation results are invoked to support superiority of LD-H-MoE over baselines, yet the abstract (and presumably the results section) provides no information on the choice of baselines, statistical significance testing, parameter settings, or safeguards against post-hoc selection. This prevents verification that the reported gains in tracking accuracy and queue stability are robust rather than scenario-specific.

    Authors: We agree that the simulation section must provide sufficient detail for independent verification. The current manuscript compares LD-H-MoE against several baselines but does not explicitly list them, report statistical tests, tabulate all parameters, or describe anti-bias safeguards. In the revision, we will expand Section V to: (i) enumerate and justify all baselines with their configurations, (ii) present statistical significance results (e.g., mean and standard deviation over 50 independent runs with 95% confidence intervals), (iii) include a comprehensive parameter table, and (iv) document pre-defined evaluation protocols and fixed random seeds to guard against post-hoc selection. These additions will demonstrate that the reported gains in tracking accuracy and queue stability are robust.
    Revision: yes
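
The reporting promised in item (ii) can be sketched as follows, assuming each run yields a single scalar metric such as time-averaged PCRB. The data here are synthetic placeholders generated from a fixed seed, not results from the paper, and the interval uses a normal approximation (a Student-t quantile would be slightly wider at 50 runs).

```python
import math
import random
import statistics

def mean_ci(samples, level=0.95):
    """Sample mean, sample std, and a normal-approximation confidence interval."""
    n = len(samples)
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)  # unbiased (n - 1) estimator
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)
    half = z * std / math.sqrt(n)
    return mean, std, (mean - half, mean + half)

rng = random.Random(42)                            # fixed seed for reproducibility
runs = [rng.gauss(1.0, 0.1) for _ in range(50)]    # placeholder per-run metric
mean, std, (lo, hi) = mean_ci(runs)
```

Reporting the interval alongside the mean lets a reader judge whether the gap between two methods exceeds run-to-run variation.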

Circularity Check

0 steps flagged

No circularity: derivation applies standard Lyapunov-RL techniques to an explicitly formulated MINLP without self-referential reductions

Full rationale

The paper defines semantic AoI as an explicit modeling construct to link queuing delay to PCRB, states the stochastic MINLP directly from that link plus energy/stability constraints, and solves it with a proposed LD-H-MoE architecture that decouples scheduling and phase mapping. No equation or claim reduces a 'prediction' to a fitted parameter by construction, no uniqueness theorem is imported from self-citation, and no ansatz is smuggled via prior work. Simulation comparisons to baselines remain external to the derivation chain, keeping the argument self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that semantic AoI provides a valid mathematical bridge between delays and uncertainty; no free parameters or invented physical entities are identifiable from the abstract.

axioms (1)
  • Domain assumption: semantic age of information mathematically bridges network-layer queuing delays with physical-layer spatial uncertainty.
    Invoked to formulate the stochastic MINLP problem that trades off PCRB against energy budgets.

pith-pipeline@v0.9.0 · 5553 in / 1276 out tokens · 49227 ms · 2026-05-07T12:46:42.192526+00:00 · methodology

