Curriculum-Guided Heterogeneous Multi-Agent Intelligence for Multi-UAV Cooperative ISAC

Jienan Chen; Jun Liu; Kang Yan; Kang Zheng; Kun Yang; Luping Xiang; Qiang Liu

arxiv: 2605.17905 · v2 · pith:3QDYO5XKnew · submitted 2026-05-18 · 📡 eess.SP

Kang Yan, Luping Xiang, Kang Zheng, Jienan Chen, Jun Liu, Qiang Liu, Kun Yang

Pith reviewed 2026-05-25 06:30 UTC · model grok-4.3

classification 📡 eess.SP

keywords: multi-UAV, ISAC, multi-agent reinforcement learning, curriculum learning, posterior Cramer-Rao bound, trajectory optimization, beamforming

The pith

A curriculum-guided multi-agent learning method lets multiple UAVs and a ground station jointly sense targets and maintain communication links.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a cooperative system in which several UAVs and one ground base station act as heterogeneous agents to perform integrated sensing and communication. The design minimizes a posterior bound on sensing error subject to explicit communication quality constraints through joint control of trajectories and beam patterns. To solve the resulting high-dimensional non-convex problem, the authors introduce a reinforcement-learning procedure that first trains on simpler sub-tasks before advancing to full coordination. Kronecker and QR decompositions shrink the action space so that the policy can be learned efficiently. In simulations the method delivers more than 30 percent better sensing performance, quicker convergence, and improved tracking accuracy compared with prior approaches.
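The "simpler sub-tasks first" idea can be pictured with a small stage scheduler. This is a minimal sketch under stated assumptions, not the paper's procedure: the summary does not describe the actual curriculum stages or the rule for advancing between them, so the stage names, the reward thresholds, and the moving-average advancement rule below are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Stage:
    name: str                 # hypothetical stage label
    reward_threshold: float   # hypothetical average reward needed to advance


class CurriculumScheduler:
    """Advance to the next stage once the recent average reward is high enough."""

    def __init__(self, stages, window=20):
        self.stages = list(stages)
        self.index = 0
        self.recent = deque(maxlen=window)  # rewards seen in the current stage

    @property
    def current(self):
        return self.stages[self.index]

    def report(self, episode_reward):
        """Record one episode reward; return True if the stage advanced."""
        self.recent.append(episode_reward)
        window_full = len(self.recent) == self.recent.maxlen
        is_last = self.index == len(self.stages) - 1
        if window_full and not is_last:
            average = sum(self.recent) / len(self.recent)
            if average >= self.current.reward_threshold:
                self.index += 1
                self.recent.clear()  # start the next stage with a fresh window
                return True
        return False


# Hypothetical three-stage curriculum, easiest task first.
sched = CurriculumScheduler(
    [
        Stage("single_uav_sensing", 1.0),
        Stage("multi_uav_sensing", 2.0),
        Stage("full_isac_with_comm_constraints", float("inf")),
    ],
    window=3,
)
advanced = [sched.report(r) for r in (1.5, 1.5, 1.5)]
print(advanced, sched.current.name)
```

In a training loop, the environment for each episode would be configured from `sched.current`, and `report` would be called with the episode return. Clearing the window on advancement keeps rewards from an easier stage from triggering the next promotion.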

Core claim

The curriculum-based heterogeneous-agent proximal policy optimization algorithm solves the posterior Cramer-Rao bound minimization problem for multi-UAV ISAC under communication constraints, producing more than 30 percent gains in sensing performance and higher tracking accuracy than existing baselines.

What carries the argument

The C-HAPPO algorithm, which uses curriculum learning to refine policies progressively and Kronecker/QR decomposition to reduce action dimensionality in heterogeneous multi-agent settings.
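One way to picture the dimensionality reduction is sketched below. The summary does not give the paper's exact parameterization, so the array sizes, the function names, and the way the two decompositions are used here are assumptions. The sketch relies on a standard fact: the steering vector of an Nx-by-Ny planar array is the Kronecker product of two per-axis vectors, so a policy that outputs the two short factors needs far fewer action dimensions than one that outputs every element weight directly (at the cost of restricting the weights to this separable form). A reduced QR factorization then maps an unconstrained matrix to orthonormal beam columns.

```python
import numpy as np


def kron_beam(wx, wy):
    """Build a unit-norm weight vector of length Nx*Ny from per-axis factors."""
    w = np.kron(wx, wy)
    return w / np.linalg.norm(w)


def qr_orthonormal_beams(raw):
    """Map an unconstrained (N x K) matrix to K orthonormal columns via reduced QR."""
    q, _ = np.linalg.qr(raw)  # default mode is 'reduced', so q is N x K
    return q


rng = np.random.default_rng(0)
nx, ny, k = 8, 8, 3  # hypothetical array size and number of beams

# Per-axis complex factors, as a policy network might output them.
wx = rng.standard_normal(nx) + 1j * rng.standard_normal(nx)
wy = rng.standard_normal(ny) + 1j * rng.standard_normal(ny)
w = kron_beam(wx, wy)

# Real-valued action dimensions (real and imaginary parts counted separately).
full_dim = 2 * nx * ny    # direct per-element weights
kron_dim = 2 * (nx + ny)  # Kronecker-factored weights
print(full_dim, kron_dim)

# Orthonormalize K raw beam columns; the Gram matrix should be the identity.
raw = rng.standard_normal((nx * ny, k)) + 1j * rng.standard_normal((nx * ny, k))
beams = qr_orthonormal_beams(raw)
gram = beams.conj().T @ beams
```

For the 8-by-8 example, the factored action has 32 real dimensions instead of 128, and the saving grows with array size because the factored count scales with Nx + Ny rather than Nx * Ny.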

If this is right

Multi-UAV ISAC systems can maintain required communication rates while achieving higher sensing accuracy through coordinated trajectory and beamforming decisions.
Curriculum learning allows heterogeneous agents to reach stable policies faster when the number of UAVs increases.
The same decomposition techniques could reduce computational cost enough to support real-time execution on embedded UAV processors.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same progressive-training structure could be tested on other multi-agent tasks such as coordinated search-and-rescue or distributed spectrum monitoring.
If the communication constraints are tightened further, the method may need an additional safety layer to guarantee link reliability during early training episodes.

Load-bearing premise

That minimizing the posterior Cramer-Rao bound under communication constraints in simulation produces performance that remains useful once the same algorithm runs on real UAV hardware and radio channels.

What would settle it

A hardware experiment with actual UAVs, measured radar returns, and live communication links in which the proposed method fails to show at least 30 percent sensing improvement over the same baselines.

Figures

Figures reproduced from arXiv: 2605.17905 by Jienan Chen, Jun Liu, Kang Yan, Kang Zheng, Kun Yang, Luping Xiang, Qiang Liu.

**Figure 2.** Operation process of the C-HAPPO algorithm for the proposed system model.

**Figure 3.** Reward performance of the C-HAPPO algorithm with different …

**Figure 4.** Reward performance of the C-HAPPO algorithm under different …

**Figure 5.** Communication SINR constraint violation rate under different …

**Figure 6.** Reward convergence of the C-HAPPO algorithm compared with …

**Figure 7.** Average sensing performance of the C-HAPPO algorithm compared …

**Figure 8.** Comparison of predicted and actual locations of the sensed target with different methods: …

**Figure 9.** Flight trajectories of UAVs and the sensed target of the C-HAPPO …

**Figure 10.** Ablation experiment performance analysis for reward.

Original abstract

Seamlessly unifying communication and sensing, sixth-generation (6G) networks are poised to transform into intelligent platforms with high spectral-energy efficiency and real-time environmental awareness. In the low-altitude economy, unmanned aerial vehicles (UAVs) enable air-ground integrated sensing and communication (ISAC) for applications such as logistics and inspection, yet most studies focus on single-UAV or homogeneous-agent designs. In contrast, this paper proposes a multi-UAV cooperative ISAC system that enables heterogeneous-agent collaboration between multiple UAVs and a ground base station (BS) for joint target sensing, tracking, and communication. The system is formulated as a posterior Cramer-Rao bound (PCRB) minimization problem under communication performance constraints, utilizing joint trajectory-beamforming optimization. To tackle the NP-hard nature of this problem, we design a curriculum-based heterogeneous-agent proximal policy optimization (C-HAPPO) algorithm, where curriculum learning guides progressive policy refinement and Kronecker/QR decomposition mitigates action dimensionality. Simulation results show that the proposed approach achieves more than a 30% improvement in sensing performance, faster convergence, and higher tracking accuracy than existing baselines, demonstrating its scalability and effectiveness for complex multi-UAV ISAC scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This applies curriculum learning plus heterogeneous PPO to multi-UAV ISAC trajectory-beamforming, with simulation gains on PCRB, but the sensing claims need actual estimator checks.

The main takeaway is that the work takes curriculum-guided heterogeneous PPO (C-HAPPO) and applies it to joint trajectory and beamforming design across multiple UAVs plus a ground station, minimizing PCRB subject to communication constraints. That combination for heterogeneous multi-UAV ISAC is the concrete step forward; most prior RL work on UAV ISAC stays with single agents or homogeneous teams. The paper also uses Kronecker and QR decompositions to cut action space size, which is a practical move for scaling the policy. Simulations report over 30% better sensing performance, quicker convergence, and improved tracking versus baselines, which is the sort of result that could interest people working on low-altitude 6G scenarios.

The soft spot is the reliance on PCRB as both the training objective and the reported metric. The stress-test note is right: without showing how close an actual tracker (EKF or particle filter) gets to the bound, or whether the relative gain survives in empirical MSE, the headline number is harder to interpret. The abstract and available description give no error bars, no statistical tests, and limited baseline detail, so the evidence stays simulation-only and somewhat opaque. No real-world data or hardware loop is mentioned.

This is the kind of targeted systems paper that could fit a specialized conference or journal track on UAV communications or RL for wireless. Readers already working on multi-agent ISAC or curriculum methods for control would get the most out of the implementation choices and the reported numbers. It is coherent on its own terms and engages the literature, so it deserves a serious referee rather than a desk reject, even if the experiments will need tightening.

Referee Report

2 major / 1 minor

Summary. The paper proposes a multi-UAV cooperative ISAC system with heterogeneous agents (UAVs and ground BS) for joint target sensing, tracking, and communication. It formulates the problem as posterior Cramér-Rao bound (PCRB) minimization under communication constraints via joint trajectory-beamforming optimization and solves it with a curriculum-based heterogeneous-agent proximal policy optimization (C-HAPPO) algorithm that incorporates curriculum learning and Kronecker/QR decomposition. Simulations are reported to yield >30% sensing improvement, faster convergence, and higher tracking accuracy versus baselines.

Significance. If the simulation results hold under rigorous verification, the work would provide a concrete demonstration of curriculum-guided multi-agent RL for scalable multi-UAV ISAC, addressing a gap between single-UAV/homogeneous designs and heterogeneous cooperation. The use of PCRB as the optimization objective is a standard choice in the field, but the absence of supporting experimental details limits the ability to judge whether the claimed gains advance practical ISAC performance.

major comments (2)

[Abstract] Abstract (paragraph on system formulation and algorithm design): The headline claim of >30% sensing improvement rests on PCRB minimization, yet the manuscript provides no verification that realized estimation error (e.g., from an EKF or particle filter) attains or tracks the reported PCRB reduction. Because PCRB is only a lower bound, any gap between the bound and empirical MSE would directly weaken the practical significance of the performance numbers.
[Abstract] Abstract (simulation results paragraph): No error bars, baseline implementation details, dataset descriptions, or statistical tests are supplied for the reported >30% improvement, faster convergence, and higher tracking accuracy. Without these, the central empirical claim cannot be assessed for reproducibility or statistical reliability.
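The check requested in the first major comment can be illustrated on a toy problem. The sketch below is an assumption-laden example, not the paper's sensing model: it uses a one-dimensional, nearly-constant-velocity target with linear position measurements, and every numerical value is hypothetical. For such a linear Gaussian model, the standard information-form recursion gives the PCRB and the Kalman filter attains it, so the ratio of empirical MSE to the bound should sit near 1. With a nonlinear radar measurement model, an EKF or particle filter can sit above the bound, and that gap is what the comment asks the authors to report.

```python
import numpy as np


def pcrb_trace(F, Q, H, R, P0, steps):
    """Trace of the PCRB (inverse Bayesian information) after each of `steps` updates."""
    J = np.linalg.inv(P0)
    R_inv = np.linalg.inv(R)
    out = []
    for _ in range(steps):
        # Linear Gaussian recursion: predict the information, then add the measurement term.
        J = np.linalg.inv(Q + F @ np.linalg.inv(J) @ F.T) + H.T @ R_inv @ H
        out.append(np.trace(np.linalg.inv(J)))
    return np.array(out)


def kf_empirical_mse(F, Q, H, R, m0, P0, steps, runs, rng):
    """Monte Carlo total squared error of a Kalman filter, vectorized over runs."""
    n = len(m0)
    x = rng.multivariate_normal(m0, P0, size=runs)  # true states, one row per run
    xh = np.tile(m0, (runs, 1))                     # filter estimates
    P = P0.copy()
    mse = []
    for _ in range(steps):
        # Simulate the true dynamics and the noisy measurement.
        x = x @ F.T + rng.multivariate_normal(np.zeros(n), Q, size=runs)
        z = x @ H.T + rng.multivariate_normal(np.zeros(H.shape[0]), R, size=runs)
        # Kalman predict and update (the gain does not depend on the data).
        xh = xh @ F.T
        P = F @ P @ F.T + Q
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        xh = xh + (z - xh @ H.T) @ K.T
        P = (np.eye(n) - K @ H) @ P
        mse.append(np.mean(np.sum((x - xh) ** 2, axis=1)))
    return np.array(mse)


# Hypothetical model parameters (position-velocity state, position-only measurement).
T, q = 1.0, 0.1
F = np.array([[1.0, T], [0.0, 1.0]])
Q = q * np.array([[T**3 / 3, T**2 / 2], [T**2 / 2, T]])
H = np.array([[1.0, 0.0]])
R = np.array([[4.0]])
m0 = np.array([0.0, 1.0])
P0 = np.diag([10.0, 1.0])

bound = pcrb_trace(F, Q, H, R, P0, steps=30)
mse = kf_empirical_mse(F, Q, H, R, m0, P0, steps=30, runs=4000,
                       rng=np.random.default_rng(1))
ratio = mse / bound  # near 1 here; a nonlinear tracker would be compared the same way
print(np.round(ratio[[0, 9, 29]], 3))
```

Reporting this ratio for the proposed method and for each baseline would show whether a relative gain in the bound carries over to realized tracking error.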

minor comments (1)

[Abstract] Abstract: Kronecker/QR decomposition is mentioned as a means of action dimensionality reduction, but it is not connected to the specific steps inside the C-HAPPO policy update or the curriculum schedule.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point by point below, indicating planned revisions where appropriate.

Referee: [Abstract] Abstract (paragraph on system formulation and algorithm design): The headline claim of >30% sensing improvement rests on PCRB minimization, yet the manuscript provides no verification that realized estimation error (e.g., from an EKF or particle filter) attains or tracks the reported PCRB reduction. Because PCRB is only a lower bound, any gap between the bound and empirical MSE would directly weaken the practical significance of the performance numbers.

Authors: We agree that PCRB is a lower bound and that explicit comparison to realized MSE from an estimator such as EKF would provide stronger practical validation. In the ISAC literature, however, direct optimization and reporting of PCRB is standard because it yields a tractable, estimator-independent metric that lower-bounds achievable performance. Our simulations therefore quantify improvement in this bound under the stated constraints. In revision we will add an explicit statement in the abstract and simulation section clarifying that all reported sensing gains refer to PCRB reduction, together with a short discussion (with citations) of why PCRB minimization is the conventional objective in comparable trajectory-beamforming studies.

revision: partial

Referee: [Abstract] Abstract (simulation results paragraph): No error bars, baseline implementation details, dataset descriptions, or statistical tests are supplied for the reported >30% improvement, faster convergence, and higher tracking accuracy. Without these, the central empirical claim cannot be assessed for reproducibility or statistical reliability.

Authors: We accept this criticism. The current manuscript reports mean performance but omits variability measures and expanded implementation details. In the revised version we will (i) add error bars (standard deviation across independent random seeds) to all figures, (ii) expand the simulation-setup subsection with full hyper-parameter tables for both our algorithm and the baselines, (iii) provide a complete description of the custom simulation environment (no external public dataset is used), and (iv) include results of paired statistical tests or confidence intervals to support the significance of the reported gains.

revision: yes
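Items (i) and (iv) of this response could be reported along the following lines. This is only a sketch: the per-seed values are synthetic placeholders invented for illustration and are not results from the paper, and the percentile bootstrap over paired seeds is one reasonable choice of interval, not necessarily the one the authors would use. The function name `paired_bootstrap_ci` is likewise hypothetical.

```python
import numpy as np


def paired_bootstrap_ci(a, b, n_boot=10_000, alpha=0.05, seed=0):
    """Mean paired difference (a - b) with a percentile bootstrap confidence interval."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("paired samples must have the same shape")
    d = a - b
    rng = np.random.default_rng(seed)
    # Resample seeds with replacement, keeping each (a, b) pair together.
    idx = rng.integers(0, d.size, size=(n_boot, d.size))
    boot_means = d[idx].mean(axis=1)
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return d.mean(), lo, hi


# Synthetic placeholder metrics, one value per random seed (lower is better).
baseline = [2.1, 1.9, 2.0, 2.2, 1.8, 2.0]
proposed = [1.8, 1.7, 1.6, 1.9, 1.6, 1.7]

# Error-bar inputs: mean and sample standard deviation across seeds.
summary = {
    "baseline": (np.mean(baseline), np.std(baseline, ddof=1)),
    "proposed": (np.mean(proposed), np.std(proposed, ddof=1)),
}

mean_diff, ci_lo, ci_hi = paired_bootstrap_ci(proposed, baseline)
print(summary)
print(mean_diff, ci_lo, ci_hi)
```

Pairing by seed matters when both methods are run on the same randomized scenarios, since it removes scenario-to-scenario variation from the comparison. An interval that excludes zero supports a claimed gain more convincingly than a difference of means alone.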

Circularity Check

0 steps flagged

No significant circularity detected

The provided abstract formulates the multi-UAV ISAC task as a PCRB minimization problem solved by the C-HAPPO algorithm and reports comparative simulation outcomes (e.g., >30% sensing improvement). No equations, derivation steps, or self-citations are shown that would allow identification of self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citation chains. The performance numbers are presented as empirical results of running the proposed optimizer against baselines on the stated objective; this is a standard simulation comparison and does not reduce the claimed result to its inputs by construction. The derivation chain is therefore treated as self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated or derivable from the provided text.

pith-pipeline@v0.9.0 · 5758 in / 1104 out tokens · 34223 ms · 2026-05-25T06:30:05.677469+00:00 · methodology


work page 2024

[19] [19]

Efficient UA V hovering, resource allocation, and trajectory design for ISAC with limited backhaul capacity,

A. Khalili, A. Rezaei, D. Xu, F. Dressler, and R. Schober, “Efficient UA V hovering, resource allocation, and trajectory design for ISAC with limited backhaul capacity,”IEEE Transactions on Wireless Communica- tions, 2024

work page 2024

[20] [20]

ISAC enabled cooperative detection for cellular-connected UA V network,

Y . Wang, K. Zu, L. Xiang, Q. Zhang, Z. Feng, J. Hu, and K. Yang, “ISAC enabled cooperative detection for cellular-connected UA V network,” IEEE Transactions on Wireless Communications, 2024

work page 2024

[21] [21]

Markov decision processes,

F. Garcia and E. Rachelson, “Markov decision processes,”Markov Decision Processes in Artificial Intelligence, pp. 1–38, 2013

work page 2013

[22] [22]

Radio resource management for cellular- connected UA V: A learning approach,

Y . Li and A. H. Aghvami, “Radio resource management for cellular- connected UA V: A learning approach,”IEEE Transactions on Commu- nications, vol. 71, no. 5, pp. 2784–2800, 2023

work page 2023

[23] [23]

Path planning for cellular- connected UA V: A DRL solution with quantum-inspired experience replay,

Y . Li, A. H. Aghvami, and D. Dong, “Path planning for cellular- connected UA V: A DRL solution with quantum-inspired experience replay,”IEEE Transactions on Wireless Communications, vol. 21, no. 10, pp. 7897–7912, 2022

work page 2022

[24] [24]

Energy-efficient UA V-driven multi-access edge computing: a distributed many-agent perspective,

Y . Li, A. Madhukumar, T. Z. H. Ernest, G. Zheng, W. Saad, and A. H. Aghvami, “Energy-efficient UA V-driven multi-access edge computing: a distributed many-agent perspective,”IEEE Transactions on Communi- cations, 2025

work page 2025

[25] [25]

MARL based UA Vs’ trajectory and beamforming optimization for ISAC system,

Q. Gao, R. Zhong, H. Shin, and Y . Liu, “MARL based UA Vs’ trajectory and beamforming optimization for ISAC system,”IEEE Internet of Things Journal, 2024

work page 2024

[26] [26]

Distributed UA V swarm for device-free integrated sensing and communication relying on multi-agent reinforcement learning,

Z. Xie, Z. Wang, Z. Zhang, J. Wang, Z. Jiang, and Z. Han, “Distributed UA V swarm for device-free integrated sensing and communication relying on multi-agent reinforcement learning,”IEEE Transactions on Vehicular Technology, 2024

work page 2024

[27] [27]

Joint UA V trajectory and radcom task schedule for IVNs: A game-embedding multi-agent deep reinforcement learning approach,

S. Cheng, X. Lin, X. Li, and J. Wang, “Joint UA V trajectory and radcom task schedule for IVNs: A game-embedding multi-agent deep reinforcement learning approach,”IEEE Transactions on Wireless Com- munications, 2024

work page 2024

[28] [28]

Deep reinforce- ment learning based resource allocation and trajectory planning in inte- grated sensing and communications UA V network,

Y . Qin, Z. Zhang, X. Li, W. Huangfu, and H. Zhang, “Deep reinforce- ment learning based resource allocation and trajectory planning in inte- grated sensing and communications UA V network,”IEEE Transactions on Wireless Communications, vol. 22, no. 11, pp. 8158–8169, 2023

work page 2023

[29] [29]

AoI-aware air- ground mobile crowdsensing by multi-agent curriculum learning with collaborative observation augmentation,

Y . Ye, Y . Tian, C. H. Liu, L. Dong, G. Qi, and D. Wu, “AoI-aware air- ground mobile crowdsensing by multi-agent curriculum learning with collaborative observation augmentation,”IEEE Transactions on Mobile Computing, no. 01, pp. 1–13, 2025

work page 2025

[30] [30]

Heterogeneous-agent reinforcement learning,

Y . Zhong, J. G. Kuba, X. Feng, S. Hu, J. Ji, and Y . Yang, “Heterogeneous-agent reinforcement learning,”Journal of Machine Learning Research, vol. 25, no. 32, pp. 1–67, 2024

work page 2024

[31] [31]

Joint transmit designs for coexistence of MIMO wireless communications and sparse sensing radars in clutter,

B. Li and A. P. Petropulu, “Joint transmit designs for coexistence of MIMO wireless communications and sparse sensing radars in clutter,” IEEE Transactions on Aerospace and Electronic Systems, vol. 53, no. 6, pp. 2846–2864, 2017

work page 2017

[32] [32]

Optimal training for residual self-interference for full-duplex one-way relays,

X. Li, C. Tepedelenlio ˘glu, and H. S ¸enol, “Optimal training for residual self-interference for full-duplex one-way relays,”IEEE Transactions on Communications, vol. 66, no. 12, pp. 5976–5989, 2018

work page 2018

[33] [33]

Sensing as 14 a service in 6G perceptive networks: A unified framework for ISAC resource allocation,

F. Dong, F. Liu, Y . Cui, W. Wang, K. Han, and Z. Wang, “Sensing as 14 a service in 6G perceptive networks: A unified framework for ISAC resource allocation,”IEEE Transactions on Wireless Communications, vol. 22, no. 5, pp. 3522–3536, 2022

work page 2022

[34] [34]

Radar-assisted predictive beamforming for vehicular links: Communication served by sensing,

F. Liu, W. Yuan, C. Masouros, and J. Yuan, “Radar-assisted predictive beamforming for vehicular links: Communication served by sensing,” IEEE Transactions on Wireless Communications, vol. 19, no. 11, pp. 7704–7719, 2020

work page 2020

[35] [35]

Industry tip: Picking the minimum process noise variance for your NCV track filter,

W. Blair, “Industry tip: Picking the minimum process noise variance for your NCV track filter,”IEEE Aerospace and Electronic Systems Magazine, vol. 36, no. 2, pp. 72–74, 2021

work page 2021

[36] [36]

High-Dimensional Continuous Control Using Generalized Advantage Estimation

J. Schulman, P. Moritz, S. Levine, M. Jordan, and P. Abbeel, “High- dimensional continuous control using generalized advantage estimation,” arXiv preprint arXiv:1506.02438, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015

[37] [37]

Collaborative reinforcement learning based unmanned aerial vehicle (UA V) trajectory design for 3D UA V tracking,

Y . Zhu, M. Chen, S. Wang, Y . Hu, Y . Liu, and C. Yin, “Collaborative reinforcement learning based unmanned aerial vehicle (UA V) trajectory design for 3D UA V tracking,”IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 10 787–10 802, 2024

work page 2024

[38] [38]

Technical specification group radio access network: Study on enhanced LTE support for aerial vehicles,

J. Meredith, “Technical specification group radio access network: Study on enhanced LTE support for aerial vehicles,” 2015

work page 2015

[39] [39]

A scheme for robust distributed sensor fusion based on average consensus,

L. Xiao, S. Boyd, and S. Lall, “A scheme for robust distributed sensor fusion based on average consensus,” inIPSN 2005. Fourth International Symposium on Information Processing in Sensor Networks, 2005.IEEE, 2005, pp. 63–70

work page 2005

[40] [40]

Genetic algorithms,

J. H. Holland, “Genetic algorithms,”Scientific american, vol. 267, no. 1, pp. 66–73, 1992

work page 1992