pith. machine review for the scientific record.

arxiv: 2604.07687 · v1 · submitted 2026-04-09 · 💻 cs.LG · cs.AI

Recognition: no theorem link

Joint Task Offloading, Inference Optimization and UAV Trajectory Planning for Generative AI Empowered Intelligent Transportation Digital Twin

Pith reviewed 2026-05-10 18:11 UTC · model grok-4.3

keywords UAV trajectory planning · task offloading · generative AI · digital twin · diffusion models · reinforcement learning · intelligent transportation · Markov decision process

The pith

A reinforcement learning algorithm jointly optimizes UAV task offloading, diffusion inference, and trajectories to maximize digital twin utility.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to solve the challenge of maintaining high-fidelity, low-delay digital twins for intelligent transportation by having UAVs run generative AI models on sensor data. It sets up a joint optimization of offloading decisions, inference parameters, and flight paths as a system utility maximization problem balancing fidelity and delay. By modeling this as a heterogeneous-agent Markov decision process, the authors develop the SU-HATD3 algorithm based on deep reinforcement learning. Numerical experiments indicate that this method delivers higher utility and quicker convergence compared to baseline algorithms. Readers would care because effective digital twins can support better traffic management and safety.

Core claim

The authors claim that the system utility maximization problem for GAI-empowered ITDT, involving DMI task offloading, inference optimization, and UAV trajectory planning, can be effectively addressed by formulating it as a heterogeneous-agent MDP and solving it with the proposed SU-HATD3 algorithm, which achieves superior system utility and faster convergence than several baseline algorithms.

What carries the argument

The SU-HATD3 algorithm, a sequential update-based heterogeneous-agent twin delayed deep deterministic policy gradient method that learns near-optimal policies for the joint decisions under dynamic conditions.
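
To make the "twin delayed" part of the name concrete, here is a minimal sketch of the standard TD3 target computation (clipped double-Q with target policy smoothing) together with the delayed actor update schedule. It illustrates generic TD3 only, not the paper's SU-HATD3 implementation; the function names, the toy hand-written critics, and every numeric value are assumptions made for illustration.

```python
import random

def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=-1.0, act_high=1.0, rng=random):
    """Standard TD3 target: r + gamma * min(Q1', Q2') at a smoothed target action."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    next_action = max(act_low, min(act_high, target_actor(next_state) + noise))
    # Clipped double-Q: use the smaller of the two target critics to curb overestimation.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    return reward + gamma * (1.0 - float(done)) * q_next

def should_update_actor(step, policy_delay=2):
    """Delayed policy update: the actor is updated once every `policy_delay` critic updates."""
    return step % policy_delay == 0

# Toy stand-ins (hypothetical): scalar state and action, hand-written critics.
actor = lambda s: 0.5 * s
q1 = lambda s, a: s + a
q2 = lambda s, a: s + a - 0.3   # the more pessimistic critic

y = td3_target(reward=1.0, next_state=0.4, done=False,
               target_actor=actor, target_q1=q1, target_q2=q2,
               noise_std=0.0)   # noise disabled so the value can be checked by hand
print(round(y, 4))
```

With the noise disabled, the target reduces to the reward plus the discounted value of the more pessimistic critic, which is the mechanism TD3 uses to limit value overestimation.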

If this is right

  • UAVs can dynamically adapt offloading and flight paths to maintain a better fidelity-delay tradeoff as the environment changes.
  • Improved convergence means the system can respond faster to new tasks or mobility changes.
  • The joint approach avoids suboptimal separate optimizations of offloading, inference, and trajectories.
  • Higher utility means the raw sensing data is transformed into higher-fidelity digital twin updates at lower delay.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar joint optimization frameworks could apply to other edge AI scenarios with mobile agents like autonomous vehicles.
  • Testing the approach with real sensor data and actual diffusion model runtimes on UAV hardware would strengthen the results.
  • If the utility function is adjusted for different priorities, the algorithm might generalize to other digital twin applications.
  • The heterogeneous agent modeling suggests scalability to multiple UAVs working together.

Load-bearing premise

The heterogeneous-agent MDP formulation and the chosen utility function based on fidelity-delay tradeoff accurately reflect the actual network dynamics, channel conditions, and task statistics in real transportation environments.
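
How much this premise matters can be seen in a toy model. The sketch below assumes a weighted-sum utility, a fidelity that saturates with the number of diffusion denoising steps, and a delay that grows linearly with it. None of these functional forms, weights, or constants come from the paper; they are hypothetical placeholders chosen only to show how the preferred operating point moves with the weighting.

```python
import math

def fidelity(steps, k=0.15):
    """Assumed saturating fidelity in [0, 1): more denoising steps help, with diminishing returns."""
    return 1.0 - math.exp(-k * steps)

def delay(steps, sec_per_step=0.05, tx_delay=0.2):
    """Assumed delay model: fixed transmission delay plus per-step inference time."""
    return tx_delay + sec_per_step * steps

def utility(steps, w_fid, w_delay, penalty="linear"):
    """Weighted fidelity reward minus a linear or logarithmic delay penalty."""
    d = delay(steps)
    cost = d if penalty == "linear" else math.log1p(d)
    return w_fid * fidelity(steps) - w_delay * cost

def best_steps(w_fid, w_delay, penalty="linear", max_steps=50):
    """Brute-force search for the step count with the highest utility."""
    return max(range(1, max_steps + 1),
               key=lambda s: utility(s, w_fid, w_delay, penalty))

# Sweep the delay weight and the penalty shape (all values hypothetical).
for w_delay in (0.5, 1.0, 2.0):
    for penalty in ("linear", "log"):
        print(w_delay, penalty, best_steps(1.0, w_delay, penalty))
```

In this toy model, a heavier delay weight pushes the best step count down, so any ranking of algorithms is tied to the weights and penalty shape used to score them.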

What would settle it

Deploying the SU-HATD3 algorithm on physical UAVs processing real roadside sensor data and measuring actual digital twin update fidelity and delays against baselines; if no significant improvement is observed, the claim would be falsified.

Figures

Figures reproduced from arXiv: 2604.07687 by Bingqi Zhang, Junchuan Fan, Qian Chen, Rong Yu, Xiaohuan Li, Xumin Huang.

Figure 1: System model.
Figure 2: The diagram of the proposed SU-HATD3 algorithm.
Figure 3: Convergence analysis under different algorithms, learning rates, and scenario scales.
Figure 4: System utility of various algorithms under different settings.
Figure 5: Comparison of the average fidelity and average delay for 5 UAVs and …
Figure 6: UAV movement trajectory with N = 2, R = 40 (the start and end points are marked with text and red stars, respectively).
Original abstract

To implement the intelligent transportation digital twin (ITDT), unmanned aerial vehicles (UAVs) are scheduled to process the sensing data from the roadside sensors. At this time, generative artificial intelligence (GAI) technologies such as diffusion models are deployed on the UAVs to transform the raw sensing data into high-quality and valuable data. Therefore, we propose the GAI-empowered ITDT. The dynamic processing of a set of diffusion model inference (DMI) tasks on the UAVs with dynamic mobility simultaneously influences the DT updating fidelity and delay. In this paper, we investigate a joint optimization problem of DMI task offloading, inference optimization and UAV trajectory planning as the system utility maximization (SUM) problem to address the fidelity-delay tradeoff for the GAI-empowered ITDT. To seek a solution to the problem under the network dynamics, we model the SUM problem as the heterogeneous-agent Markov decision process, and propose the sequential update-based heterogeneous-agent twin delayed deep deterministic policy gradient (SU-HATD3) algorithm, which can quickly learn a near-optimal solution. Numerical results demonstrate that compared with several baseline algorithms, the proposed algorithm has great advantages in improving the system utility and convergence rate.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper claims that modeling the joint problem of DMI task offloading, inference optimization, and UAV trajectory planning as a system utility maximization (SUM) objective in a GAI-empowered intelligent transportation digital twin can be solved via a heterogeneous-agent MDP formulation, for which the proposed SU-HATD3 algorithm yields higher system utility and faster convergence than baselines in numerical simulations.

Significance. If the simulation results prove robust, the work would offer a concrete multi-agent RL method (SU-HATD3) for handling the fidelity-delay tradeoff in UAV-assisted generative-AI digital twins, extending TD3-style algorithms to heterogeneous agents with sequential updates. The numerical demonstration of improved convergence is a modest algorithmic contribution, but the absence of real-world traces, ablation studies, or sensitivity analysis on modeling choices limits broader significance to the specific simulation setting.
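
The "sequential updates" idea from heterogeneous-agent RL can be sketched on a toy problem: agents are updated one at a time in a shuffled order, and each update sees the already-updated actions of the agents before it. The quadratic joint critic, the scalar per-agent actions standing in for policies, and the step size below are illustrative assumptions; this is not the paper's SU-HATD3 update rule.

```python
import random

def joint_q(actions):
    """Toy joint critic (assumed): actions should sum to 1 with small individual effort."""
    total = sum(actions)
    return -(total - 1.0) ** 2 - 0.1 * sum(a * a for a in actions)

def grad_wrt_agent(actions, i):
    """Analytic partial derivative of joint_q with respect to agent i's action."""
    return -2.0 * (sum(actions) - 1.0) - 0.2 * actions[i]

def sequential_update(actions, lr=0.1, rng=random):
    """One round: update agents one by one in random order, reusing earlier agents' new actions."""
    order = list(range(len(actions)))
    rng.shuffle(order)
    for i in order:
        # The gradient is evaluated at the current joint action, which already
        # contains the updated actions of agents earlier in this round's order.
        actions[i] += lr * grad_wrt_agent(actions, i)
    return actions

rng = random.Random(0)
actions = [0.0, 0.0, 0.0]   # three agents, each with a scalar action
before = joint_q(actions)
for _ in range(200):
    sequential_update(actions, rng=rng)
after = joint_q(actions)
print(before, after, actions)
```

Because each agent moves against the latest joint action rather than a stale snapshot, every individual step in this toy setting improves the shared objective, which is the intuition usually given for sequential schemes.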

major comments (3)
  1. [Numerical Results] Numerical Results section: the central claim that SU-HATD3 'has great advantages' in system utility and convergence rate rests on simulations whose network parameters, baseline implementations, number of runs, and statistical significance (error bars) are not reported, so the evidence for superiority cannot be evaluated.
  2. [Problem formulation] Problem formulation (SUM objective): the utility function encodes a fidelity-delay tradeoff whose precise weighting and functional forms are chosen by the authors; performance gains are therefore measured against a metric that can be tuned to favor the proposed algorithm, creating a circularity that requires explicit sensitivity analysis or alternative weightings to resolve.
  3. [MDP modeling] Heterogeneous-agent MDP modeling: the formulation assumes specific task statistics, channel dynamics, and diffusion-model inference latencies without any sensitivity analysis or real-world trace validation; the reported ranking of SU-HATD3 could therefore be an artifact of the chosen simulation parameters rather than a robust property of the algorithm.
minor comments (1)
  1. [Abstract] Abstract: the phrase 'several baseline algorithms' should name the specific baselines used so readers can immediately assess the comparison.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment point by point below, indicating planned revisions where appropriate.

point-by-point responses
  1. Referee: [Numerical Results] Numerical Results section: the central claim that SU-HATD3 'has great advantages' in system utility and convergence rate rests on simulations whose network parameters, baseline implementations, number of runs, and statistical significance (error bars) are not reported, so the evidence for superiority cannot be evaluated.

    Authors: We agree that the current presentation of numerical results is insufficient for rigorous evaluation. In the revised manuscript, we will add a dedicated table listing all network parameters, UAV dynamics, diffusion model settings, and simulation hyperparameters. We will also provide pseudocode or explicit implementation details for each baseline, report all results as averages over 20 independent runs with error bars indicating standard deviation, and include pairwise statistical significance tests (e.g., Welch's t-test) to support the claimed advantages. revision: yes

  2. Referee: [Problem formulation] Problem formulation (SUM objective): the utility function encodes a fidelity-delay tradeoff whose precise weighting and functional forms are chosen by the authors; performance gains are therefore measured against a metric that can be tuned to favor the proposed algorithm, creating a circularity that requires explicit sensitivity analysis or alternative weightings to resolve.

    Authors: The utility function is derived from the fidelity-delay requirements of GAI-empowered intelligent transportation digital twins, with weights selected to reflect typical operational priorities in the literature. To eliminate any appearance of circularity, we will add a sensitivity analysis subsection that varies the weighting coefficients over a wide range and tests alternative functional forms (e.g., linear vs. logarithmic delay penalties). The revised results will demonstrate that SU-HATD3 maintains its relative advantage across these variations. revision: yes

  3. Referee: [MDP modeling] Heterogeneous-agent MDP modeling: the formulation assumes specific task statistics, channel dynamics, and diffusion-model inference latencies without any sensitivity analysis or real-world trace validation; the reported ranking of SU-HATD3 could therefore be an artifact of the chosen simulation parameters rather than a robust property of the algorithm.

    Authors: The MDP formulation employs standard models for Poisson task arrivals, Rayleigh fading channels, and diffusion inference latencies drawn from prior UAV and generative-AI studies. While the work is simulation-based and does not incorporate proprietary real-world traces, we will include a new sensitivity study that perturbs task rates, channel coherence times, and inference latency distributions by ±30%. This analysis will confirm that the performance ordering remains consistent. We will also explicitly discuss the simulation assumptions and their limitations in the revised text. revision: partial
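
The statistical reporting proposed in the first response (mean and standard deviation over independent runs, plus Welch's t-test) can be sketched with the standard library alone. The per-run utilities below are invented placeholders, not results from the paper, and the helper name `welch_t` is an assumption. The sketch computes the t statistic and the Welch-Satterthwaite degrees of freedom; turning these into a p-value additionally needs a t-distribution CDF.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variances (n - 1 denominator); the two groups need not share a variance.
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se_a, se_b = var_a / na, var_b / nb
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (na - 1) + se_b ** 2 / (nb - 1))
    return t, df

# Placeholder per-run utilities (invented for illustration, NOT from the paper).
proposed = [10.2, 10.8, 9.9, 10.5, 10.6]
baseline = [9.1, 9.7, 8.8, 9.9, 9.0]

t, df = welch_t(proposed, baseline)
print(f"proposed: {statistics.mean(proposed):.2f} ± {statistics.stdev(proposed):.2f}")
print(f"baseline: {statistics.mean(baseline):.2f} ± {statistics.stdev(baseline):.2f}")
print(f"Welch t = {t:.3f}, df = {df:.2f}")
```

Welch's variant is a reasonable default here because different RL algorithms often show very different run-to-run variance, which the pooled-variance t-test does not allow for.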

standing simulated objections not resolved
  • Provision of real-world traces for validation, as the study is conducted entirely in simulation due to the unavailability of suitable public UAV-GAI datasets.

Circularity Check

0 steps flagged

No circularity; standard RL simulation evaluation on author-defined utility

full rationale

The paper formulates a joint optimization as a system utility maximization problem, casts it as a heterogeneous-agent MDP, introduces the SU-HATD3 algorithm to solve it, and reports numerical simulation results showing higher utility and faster convergence than baselines. This chain is self-contained: the utility function is an explicit modeling choice, the MDP is a direct translation of the problem, and the numerical results are direct empirical outcomes of running the proposed optimizer versus alternatives on that same model. No step reduces by construction to a prior fit, self-citation, or renamed input; the reported superiority is an observable simulation outcome rather than a definitional tautology. No load-bearing self-citations or uniqueness theorems appear in the provided text.

Axiom & Free-Parameter Ledger

3 free parameters · 2 axioms · 0 invented entities

The central claim depends on several modeling choices and hyperparameters that are not derived from first principles or external data.

free parameters (3)
  • utility weights between fidelity and delay
    The system utility function combines fidelity and delay; the relative weighting is chosen by the authors to produce the reported gains.
  • diffusion model inference step count and offloading thresholds
    These control the fidelity-delay tradeoff and are optimized inside the RL policy but ultimately tuned to the simulation environment.
  • learning rates and network sizes for SU-HATD3
    Standard RL hyperparameters that are fitted to achieve the claimed convergence and utility improvements.
axioms (2)
  • domain assumption The heterogeneous-agent MDP accurately represents the joint dynamics of task arrivals, wireless channels, UAV mobility, and diffusion inference latency.
    Invoked when the SUM problem is cast as an MDP; no external validation or sensitivity analysis is provided.
  • domain assumption Baseline algorithms are implemented with comparable hyperparameter tuning effort.
    Required for the claim of 'great advantages' over baselines.
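
The simulated rebuttal names Poisson task arrivals and Rayleigh fading channels and proposes perturbing parameters by ±30%. A minimal sketch of how such simulation inputs might be generated and perturbed follows, using only the standard library. The base arrival rate, the mean channel power, and all helper names are hypothetical; the paper's actual simulation settings are not reported on this page.

```python
import math
import random

def poisson_sample(rate, rng):
    """Draw one Poisson(rate) count with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def rayleigh_power_gain(mean_power, rng):
    """Channel power gain |h|^2 under Rayleigh fading is exponentially distributed."""
    return rng.expovariate(1.0 / mean_power)

def simulate(rate, mean_power, slots, seed):
    """Average task arrivals per slot and average channel power gain over `slots` time slots."""
    rng = random.Random(seed)
    arrivals = [poisson_sample(rate, rng) for _ in range(slots)]
    gains = [rayleigh_power_gain(mean_power, rng) for _ in range(slots)]
    return sum(arrivals) / slots, sum(gains) / slots

BASE_RATE, BASE_POWER = 3.0, 1.0    # hypothetical nominal values
for scale in (0.7, 1.0, 1.3):       # -30%, nominal, +30%
    avg_arrivals, avg_gain = simulate(BASE_RATE * scale, BASE_POWER,
                                      slots=20000, seed=1)
    print(f"scale={scale}: arrivals/slot={avg_arrivals:.3f}, mean gain={avg_gain:.3f}")
```

A sensitivity study along these lines would rerun each algorithm under every perturbed setting and check whether the performance ordering holds, which is what the domain assumption above leaves untested.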

