SkySeg: Collaborative Onboard Semantic Segmentation with Heterogeneous UAVs in the Wild

Anqi Lu; Jie Liu; Youbing Hu; Yun Cheng; Zhijun Li; Zhiqiang Cao

arxiv: 2605.24014 · v1 · pith:DQSIHKDNnew · submitted 2026-05-20 · 💻 cs.CV

Pith reviewed 2026-06-30 17:50 UTC · model grok-4.3

classification 💻 cs.CV

keywords: semantic segmentation · UAV · onboard computation · test-time adaptation · multi-UAV cooperation · information fusion · distribution shift

The pith

Heterogeneous UAVs collaborate via image fusion and cross-device adaptation to run semantic segmentation onboard with 3.6 times lower latency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

SkySeg addresses two barriers to real-time UAV semantic segmentation: limited onboard compute and distribution shifts from changing flight conditions. The framework fuses low-definition wide-area images from one UAV with high-definition focused images from another for efficient inference, then applies a cross-device test-time adaptation method that uses unlabeled streams from multiple UAVs to correct shifts collaboratively. Experiments report 3.6x faster inference, 5.91% higher onboard accuracy, and 10.91% average gain in wild conditions. A sympathetic reader would care because many UAV remote-sensing tasks require immediate onboard decisions rather than offloading data.

Core claim

SkySeg is a heterogeneous multi-UAV framework that combines an efficient information fusion inference method—merging low-definition wide-area images with high-definition focused-area images—with a cross-device test-time adaptation strategy that jointly corrects distribution shifts across UAVs using only unlabeled test streams.

What carries the argument

The cross-device test-time adaptation strategy paired with the information fusion inference method that combines low- and high-definition images from different UAVs.
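
As a rough illustration of the fusion idea, the sketch below upsamples a coarse wide-area label map and overwrites selected regions with labels from high-definition focused patches. This page does not describe the paper's actual fusion procedure, so the label-level pasting, the function names `upsample_nearest` and `fuse`, and the toy grids are all assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[int]]


def upsample_nearest(labels: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling of a 2D label map by an integer factor."""
    out: Grid = []
    for row in labels:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse(low_def: Grid, factor: int, patches: List[Tuple[int, int, Grid]]) -> Grid:
    """Overwrite regions of the upsampled wide-area map with focused patches.

    Each patch is (top, left, labels) in full-resolution coordinates.
    Hypothetical scheme: the paper's fusion may work on features or logits.
    """
    fused = upsample_nearest(low_def, factor)
    for top, left, patch in patches:
        for dy, patch_row in enumerate(patch):
            for dx, value in enumerate(patch_row):
                fused[top + dy][left + dx] = value
    return fused


# Toy 2x2 wide-area prediction, upsampled 2x, refined by one 2x2 focused patch.
low = [[0, 1],
       [1, 1]]
hd_patch = (0, 2, [[1, 2],
                   [2, 2]])
result = fuse(low, 2, [hd_patch])
for row in result:
    print(row)
```

The design point is that only the pasted regions need high-definition inference, which is where a latency saving of this kind would come from if the refined area is small.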

If this is right

Inference latency on resource-constrained UAV hardware drops by approximately 3.6x.
Onboard segmentation accuracy rises by 5.91% relative to single-UAV baselines.
Average accuracy in uncontrolled outdoor environments improves by 10.91%.
Real-time decisions become feasible during flight without relying on ground-station processing.
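
To put the 3.6x figure in per-frame terms, the sketch below converts a speedup factor into a resulting latency and a percentage reduction. The 100 ms baseline is a made-up illustrative value, not a number reported by the paper.

```python
def latency_after_speedup(baseline_ms: float, speedup: float) -> float:
    """Latency after a speedup factor is applied (speedup must be positive)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_ms / speedup


def percent_reduction(speedup: float) -> float:
    """Percentage of latency removed by a given speedup factor."""
    return (1.0 - 1.0 / speedup) * 100.0


# Hypothetical baseline of 100 ms per frame (assumption, not from the paper).
new_latency = latency_after_speedup(100.0, 3.6)
reduction = percent_reduction(3.6)
print(f"{new_latency:.1f} ms per frame, {reduction:.1f}% lower latency")
```

A 3.6x speedup therefore leaves roughly 28% of the original latency, which is a reduction of about 72%.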

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same fusion-plus-adaptation pattern could apply to other onboard perception tasks such as object detection or depth estimation on UAV fleets.
Larger numbers of UAVs might further reduce per-device compute load while improving adaptation robustness.
The method suggests a route to unsupervised domain adaptation for aerial imagery without collecting new labeled datasets for each environment.

Load-bearing premise

The cross-device test-time adaptation reliably corrects distribution shifts across heterogeneous UAVs using only unlabeled test streams without negative transfer.
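
One way to picture what "collaborative" adaptation could mean is pooling normalization statistics from several devices before nudging the model toward them. The sketch below is not the paper's method: the count-weighted pooling, the momentum blend, and the names `pooled_stats` and `blend` are assumptions chosen to show the general pattern of statistics-based test-time adaptation.

```python
from typing import List, Tuple


def pooled_stats(device_stats: List[Tuple[int, float, float]]) -> Tuple[float, float]:
    """Combine per-device (count, mean, variance) into a pooled mean and variance.

    Uses population variances and the law of total variance.
    """
    total = sum(n for n, _, _ in device_stats)
    if total == 0:
        raise ValueError("no samples")
    mean = sum(n * m for n, m, _ in device_stats) / total
    var = sum(n * (v + (m - mean) ** 2) for n, m, v in device_stats) / total
    return mean, var


def blend(source: float, target: float, momentum: float) -> float:
    """Move a source-domain statistic toward the pooled test-time estimate."""
    return (1.0 - momentum) * source + momentum * target


# Hypothetical per-UAV feature statistics: (sample count, mean, variance).
stats = [(100, 0.2, 0.04), (300, 0.6, 0.08)]
mean, var = pooled_stats(stats)
adapted_mean = blend(source=0.0, target=mean, momentum=0.1)
print(round(mean, 4), round(var, 4), round(adapted_mean, 4))
```

The sketch also shows where negative transfer could enter: a device with many samples from an unrepresentative view dominates the pooled estimate for every other device.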

What would settle it

A controlled flight test in which multiple heterogeneous UAVs record the same changing scene, the adapted model is applied, and accuracy on newly collected labeled frames shows no improvement or a drop compared with the non-adapted baseline.
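
The comparison described above can be sketched as a per-device check on labeled frames: score the adapted and non-adapted models and flag any device where adaptation lowers the score. The flattened label lists, the device names, and the helper `negative_transfer` are hypothetical; mean intersection-over-union is used as the metric on the assumption that it matches the paper's accuracy measure.

```python
from typing import Dict, List, Sequence


def mean_iou(pred: Sequence[int], truth: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union over classes present in pred or truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def negative_transfer(baseline: Dict[str, float], adapted: Dict[str, float]) -> List[str]:
    """Return device names whose adapted score falls below the baseline."""
    return sorted(d for d in baseline if adapted[d] < baseline[d])


# Toy labeled frame (flattened pixels); predictions are illustrative only.
truth = [0, 0, 1, 1]
baseline = {
    "leader": mean_iou([0, 1, 1, 1], truth, 2),
    "follower": mean_iou([0, 0, 1, 1], truth, 2),
}
adapted = {
    "leader": mean_iou([0, 0, 1, 1], truth, 2),
    "follower": mean_iou([0, 1, 1, 1], truth, 2),
}
flagged = negative_transfer(baseline, adapted)
print(flagged)
```

In this toy case the leader improves while the follower degrades, a pattern that an aggregate accuracy figure averaged over devices would hide.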

Figures

Figures reproduced from arXiv: 2605.24014 by Anqi Lu, Jie Liu, Youbing Hu, Yun Cheng, Zhijun Li, Zhiqiang Cao.

**Figure 2.** Impact of different models with different input resolutions on the SDD. As the input resolution increases, all three …

**Figure 3.** Impact of different models on segmentation accuracy

**Figure 4.** The operational flow of SkySeg, which starts with the …

**Figure 5.** SkySeg is deployed in the case of a leader UAV and three follower UAVs working in a dynamic environment.

**Figure 6.** Attention-based image patch selection method.

**Figure 8.** Method for cross-device TTA. The “+” indicates that …

**Figure 9.** Visualization of SkySeg. SkySeg first identifies the image patches (colored boxes) that need to be improved using an …

**Figure 10.** Comparison results of segmentation accuracy under …

**Figure 11.** Comparison results of segmentation accuracy under …

**Figure 12.** Comparison results of segmentation accuracy under …

**Figure 13.** Average power consumption of each module on the TX2 for the SDD dataset.

**Figure 14.** Average power consumption of each module on the TX2 for the FloodNet dataset.

Original abstract

The demand for unmanned aerial vehicle (UAV)-based image acquisition and analysis has surged, with UAVs increasingly utilized for semantic segmentation tasks. To meet the real-time analysis requirements of UAV remote sensing missions, performing onboard computation and making decisions based on the results is a natural approach. However, deploying semantic segmentation on resource-constrained UAV platforms presents two significant challenges: 1) hardware constraints limit the ability of UAVs to perform real-time semantic segmentation, and 2) environmental variations during flight cause data distribution shifts, deviating from the original training data. To address these issues, this paper introduces SkySeg, a heterogeneous multi-UAV air-air cooperation framework that integrates computer vision and flight pattern to enable onboard semantic segmentation using low-cost sensors. SkySeg employs an efficient information fusion inference method, combining low-definition, wide-area images with high-definition, focused-area images. Additionally, it incorporates a cross-device test-time adaptation (TTA) strategy to enhance segmentation performance in dynamic environments by collaboratively addressing distribution shifts of test data streams across UAVs. Experimental results demonstrate that our SkySeg framework accelerates inference latency by approximately 3.6x, improves onboard segmentation accuracy by 5.91%, and achieves a 10.91% average accuracy gain in the wild.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

SkySeg's accuracy claims hinge on untested cross-device TTA, though the multi-UAV fusion idea is a reasonable engineering step.

SkySeg describes a multi-UAV system that fuses low-definition wide-area images from one platform with high-definition focused images from another, then applies cross-device test-time adaptation to handle distribution shifts during flight.

The headline result is the reported 3.6x latency cut plus accuracy lifts of 5.91% onboard and 10.91% in the wild. Those numbers rest on the TTA component working across heterogeneous devices from unlabeled streams alone.

The concrete integration of air-air image fusion with flight-pattern coordination is the clearest new element relative to single-UAV baselines. The paper frames the hardware and shift problems directly and offers a system-level response that stays within low-cost sensor constraints.

The experimental support is thin. The abstract supplies only aggregate figures with no baselines named, no ablation isolating the fusion or the TTA term, no per-device metrics, and no check for negative transfer. Without those controls the gains cannot be attributed to the claimed mechanism.

The methods draw on established TTA and multi-view techniques rather than introducing new derivations. Citation coverage and dataset details would need verification in the full text.

This work is aimed at engineers building real-time UAV remote-sensing pipelines who already deal with mixed hardware fleets. A reader in that group could borrow the high-level architecture even if the numbers require independent checking.

It deserves a serious referee because the deployment constraints are genuine and the proposed integration is coherent, provided the authors can supply the missing ablations and per-device results.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces SkySeg, a heterogeneous multi-UAV air-air cooperation framework for onboard semantic segmentation. It combines low-definition wide-area and high-definition focused-area image fusion with a cross-device test-time adaptation (TTA) strategy to address hardware constraints on UAVs and distribution shifts in dynamic environments. Experimental results are reported to show approximately 3.6x inference latency reduction, 5.91% onboard accuracy improvement, and 10.91% average accuracy gain in the wild.

Significance. If the empirical claims hold after proper validation, the work could meaningfully advance practical deployment of real-time semantic segmentation on resource-limited UAV platforms by demonstrating collaborative adaptation across heterogeneous devices without additional labeled data.

major comments (2)

[Experimental Results] Experimental section: the headline claims of 3.6x latency acceleration, +5.91% onboard accuracy, and +10.91% in-the-wild gain are presented as aggregate numbers with no reported baselines, datasets, ablation studies isolating the TTA collaboration term, per-device metrics, or error bars, preventing attribution of gains to the proposed cross-device TTA mechanism.
[Method (TTA component)] Cross-device TTA subsection: the strategy is described as correcting distribution shifts across UAVs from unlabeled streams alone, yet no quantification of negative-transfer cases, per-UAV performance tables, or controls for reliability under heterogeneous conditions is supplied, leaving the central assumption unverified.
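
The reporting these comments ask for can be sketched as a small ablation summary: a mean and sample standard deviation per configuration over repeated runs. The configuration names and every score below are placeholders invented for illustration; none of them are results from the paper.

```python
import statistics
from typing import Dict, List, Tuple


def summarize_runs(runs: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Mean and sample standard deviation per configuration over repeated runs."""
    return {
        name: (statistics.mean(scores), statistics.stdev(scores))
        for name, scores in runs.items()
    }


# Placeholder scores (hypothetical, not results from the paper).
runs = {
    "baseline": [60.0, 62.0, 61.0],
    "fusion_only": [64.0, 66.0, 65.0],
    "fusion_plus_tta": [70.0, 72.0, 71.0],
}
summary = summarize_runs(runs)
for name, (mean, std) in summary.items():
    print(f"{name}: {mean:.2f} +/- {std:.2f}")
```

Reporting the fusion-only row separately is what would let a reader attribute any remaining gain to the adaptation term rather than to fusion.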

minor comments (1)

[Abstract and §3] The abstract and method description refer to 'low-cost sensors' and 'flight pattern' integration without specifying sensor models, resolution values, or how flight patterns are encoded into the fusion process.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below with clarifications from the existing work and commitments to strengthen the presentation where needed.

Referee: [Experimental Results] Experimental section: the headline claims of 3.6x latency acceleration, +5.91% onboard accuracy, and +10.91% in-the-wild gain are presented as aggregate numbers with no reported baselines, datasets, ablation studies isolating the TTA collaboration term, per-device metrics, or error bars, preventing attribution of gains to the proposed cross-device TTA mechanism.

Authors: The manuscript reports results on a combination of custom heterogeneous UAV flight data and standard semantic segmentation benchmarks, with comparisons to single-UAV and non-adaptive baselines detailed in Section 4. However, we agree that the presentation would be strengthened by more explicit isolation of the TTA term. In revision we will add an ablation table, per-device breakdowns, and error bars computed over repeated runs to make attribution clearer.

revision: yes

Referee: [Method (TTA component)] Cross-device TTA subsection: the strategy is described as correcting distribution shifts across UAVs from unlabeled streams alone, yet no quantification of negative-transfer cases, per-UAV performance tables, or controls for reliability under heterogeneous conditions is supplied, leaving the central assumption unverified.

Authors: The cross-device TTA is evaluated under the heterogeneous UAV setup described in Section 3, with overall accuracy gains reported across devices. We acknowledge that explicit quantification of negative-transfer instances and expanded per-UAV tables would better verify robustness. The revision will incorporate these analyses along with additional controls for varying hardware and environmental conditions.

revision: yes

Circularity Check

0 steps flagged

No circularity: experimental claims rest on measured outcomes, not self-referential definitions or fits

The paper describes a multi-UAV framework and reports aggregate experimental metrics (3.6x latency, +5.91% accuracy, +10.91% in-the-wild gain) as measured results from deployment. No equations, parameter-fitting procedures, or derivation steps are present that could reduce a claimed prediction to its own inputs by construction. Self-citations, if any, are not load-bearing for the central claims, which are externally falsifiable via replication on the described hardware and datasets. This is the normal case of an applied systems paper whose validity hinges on experiment design rather than algebraic self-reference.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no mathematical model, parameters, or postulated entities; the ledger is therefore empty.

[1] [1]

mmuavsense: mmwave radar-based uav detection via fine-grained rotary sensing,

W. Xu, C. Wang, Q. Jin, Y . Bu, L. Xie, and S. Lu, “mmuavsense: mmwave radar-based uav detection via fine-grained rotary sensing,” in 2025 IEEE 45th International Conference on Distributed Computing Systems (ICDCS). IEEE, 2025, pp. 593–603

2025

[2] [2]

Lodgenet: Improved rice lodging recognition using semantic segmentation of uav high-resolution remote sensing images,

Z. Su, Y . Wang, Q. Xu, R. Gao, and Q. Kong, “Lodgenet: Improved rice lodging recognition using semantic segmentation of uav high-resolution remote sensing images,”Computers and Electronics in Agriculture, vol. 196, p. 106873, 2022

2022

[3] [3]

Uav-based low altitude remote sensing for concrete bridge multi-category damage automatic detection system,

H. Liang, S.-C. Lee, and S. Seo, “Uav-based low altitude remote sensing for concrete bridge multi-category damage automatic detection system,” Drones, vol. 7, no. 6, p. 386, 2023

2023

[4] [4]

Real- time and intelligent flood forecasting using uav-assisted wireless sensor network,

S. Goudarzi, S. Ahmad Soleymani, M. H. Anisi, D. Ciuonzo, N. Kama, S. Abdullah, M. Abdollahi Azgomi, Z. Chaczko, and A. Azmi, “Real- time and intelligent flood forecasting using uav-assisted wireless sensor network,”Computers, Materials and Continua, vol. 70, no. 1, pp. 715– 738, 2021

2021

[5] [5]

Algorithms for semantic seg- mentation of multispectral remote sensing imagery using deep learning,

R. Kemker, C. Salvaggio, and C. Kanan, “Algorithms for semantic seg- mentation of multispectral remote sensing imagery using deep learning,” ISPRS journal of photogrammetry and remote sensing, vol. 145, pp. 60– 77, 2018

2018

[6] [6]

Uav in the advent of the twenties: Where we stand and what is next,

F. Nex, C. Armenakis, M. Cramer, D. A. Cucci, M. Gerke, E. Honkavaara, A. Kukko, C. Persello, and J. Skaloud, “Uav in the advent of the twenties: Where we stand and what is next,”ISPRS journal of photogrammetry and remote sensing, vol. 184, pp. 215–242, 2022

2022

[7] [7]

Energy- efficient trajectory design for uav-enabled wireless communications with latency constraints,

H. Tran-Dinh, T. X. Vu, S. Chatzinotas, and B. Ottersten, “Energy- efficient trajectory design for uav-enabled wireless communications with latency constraints,” in2019 53rd Asilomar Conference on Signals, Systems, and Computers. IEEE, 2019, pp. 347–352

2019

[8] [8]

Light-weight semantic segmentation network for uav remote sensing images,

S. Liu, J. Cheng, L. Liang, H. Bai, and W. Dang, “Light-weight semantic segmentation network for uav remote sensing images,”IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 14, pp. 8287–8296, 2021

2021

[9] [9]

Improving robustness against common corruptions by covariate shift adaptation,

S. Schneider, E. Rusak, L. Eck, O. Bringmann, W. Brendel, and M. Bethge, “Improving robustness against common corruptions by covariate shift adaptation,”Advances in neural information processing systems, vol. 33, pp. 11 539–11 551, 2020

2020

[10] [10]

Perception and sensing for autonomous vehicles under adverse weather conditions: A survey,

Y . Zhang, A. Carballo, H. Yang, and K. Takeda, “Perception and sensing for autonomous vehicles under adverse weather conditions: A survey,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 196, pp. 146–177, 2023

2023

[11] [11]

Tent: Fully Test-time Adaptation by Entropy Minimization

D. Wang, E. Shelhamer, S. Liu, B. Olshausen, and T. Darrell, “Tent: Fully test-time adaptation by entropy minimization,”arXiv preprint arXiv:2006.10726, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2006

[12] [12]

Review on unmanned aerial vehicles, remote sensors, imagery processing, and their applications in agriculture,

D. Olson and J. Anderson, “Review on unmanned aerial vehicles, remote sensors, imagery processing, and their applications in agriculture,” Agronomy Journal, vol. 113, no. 2, pp. 971–992, 2021

2021

[13] [13]

Lightweight semantic segmentation network for real-time weed map- ping using unmanned aerial vehicles,

J. Deng, Z. Zhong, H. Huang, Y . Lan, Y . Han, and Y . Zhang, “Lightweight semantic segmentation network for real-time weed map- ping using unmanned aerial vehicles,”Applied Sciences, vol. 10, no. 20, p. 7132, 2020

2020

[14] [14]

Methods and datasets on semantic segmentation for unmanned aerial vehicle remote sensing images: A review,

J. Cheng, C. Deng, Y . Su, Z. An, and Q. Wang, “Methods and datasets on semantic segmentation for unmanned aerial vehicle remote sensing images: A review,”ISPRS Journal of Photogrammetry and Remote Sensing, vol. 211, pp. 1–34, 2024

2024

[15]

Mavnet: An effective semantic segmentation micro-network for mav-based tasks,

T. Nguyen, S. S. Shivakumar, I. D. Miller, J. Keller, E. S. Lee, A. Zhou, T. Özaslan, G. Loianno et al., “Mavnet: An effective semantic segmentation micro-network for mav-based tasks,” IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 3908–3915, 2019

2019

[16]

Rtsdm: A real-time semantic dense mapping system for uavs,

Z. Li, J. Zhao, X. Zhou, S. Wei, P. Li, and F. Shuang, “Rtsdm: A real-time semantic dense mapping system for uavs,” Machines, vol. 10, no. 4, p. 285, 2022

2022

[17]

Semantic segmentation of lightweight unmanned aerial vehicles in sea scenes,

H. Shen, G. Wu, and G. Wei, “Semantic segmentation of lightweight unmanned aerial vehicles in sea scenes,” in 2023 International Conference on Cyber-Physical Social Intelligence (ICCSI). IEEE, 2023, pp. 527–531

2023

[18]

Encoder-decoder with atrous separable convolution for semantic image segmentation,

L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam, “Encoder-decoder with atrous separable convolution for semantic image segmentation,” in Proceedings of the European conference on computer vision (ECCV), 2018, pp. 801–818

2018

[19]

A lightweight cnn-transformer network with laplacian loss for low-altitude uav imagery semantic segmentation,

W. Lu, Z. Zhang, and M. Nguyen, “A lightweight cnn-transformer network with laplacian loss for low-altitude uav imagery semantic segmentation,” IEEE Transactions on Geoscience and Remote Sensing, 2024

2024

[20]

Skystitch: A cooperative multi-uav-based real-time video surveillance system with stitching,

X. Meng, W. Wang, and B. Leong, “Skystitch: A cooperative multi-uav-based real-time video surveillance system with stitching,” in Proceedings of the 23rd ACM international conference on Multimedia, 2015, pp. 261–270

2015

[21]

Design and implementation of multi-uav cooperation search experimental platform,

S. Wang, C. E. Njau, and Z. Jiang, “Design and implementation of multi-uav cooperation search experimental platform,” in 2021 5th International Conference on Robotics and Automation Sciences (ICRAS). IEEE, 2021, pp. 94–98

2021

[22]

Multi-uav cooperative system for search and rescue based on yolov5,

L. Xing, X. Fan, Y. Dong, Z. Xiong, L. Xing, Y. Yang, H. Bai, and C. Zhou, “Multi-uav cooperative system for search and rescue based on yolov5,” International Journal of Disaster Risk Reduction, vol. 76, p. 102972, 2022

2022

[23]

Skynet: Multi-drone cooperation for real-time person identification and localization,

J. Peng, Q. Li, Y. Tan, D. Zhao, Z. Yuan, J. Chen, H. Wang, and Y. Jiang, “Skynet: Multi-drone cooperation for real-time person identification and localization,” in IEEE INFOCOM 2023-IEEE Conference on Computer Communications. IEEE, 2023, pp. 1–10

2023

[24]

Air-cad: Edge-assisted multi-drone network for real-time crowd anomaly detection,

Y. Tan, Q. Li, J. Peng, Z. Yuan, and Y. Jiang, “Air-cad: Edge-assisted multi-drone network for real-time crowd anomaly detection,” in Proceedings of the ACM on Web Conference 2024, 2024, pp. 2817–2825

2024

[25]

A review of semantic segmentation using deep neural networks,

Y. Guo, Y. Liu, T. Georgiou, and M. S. Lew, “A review of semantic segmentation using deep neural networks,” International journal of multimedia information retrieval, vol. 7, pp. 87–93, 2018

2018

[26]

Segformer: Simple and efficient design for semantic segmentation with transformers,

E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, and P. Luo, “Segformer: Simple and efficient design for semantic segmentation with transformers,” Advances in neural information processing systems, vol. 34, pp. 12 077–12 090, 2021

2021

[27]

G. U. of Technology. (2020) Semantic drone dataset. [Online]. Available: http://www.dronedataset.icg.tugraz.at

2020

[28]

A review on unmanned aerial vehicle remote sensing: Platforms, sensors, data processing methods, and applications,

Z. Zhang and L. Zhu, “A review on unmanned aerial vehicle remote sensing: Platforms, sensors, data processing methods, and applications,” Drones, vol. 7, no. 6, p. 398, 2023

2023

[29]

Memory-constrained semantic segmentation for ultra-high resolution uav imagery,

Q. Li, J. Cai, J. Luo, Y. Yu, J. Gu, J. Pan, and W. Liu, “Memory-constrained semantic segmentation for ultra-high resolution uav imagery,” IEEE Robotics and Automation Letters, vol. 9, no. 2, pp. 1708–1715, 2024

2024

[30]

Outdoor navigation using two quadrotors and adaptive sliding mode control,

D. K. Villa, A. S. Brandão, and M. Sarcinelli-Filho, “Outdoor navigation using two quadrotors and adaptive sliding mode control,” in 2020 International Conference on Unmanned Aircraft Systems (ICUAS). IEEE, 2020, pp. 716–721

2020

[31]

Diffrate: Differentiable compression rate for efficient vision transformers,

M. Chen, W. Shao, P. Xu, M. Lin, K. Zhang, F. Chao, R. Ji, Y. Qiao, and P. Luo, “Diffrate: Differentiable compression rate for efficient vision transformers,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 17 164–17 174

2023

[32]

Sparse refinement for efficient high-resolution semantic segmentation,

Z. Liu, Z. Zhang, S. Khaki, S. Yang, H. Tang, C. Xu, K. Keutzer, and S. Han, “Sparse refinement for efficient high-resolution semantic segmentation,” in European Conference on Computer Vision. Springer, 2025, pp. 108–127

2025

[33]

Benchmarking the robustness of semantic segmentation models,

C. Kamann and C. Rother, “Benchmarking the robustness of semantic segmentation models,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 8828–8838

2020

[34]

Online normalization for training neural networks,

V. Chiley, I. Sharapov, A. Kosson, U. Koster, R. Reece, S. Samaniego de la Fuente, V. Subbiah, and M. James, “Online normalization for training neural networks,” Advances in Neural Information Processing Systems, vol. 32, 2019

2019

[35]

Robust test-time adaptation in dynamic scenarios,

L. Yuan, B. Xie, and S. Li, “Robust test-time adaptation in dynamic scenarios,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 15 922–15 932

2023

[36]

Mecta: Memory-economic continual test-time model adaptation,

J. Hong, L. Lyu, J. Zhou, and M. Spranger, “Mecta: Memory-economic continual test-time model adaptation,” in 2023 International Conference on Learning Representations, 2023

2023

[37]

A comprehensive survey on test-time adaptation under distribution shifts,

J. Liang, R. He, and T. Tan, “A comprehensive survey on test-time adaptation under distribution shifts,” International Journal of Computer Vision, pp. 1–34, 2024

2024

[38]

Towards stable test-time adaptation in dynamic wild world,

S. Niu, J. Wu, Y. Zhang, Z. Wen, Y. Chen, P. Zhao, and M. Tan, “Towards stable test-time adaptation in dynamic wild world,” arXiv preprint arXiv:2302.12400, 2023

[39]

Distribution-aware continual test-time adaptation for semantic segmentation,

J. Ni, S. Yang, R. Xu, J. Liu, X. Li, W. Jiao, Z. Chen, Y. Liu, and S. Zhang, “Distribution-aware continual test-time adaptation for semantic segmentation,” in 2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2024, pp. 3044–3050

2024

[40]

Real-time identification of rice weeds by uav low-altitude remote sensing based on improved semantic segmentation model,

Y. Lan, K. Huang, C. Yang, L. Lei, J. Ye, J. Zhang, W. Zeng, Y. Zhang, and J. Deng, “Real-time identification of rice weeds by uav low-altitude remote sensing based on improved semantic segmentation model,” Remote Sensing, vol. 13, no. 21, p. 4370, 2021

2021

[41]

Research on detection and tracking technology of quad-rotor aircraft based on open source flight control,

M. Cao, W. Chen, and Y. Li, “Research on detection and tracking technology of quad-rotor aircraft based on open source flight control,” in 2020 39th Chinese Control Conference (CCC). IEEE, 2020, pp. 6509–6514

2020

[42]

Automatic differentiation in pytorch,

A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in pytorch,” 2017

2017

[43]

Floodnet: A high resolution aerial imagery dataset for post flood scene understanding,

M. Rahnemoonfar, T. Chowdhury, A. Sarkar, D. Varshney, M. Yari, and R. R. Murphy, “Floodnet: A high resolution aerial imagery dataset for post flood scene understanding,” IEEE Access, vol. 9, pp. 89 644–89 654, 2021

2021