Sensing-Assisted LoS/NLoS Identification in Dynamic UAV Positioning Systems
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-14 18:13 UTC · model grok-4.3
The pith
A dual-input fusion network identifies LoS/NLoS conditions for UAVs at up to 97.69 percent accuracy by combining RGB images with channel impulse responses.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors construct a multi-modal sensing-communication dataset for urban UAV-to-ground links and introduce a dual-input feature fusion network that jointly processes RGB images and channel impulse response data to classify line-of-sight versus non-line-of-sight conditions. This yields identification accuracy up to 97.69 percent, at least 3.59 percent above CIR-only or RGB-only baselines; strong few-shot generalization with fewer than 200 target samples; and roughly 0.5 percent accuracy loss under Gaussian noise of variance 0.35 on the images. When the classifier output guides trilateration, positioning error falls by approximately 70 percent in a crossroad scenario.
What carries the argument
The dual-input feature fusion network that bridges heterogeneous RGB image and CIR representations to extract and combine sensing and communication features for LoS/NLoS classification.
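The review excerpt quoted later mentions a ViT branch for RGB images and a CNN branch for CIR matrices. What follows is a minimal sketch of that dual-input idea, with small stand-in CNNs for both branches and concatenation fusion; the layer sizes, feature dimensions, and fusion mechanism are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch of a dual-input LoS/NLoS fusion classifier (illustrative,
# not the authors' architecture): one branch embeds the RGB image, another
# embeds the CIR matrix, and the two feature vectors are concatenated
# before a shared classification head.
import torch
import torch.nn as nn

class DualInputFusionNet(nn.Module):
    def __init__(self, cir_channels: int = 1, feat_dim: int = 128):
        super().__init__()
        # RGB branch: any image backbone works; a tiny CNN keeps the sketch
        # self-contained (the paper reportedly uses a ViT here).
        self.rgb_branch = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # CIR branch: treats the delay-time CIR matrix as a 1-channel image.
        self.cir_branch = nn.Sequential(
            nn.Conv2d(cir_channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # Fusion by concatenation, then a binary LoS/NLoS head.
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, 2),
        )

    def forward(self, rgb: torch.Tensor, cir: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.rgb_branch(rgb), self.cir_branch(cir)], dim=-1)
        return self.head(fused)  # logits for {LoS, NLoS}

# Shapes are illustrative: a 224x224 RGB frame and a 64x64 CIR matrix.
logits = DualInputFusionNet()(torch.randn(4, 3, 224, 224), torch.randn(4, 1, 64, 64))
```

Concatenation is the simplest possible fusion choice; an attention-based fusion module would slot in at the same point without changing either branch.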
If this is right
- Identification accuracy reaches 97.69 percent and exceeds single-modality baselines by at least 3.59 percent.
- Performance stabilizes near full-sample levels with fewer than 200 target samples and beats baselines with fewer than 100 samples across scenarios and altitudes.
- Accuracy degrades by only about 0.5 percent when Gaussian noise of variance 0.35 is added to the RGB images.
- Trilateration positioning error is reduced by approximately 70 percent when the LoS/NLoS labels are used in a crossroad scenario (a sketch of why the labels help follows this list).
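NLoS ranges are biased long because the signal travels a detour around the blocker, so even the simple policy of dropping flagged links before a least-squares trilateration solve cuts the error. A minimal sketch of that mechanism; the anchors, bias, and noise levels are illustrative rather than the paper's, and the paper's pipeline may weight rather than drop NLoS links.

```python
# Minimal trilateration sketch showing why LoS/NLoS labels help: NLoS range
# measurements are biased long, so dropping flagged links before the
# least-squares solve reduces positioning error. Anchors, biases, and noise
# levels are illustrative, not taken from the paper.
import numpy as np

def trilaterate(anchors: np.ndarray, ranges: np.ndarray) -> np.ndarray:
    """Linearized least-squares position fix from >= 3 anchors (2-D)."""
    a0, r0 = anchors[0], ranges[0]
    A = 2.0 * (anchors[1:] - a0)
    b = (r0**2 - ranges[1:]**2
         + np.sum(anchors[1:]**2, axis=1) - np.sum(a0**2))
    return np.linalg.lstsq(A, b, rcond=None)[0]

rng = np.random.default_rng(0)
true_pos = np.array([12.0, 7.0])
anchors = np.array([[0.0, 0.0], [30.0, 0.0], [0.0, 30.0], [30.0, 30.0]])
is_nlos = np.array([False, True, False, False])  # one blocked link

ranges = np.linalg.norm(anchors - true_pos, axis=1) + rng.normal(0, 0.1, 4)
ranges[is_nlos] += 8.0  # NLoS detour bias: measured path is too long

err_all = np.linalg.norm(trilaterate(anchors, ranges) - true_pos)
err_los = np.linalg.norm(trilaterate(anchors[~is_nlos], ranges[~is_nlos]) - true_pos)
print(f"all links: {err_all:.2f} m, LoS-only: {err_los:.2f} m")
```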
Where Pith is reading between the lines
- The same fusion approach could be adapted to predict link quality or suggest rerouting paths around blocked zones during flight.
- Integration into onboard UAV processors might allow real-time adjustment of transmission power or frequency when NLoS is detected.
- Extending the dataset to include additional sensor streams such as LiDAR could raise identification reliability in denser city environments.
Load-bearing premise
The constructed multi-modal dataset faithfully represents real-world urban UAV-to-ground propagation across the tested altitudes and the fusion network merges the two data types without creating artifacts that overstate accuracy.
What would settle it
Record new real-world UAV flights in an urban location outside the original dataset, apply the trained network, and observe whether identification accuracy falls below 90 percent or the positioning error reduction drops below 50 percent.
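Stated as a pass/fail check, this criterion reduces to two thresholds. A sketch; the metric values fed in would come from the proposed out-of-dataset flights and are hypothetical here.

```python
# The proposed test as a pass/fail check. Inputs would come from new flights
# outside the original dataset; the values below are hypothetical.
def claim_survives(accuracy: float, err_without: float, err_with: float) -> bool:
    """True if out-of-dataset results stay within the review's thresholds."""
    error_reduction = 1.0 - err_with / err_without
    return accuracy >= 0.90 and error_reduction >= 0.50

print(claim_survives(accuracy=0.93, err_without=4.0, err_with=1.5))  # True
```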
read the original abstract
In this paper, a sensing-assisted non-line-of-sight (NLoS) identification method for dynamic uncrewed aerial vehicle (UAV) positioning is proposed for the first time. For urban UAV-to-ground scenarios, a new multi-modal sensing-communication integrated dataset is constructed to support line-of-sight (LoS)/NLoS identification, covering two typical urban scenarios and a wide range of flight altitudes. Based on the constructed dataset, a novel dual-input feature fusion network is proposed, which addresses the challenge of heterogeneous representations between RGB images and channel impulse response (CIR) data to enable the joint extraction and fusion of sensing and communication features for LoS/NLoS identification. Simulation results show that the identification accuracy can reach up to 97.69%, while achieving an improvement of at least 3.59% compared to traditional CIR-only and RGB-only methods. Moreover, strong few-shot generalization is observed, as the proposed method stabilizes and approaches full-sample performance with fewer than 200 target samples and exceeds traditional CIR-only and RGB-only methods with fewer than 100 target samples in all cross-scenario and cross-altitude experiments. Even under Gaussian noise with a variance of 0.35 applied to RGB images, the accuracy degradation remains approximately 0.5%. By utilizing the proposed LoS/NLoS identification method, the error of trilateration positioning can be reduced by approximately 70% in a crossroad scenario, verifying the utility of the proposed method.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a sensing-assisted LoS/NLoS identification method for dynamic UAV positioning in urban environments. It introduces a newly constructed multi-modal dataset combining RGB images and channel impulse response (CIR) data across two scenarios and a range of altitudes, along with a dual-input feature fusion neural network to jointly extract and combine sensing and communication features. Simulation results claim identification accuracy up to 97.69% (at least 3.59% above CIR-only and RGB-only baselines), strong few-shot generalization (stabilizing near full performance with <200 target samples), robustness to Gaussian noise on images, and an approximately 70% reduction in trilateration positioning error in a crossroad scenario.
Significance. If the reported performance holds under real-world conditions, the work could meaningfully advance integrated sensing-communication systems for UAV navigation by showing that multi-modal fusion can improve both identification accuracy and downstream positioning in dynamic urban settings. The few-shot generalization results are particularly relevant for practical UAV deployments where labeled data may be limited.
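The few-shot protocol the report highlights usually amounts to freezing a source-trained network and adapting a small part of it on the target scenario. A sketch under that assumption, reusing the head attribute of the DualInputFusionNet sketch above; the paper's actual transfer recipe is not described in this review.

```python
# Sketch of a few-shot cross-scenario adaptation loop: freeze the source-
# trained branches, retrain only the fusion head on < 200 target samples.
# This is one common instantiation, not necessarily the paper's.
import torch

def adapt_few_shot(model, target_loader, epochs: int = 20, lr: float = 1e-3):
    for p in model.parameters():
        p.requires_grad = False
    for p in model.head.parameters():          # fusion head only
        p.requires_grad = True
    opt = torch.optim.Adam(model.head.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for rgb, cir, label in target_loader:  # fewer than ~200 samples total
            opt.zero_grad()
            loss = loss_fn(model(rgb, cir), label)
            loss.backward()
            opt.step()
    return model
```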
major comments (2)
- [Dataset Construction] Dataset Construction section: The multi-modal dataset is generated entirely via simulation (ray-tracing for CIR, rendering for RGB) with no cross-validation or comparison against real UAV flight measurements or field data. This is load-bearing for the central claims because the 97.69% accuracy, few-shot results, noise robustness, and 70% positioning-error reduction all depend on the simulated urban propagation conditions accurately capturing real effects such as dynamic foliage, precise antenna patterns, and camera noise at altitude (a sketch of a noise probe follows this list).
- [Simulation Results] Simulation Results section (and associated tables/figures): The headline metrics (97.69% accuracy, 3.59% improvement, few-shot curves, and 70% error reduction) are presented without reported train/validation/test split ratios, number of Monte Carlo trials, error bars, or statistical significance tests (e.g., p-values) for the cross-scenario and cross-altitude comparisons. This makes it impossible to assess whether the gains are robust or could be artifacts of a particular data partition.
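One concrete way to exercise the camera-noise part of the first comment is the corruption test the abstract reports: zero-mean Gaussian noise of variance 0.35 on the RGB inputs. A minimal sketch of that style of probe, assuming images normalized to [0, 1] and clipping after corruption; the paper may apply the perturbation differently, and this probes only additive sensor noise, not foliage or antenna-pattern mismatch.

```python
# Sketch of the image-corruption robustness probe the abstract reports:
# add zero-mean Gaussian noise (here variance 0.35) to normalized RGB
# inputs and re-measure accuracy. Clipping to [0, 1] is an assumption;
# the paper may apply the noise differently.
import torch

def add_gaussian_noise(rgb: torch.Tensor, variance: float = 0.35) -> torch.Tensor:
    noisy = rgb + torch.randn_like(rgb) * variance**0.5
    return noisy.clamp(0.0, 1.0)

@torch.no_grad()
def accuracy(model, loader, corrupt: bool = False) -> float:
    correct = total = 0
    for rgb, cir, label in loader:
        if corrupt:
            rgb = add_gaussian_noise(rgb)
        pred = model(rgb, cir).argmax(dim=-1)
        correct += (pred == label).sum().item()
        total += label.numel()
    return correct / total
```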
minor comments (2)
- [Abstract] The abstract and results would benefit from a brief statement of the dual-input network architecture (e.g., backbone choices, fusion mechanism) to allow readers to assess complexity without immediately consulting the methods.
- [Figures] Figure captions for the positioning-error plots should explicitly state the baseline methods used for the 70% reduction claim and the number of Monte Carlo realizations.
Simulated Author's Rebuttal
We are grateful to the referee for the thorough review and valuable suggestions. We provide detailed responses to the major comments and commit to revisions that address the identified issues while maintaining the integrity of our simulation-based study.
read point-by-point responses
- Referee: [Dataset Construction] Dataset Construction section: The multi-modal dataset is generated entirely via simulation (ray-tracing for CIR, rendering for RGB) with no cross-validation or comparison against real UAV flight measurements or field data. This is load-bearing for the central claims because the 97.69% accuracy, few-shot results, noise robustness, and 70% positioning-error reduction all depend on the simulated urban propagation conditions accurately capturing real effects such as dynamic foliage, precise antenna patterns, and camera noise at altitude.
Authors: We recognize that our dataset relies on simulation, which is a limitation for validating against real-world conditions. In the revised version, we will add a comprehensive discussion in the Dataset Construction section detailing the ray-tracing and rendering parameters used, their alignment with established models (e.g., 3GPP), and the assumptions made regarding urban environments. We will also include a dedicated limitations subsection addressing potential gaps such as dynamic foliage and camera noise. However, acquiring real UAV measurements is beyond the scope of this work due to logistical and regulatory challenges. Thus, this will be a partial revision. (revision: partial)
- Referee: [Simulation Results] Simulation Results section (and associated tables/figures): The headline metrics (97.69% accuracy, 3.59% improvement, few-shot curves, and 70% error reduction) are presented without reported train/validation/test split ratios, number of Monte Carlo trials, error bars, or statistical significance tests (e.g., p-values) for the cross-scenario and cross-altitude comparisons. This makes it impossible to assess whether the gains are robust or could be artifacts of a particular data partition.
Authors: We agree with this observation and will revise the Simulation Results section to include the following details: the data split ratios (70% training, 15% validation, 15% testing), the number of Monte Carlo trials (e.g., 100 runs for averaging), error bars representing standard deviation in relevant figures, and results of statistical tests such as t-tests to confirm the significance of the performance improvements (a minimal sketch of this reporting appears after the list below). These changes will enhance the reproducibility and credibility of our results. (revision: yes)
- Not addressed in this revision: real-world validation of the simulated dataset against actual UAV flight data, which we cannot provide as the study is simulation-based.
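The statistical reporting the rebuttal commits to is straightforward to mechanize. A sketch, assuming per-run accuracies are collected for the fusion method and a baseline over repeated trials on shared splits; the run count and accuracy arrays below are placeholders, not reported results.

```python
# Sketch of the statistical reporting the rebuttal commits to: mean +/- std
# over repeated trials plus a paired t-test against a baseline. The arrays
# below are placeholders for per-run accuracies.
import numpy as np
from scipy import stats

def report(fusion_acc: np.ndarray, baseline_acc: np.ndarray) -> None:
    """fusion_acc, baseline_acc: accuracy per Monte Carlo run (same splits)."""
    t, p = stats.ttest_rel(fusion_acc, baseline_acc)  # paired across runs
    print(f"fusion:   {fusion_acc.mean():.4f} +/- {fusion_acc.std(ddof=1):.4f}")
    print(f"baseline: {baseline_acc.mean():.4f} +/- {baseline_acc.std(ddof=1):.4f}")
    print(f"paired t-test: t = {t:.2f}, p = {p:.2e}")

rng = np.random.default_rng(1)
report(rng.normal(0.976, 0.004, 100), rng.normal(0.941, 0.006, 100))
```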
Circularity Check
No circularity: empirical network trained on newly constructed multi-modal dataset
full rationale
The paper constructs a new multi-modal dataset covering urban UAV-to-ground scenarios and proposes a dual-input feature fusion network to perform LoS/NLoS identification from RGB images and CIR data. All reported results (97.69% accuracy, few-shot generalization, noise robustness, and 70% positioning-error reduction) are obtained by training and evaluating this network on the described dataset, with direct comparisons to CIR-only and RGB-only baselines. No equations, uniqueness theorems, ansatzes, or self-citations are used to derive predictions; the central claims rest on empirical performance rather than any reduction of outputs to inputs by construction. The derivation chain is therefore self-contained and non-circular.
Axiom & Free-Parameter Ledger
free parameters (1)
- neural network architecture hyperparameters
axioms (1)
- domain assumption: The two urban scenarios and range of flight altitudes in the dataset are representative of typical UAV-to-ground conditions.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (unclear): Relation between the paper passage and the cited Recognition theorem is unclear. Paper passage: "a novel dual-input feature fusion network... ViT branch for RGB images and CNN branch for CIR matrices"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction (unclear): Relation between the paper passage and the cited Recognition theorem is unclear. Paper passage: "Simulation results show that the identification accuracy can reach up to 97.69%"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] C. Huang et al., "Low-altitude intelligent transportation: System architecture, infrastructure, and key technologies," Journal of Industrial Information Integration, vol. 42, Nov. 2024, Art. no. 100694.
- [2] W. Zhang et al., "An efficient UAV localization technique based on particle swarm optimization," IEEE Trans. Veh. Technol., vol. 71, no. 9, pp. 9544–9557, Sep. 2022.
- [3] M. Dai, E. Zheng, Z. Feng, L. Qi, J. Zhuang, and W. Yang, "Vision-based UAV self-positioning in low-altitude urban environments," IEEE Trans. Image Process., vol. 33, pp. 493–508, Dec. 2023.
- [4] M. Atif, R. Ahmad, W. Ahmad, L. Zhao, and J. J. P. C. Rodrigues, "UAV-assisted wireless localization for search and rescue," IEEE Syst. J., vol. 15, no. 3, pp. 3261–3272, Sep. 2021.
- [5] M. M. Islam, M. T. R. Khan, M. M. Saad, M. A. Tariq, and D. Kim, "Dynamic positioning of UAVs to improve network coverage in VANETs," Veh. Commun., vol. 36, Aug. 2022, Art. no. 100498.
- [6] A. Bensky, Wireless Positioning Technologies and Applications. Norwood, MA, USA: Artech House, 2008.
- [7] Y.-E. Chen, H.-H. Liew, J.-C. Chao, and R.-B. Wu, "Decimeter-accuracy positioning for drones using two-stage trilateration in a GPS-denied environment," IEEE Internet Things J., vol. 10, no. 9, pp. 8319–8326, May 2023.
- [8] W. Zhai et al., "Enhancing GNSS positioning in urban environments: A transformer-based NLOS detection and adaptive weighting approach," IEEE Internet Things J., vol. 12, no. 20, pp. 43521–43539, Aug. 2025.
- [9] T. Zhou, Y. Wang et al., "Deep learning and hybrid fusion-based LOS/NLOS identification in substation scenarios for power internet of things," IEEE Internet Things J., vol. 11, no. 20, pp. 33903–33914, Jul. 2024.
- [10] R. He et al., "Vehicle-to-vehicle radio channel characterization in crossroad scenarios," IEEE Trans. Veh. Technol., vol. 65, no. 8, pp. 5850–5861, Aug. 2016.
- [11] K. Han, H. S. Xing, Z. L. Deng, and Y. C. Du, "A RSSI/PDR-based probabilistic position selection algorithm with NLOS identification for indoor localization," ISPRS Int. J. Geo-Inf., vol. 7, no. 6, Jun. 2018, Art. no. 232.
- [12] E. Almazrouei, N. A. Sindi, and S. R. Al-Araji, "Measurement and analysis of NLOS identification metrics for WLAN systems," in Proc. Pers., Indoor Mobile Radio Commun., Washington, DC, USA, Jun. 2015, pp. 280–284.
- [13] S. Marano, W. M. Gifford, H. Wymeersch et al., "NLOS identification and mitigation for localization based on UWB experimental data," IEEE J. Sel. Areas Commun., vol. 28, no. 7, pp. 1026–1035, Sep. 2010.
- [14] J. B. Kristensen, M. M. Ginard, O. K. Jensen et al., "Non-line-of-sight identification for UWB indoor positioning systems using support vector machines," in Proc. 2019 IEEE MTT-S International Wireless Symposium (IWS), Guangzhou, China, May 2019, pp. 1–3.
- [15] C. Huang, A. F. Molisch, R. He et al., "Machine learning-enabled LOS/NLOS identification for MIMO systems in dynamic environments," IEEE Trans. Wireless Commun., vol. 19, no. 6, pp. 3643–3657, Jun. 2020.
- [16] Q. Zheng, R. He, B. Ai et al., "Channel non-line-of-sight identification based on convolutional neural networks," IEEE Wireless Commun. Lett., vol. 9, no. 9, pp. 1500–1504, Sep. 2020.
- [17] X. Cheng et al., "Intelligent multi-modal sensing-communication integration: Synesthesia of machines," IEEE Commun. Surveys Tuts., vol. 26, no. 1, pp. 258–301, Nov. 2024.
- [18] L. Bai, Z. Huang, M. Sun, X. Cheng, and L. Cui, "Multi-modal intelligent channel modeling: A new modeling paradigm via synesthesia of machines," IEEE Commun. Surveys Tuts., vol. 28, pp. 2612–2649, Apr. 2025.
- [19] L. Bai, Z. Han, X. Cai, and X. Cheng, "Multi-modal intelligent channel modeling framework for 6G-enabled networked intelligent systems," IEEE Wireless Commun. Mag., early access, 2026.
- [20] S. Shah, D. Dey, C. Lovett, and A. Kapoor, "AirSim: High-fidelity visual and physical simulation for autonomous vehicles," in Field and Service Robotics: Results of the 11th International Conference, Springer, 2018, pp. 621–635.
- [21] Remcom Wireless InSite, Jan. 2017. Accessed: Mar. 2022. [Online]. Available: https://www.remcom.com/wireless-insite-em-propagation-software
- [22] M. Si, Y. Wang, H. Siljak, C. Seow, and H. Yang, "A lightweight CIR-based CNN with MLP for NLOS/LOS identification in a UWB positioning system," IEEE Commun. Lett., vol. 27, no. 5, pp. 1332–1336, May 2023.
- [23] A. Dosovitskiy et al., "An image is worth 16x16 words: Transformers for image recognition at scale," 2020, arXiv:2010.11929.
- [24] V. Sze, Y.-H. Chen, T.-J. Yang, and J. S. Emer, "Efficient processing of deep neural networks: A tutorial and survey," Proc. IEEE, vol. 105, no. 12, pp. 2295–2329, Dec. 2017.
- [25] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, pp. 436–444, May 2015.
- [26] A. Mao, M. Mohri, and Y. Zhong, "Cross-entropy loss function: Theoretical analysis and applications," in Proc. International Conference on Machine Learning, vol. 202, pp. 23803–23828, Jul. 2023.
- [27] Y. Wang, Q. Yao, J. Kwok, and L. M. Ni, "Generalizing from a few examples: A survey on few-shot learning," ACM Comput. Surveys, vol. 53, no. 3, pp. 1–34, Jun. 2020.
- [28] D. Hendrycks and T. G. Dietterich, "Benchmarking neural network robustness to common corruptions and perturbations," 2019, arXiv:1903.12261.
- [29] B. Jähne, Digital Image Processing. Berlin, Germany: Springer, 2005.
- [30] L. Kramarić, N. Jelušić, T. Radišić, and M. Muštra, "A comprehensive survey on short-distance localization of UAVs," Drones, vol. 9, no. 3, p. 188, Mar. 2025.