Deep Learning-Based Computer Vision for Beam Selection and Proactive Blockage Prediction
Pith reviewed 2026-05-08 16:28 UTC · model grok-4.3
The pith
RGB imagery fused with power profiles enables 98.96% beam prediction accuracy and over 98% blockage forecasting in mmWave systems
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We address propagation loss through a novel vision-aided beam selection framework that integrates RGB imagery with received power profiles for efficient transmitter identification and beam prediction. This framework achieves 98.96% top-5 beam prediction accuracy, surpassing current state-of-the-art methods by at least 6% across all metrics. We address penetration loss through a proactive blockage prediction framework using a modified object tracker with weighted centroid-based depth estimation. This represents the first analysis of simultaneous non-uniform mobility of both transmitters and obstacles. Evaluated on completely unseen data, this framework achieves over 98% accuracy in predicting blockages up to three frames ahead.
What carries the argument
Vision-aided beam selection framework that integrates RGB imagery with received power profiles for transmitter identification and beam prediction, paired with a modified object tracker using weighted centroid-based depth estimation for proactive blockage forecasting
Load-bearing premise
The premise is that RGB imagery combined with received power profiles will be available in real time at both transmitter and receiver, and that models trained on the authors' datasets will generalize to arbitrary real-world mobility patterns and lighting conditions.
What would settle it
Testing the trained models on new outdoor data with sudden lighting changes and unpredictable simultaneous movements of transmitters and obstacles, then checking whether top-5 beam accuracy drops below 90% or blockage prediction accuracy falls below 90% for three-frame horizons.
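A check of this kind could be scripted roughly as follows. This is a minimal sketch, not the paper's evaluation code: the function names `top_k_accuracy` and `claim_survives`, the toy scores, and the labels are all hypothetical, and only the 90% threshold comes from the test described above.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int = 5
) -> float:
    """Fraction of samples whose true beam index is among the k highest scores."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equally long")
    hits = 0
    for row, true_beam in zip(scores, labels):
        # Beam indices ordered from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += true_beam in top_k
    return hits / len(labels)


def claim_survives(accuracy: float, threshold: float = 0.90) -> bool:
    """True if the measured accuracy stays at or above the threshold."""
    return accuracy >= threshold


# Toy data (hypothetical): 3 samples, 4 candidate beams each.
scores = [
    [0.10, 0.70, 0.10, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.05, 0.10, 0.80],
]
labels = [1, 2, 0]

acc = top_k_accuracy(scores, labels, k=2)
print(f"top-2 accuracy: {acc:.3f}, claim survives: {claim_survives(acc)}")
```

On real data, `scores` would hold the model's per-beam outputs on the new outdoor recordings and `k` would be 5 to match the headline metric.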
Original abstract
Millimeter-wave communication faces two critical challenges: propagation losses requiring costly narrow-beam alignment, and penetration losses causing link failures from blocked line-of-sight paths. We address propagation loss through a novel vision-aided beam selection framework that integrates RGB imagery with received power profiles for efficient transmitter identification and beam prediction. This framework achieves 98.96% top-5 beam prediction accuracy, surpassing current state-of-the-art methods by at least 6% across all metrics. We address penetration loss through a proactive blockage prediction framework using a modified object tracker with weighted centroid-based depth estimation. This represents the first analysis of simultaneous non-uniform mobility of both transmitters and obstacles. Evaluated on completely unseen data, this framework achieves over 98% accuracy in predicting blockages up to three frames ahead, establishing strong performance benchmarks.
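One way to read "predicting blockages up to three frames ahead" is as a binary label per frame that says whether a blockage occurs within the next few frames. The sketch below illustrates that reading only. The abstract does not define the labeling rule, so the window definition, the function names, and the toy sequence are assumptions.

```python
from typing import List, Sequence


def future_blockage_labels(blocked: Sequence[int], horizon: int) -> List[int]:
    """Label frame t as 1 if any of frames t+1 .. t+horizon is blocked.

    Frames near the end that lack a full look-ahead window are dropped.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    usable = max(len(blocked) - horizon, 0)
    return [int(any(blocked[t + 1 : t + 1 + horizon])) for t in range(usable)]


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of frames where the predicted label matches the target label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and equally long")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical per-frame link status: 1 = line of sight blocked, 0 = clear.
blocked = [0, 0, 0, 0, 0, 1, 1, 0]
targets = future_blockage_labels(blocked, horizon=3)

# Hypothetical model output for the same frames.
predicted = [0, 0, 1, 1, 0]
print(targets, accuracy(predicted, targets))
```

If the paper instead reports a separate accuracy for each of the one-, two-, and three-frame horizons, the same helper can be called with `horizon=1`, `2`, and `3` in turn.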
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes two deep learning-based computer vision frameworks for millimeter-wave communications: a vision-aided beam selection method that fuses RGB imagery with received power profiles to achieve 98.96% top-5 beam prediction accuracy (outperforming prior SOTA by at least 6% across metrics), and a proactive blockage prediction system based on a modified object tracker with weighted centroid depth estimation that reaches over 98% accuracy on unseen data for up to three frames ahead, including the first reported analysis of simultaneous non-uniform mobility between transmitter and obstacles.
Significance. If the empirical results hold under the reported evaluation protocol, the work offers a practical contribution to mmWave system design by demonstrating how standard camera feeds can reduce beam alignment overhead and preempt link outages. Explicit dataset splits, re-implemented baselines on identical data, and architecture details strengthen the reproducibility of the performance claims. The focus on real-time mobility scenarios addresses a relevant deployment gap in 5G/6G networks.
minor comments (3)
- Abstract: the statement that the beam selection framework 'surpasses current state-of-the-art methods by at least 6% across all metrics' would be clearer if the specific metrics (e.g., top-1, top-3) and the exact SOTA references being compared were named inline rather than left to the reader to locate in the results section.
- §4 (blockage prediction): the description of the 'weighted centroid-based depth estimation' would benefit from an explicit equation or pseudocode step showing how the weights are computed from the tracker output, as the current prose leaves the weighting rule ambiguous for replication.
- The manuscript would be strengthened by adding a short paragraph on inference latency and memory footprint of the two models on embedded hardware, given the real-time requirements implied by the proactive prediction task.
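To make the second comment concrete, the sketch below shows the kind of explicit rule the referee is asking for. It is not the authors' method: the Gaussian weighting around the bounding-box centroid, the `sigma` parameter, the half-open box convention, and the function name `weighted_centroid_depth` are all assumptions chosen for illustration. The depth map is assumed to come from some per-pixel depth estimator.

```python
import math
from typing import Sequence, Tuple


def weighted_centroid_depth(
    depth_map: Sequence[Sequence[float]],
    box: Tuple[int, int, int, int],
    sigma: float = 0.25,
) -> float:
    """Weighted mean depth inside a tracker box, emphasizing pixels near its centroid.

    box is (x1, y1, x2, y2) in pixels, covering columns x1..x2-1 and rows y1..y2-1.
    The weights are an assumed Gaussian in box-normalized distance from the centroid.
    """
    x1, y1, x2, y2 = box
    if x2 <= x1 or y2 <= y1:
        raise ValueError("box must have positive width and height")
    width, height = x2 - x1, y2 - y1
    cx = (x1 + x2 - 1) / 2.0  # centroid column of the covered pixels
    cy = (y1 + y2 - 1) / 2.0  # centroid row of the covered pixels
    weighted_sum = 0.0
    weight_total = 0.0
    for y in range(y1, y2):
        for x in range(x1, x2):
            dx = (x - cx) / width
            dy = (y - cy) / height
            w = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            weighted_sum += w * depth_map[y][x]
            weight_total += w
    return weighted_sum / weight_total


# Toy 3x3 depth map (hypothetical): a near object at the center, far background around it.
depth_map = [
    [10.0, 10.0, 10.0],
    [10.0, 2.0, 10.0],
    [10.0, 10.0, 10.0],
]
estimate = weighted_centroid_depth(depth_map, (0, 0, 3, 3))
plain_mean = sum(sum(row) for row in depth_map) / 9.0
print(estimate, plain_mean)
```

In this toy case the weighted estimate sits closer to the central object's depth than the plain mean does, which is the usual motivation for down-weighting background pixels near the box edges.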
Simulated Author's Rebuttal
We thank the referee for the positive evaluation of our manuscript, the recognition of its contributions to vision-aided beam selection and proactive blockage prediction, and the recommendation for minor revision. We appreciate the emphasis on reproducibility and the relevance to 5G/6G deployment scenarios.
Circularity Check
No significant circularity identified
full rationale
The paper presents purely empirical results from supervised deep learning models trained on RGB imagery combined with received power profiles for beam selection and a modified object tracker for proactive blockage prediction. All headline metrics (98.96% top-5 beam accuracy and >98% blockage prediction on unseen data) are obtained via explicit dataset splits and held-out evaluation, with no equations, derivations, fitted parameters renamed as predictions, or self-citation chains that reduce any claim to its own inputs by construction. The work is self-contained against external benchmarks through re-implemented baselines and standard ML evaluation protocols.
Axiom & Free-Parameter Ledger
free parameters (1)
- neural network weights and biases