Multimodal Learning for MIMO Beam Prediction Based on Variational Inference

Zijian Zheng, Wenqiang Yi, Hyundong Shin, Arumugam Nallanathan

classification eess.SP

keywords multimodal, prediction, beam, data, learning, accuracy, alignment, feature

Accurate beam prediction is essential for reducing signalling overhead and latency in massive multiple-input multiple-output (MIMO) systems enabled by integrated sensing and communication. Multimodal learning can improve prediction accuracy by leveraging complementary information from other existing sensors, but practical deployment is often constrained by the high cost of acquiring semantically aligned multimodal datasets. This paper proposes a variational-inference-based multimodal framework that decouples the optimization problem into modular feature extraction and cross-modal semantic alignment. Specifically, we develop a two-stage training strategy in which the model first uses abundant unimodal data for representation learning and then performs refined alignment on limited multimodal samples. This design improves data efficiency and ensures robust feature fusion under sensing uncertainties. Experimental results on the DeepSense6G dataset show that the proposed framework achieves competitive beam prediction accuracy and maintains high reliability while requiring only 20% of the multimodal training data used by conventional end-to-end benchmarks.
