Enhancing 6G Wireless Intelligence: Do LLMs Work for CSI Prediction?
Pith reviewed 2026-05-13 17:02 UTC · model grok-4.3
The pith
An LLM for OTFS channel prediction performs better when given the maximum Doppler frequency as a physical descriptor.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The proposed physics-aware LLM-based predictor learns the temporal evolution of OTFS channel coefficients from historical channel observations while incorporating mobility-related physical descriptors such as maximum Doppler frequency, achieving lower normalized mean square error than classical deep learning predictors and LLM-based predictors without physical descriptors across user velocities of 100 to 500 km/h.
What carries the argument
The physics-aware LLM predictor that combines historical OTFS channel observations with the scalar maximum Doppler frequency to model temporal channel evolution.
If this is right
- Accurate channel prediction reduces the need for frequent pilot transmissions in high-mobility OTFS links.
- LLMs can serve as effective sequence models for wireless channel forecasting when supplied with basic mobility parameters.
- Performance gains hold across a wide range of velocities up to 500 km/h in simulated environments.
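To make the 100 to 500 km/h range concrete, the sketch below converts speed to maximum Doppler frequency with f_D = v * f_c / c and applies a common rule of thumb for coherence time, T_c ≈ 0.423 / f_D. The 5 GHz carrier is an assumption chosen only for illustration, since the carrier frequency is not stated on this page, and the function names are hypothetical.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def max_doppler_hz(speed_kmh: float, carrier_hz: float) -> float:
    """Maximum Doppler shift f_D = v * f_c / c, with speed given in km/h."""
    speed_m_s = speed_kmh / 3.6
    return speed_m_s * carrier_hz / SPEED_OF_LIGHT_M_S


def coherence_time_s(f_d_hz: float) -> float:
    """Rule-of-thumb coherence time T_c ~ 0.423 / f_D (an approximation)."""
    return 0.423 / f_d_hz


# Assumed carrier frequency (not taken from the paper).
ASSUMED_CARRIER_HZ = 5e9

for speed in (100, 200, 300, 400, 500):
    f_d = max_doppler_hz(speed, ASSUMED_CARRIER_HZ)
    t_c_us = coherence_time_s(f_d) * 1e6
    print(f"{speed} km/h: f_D = {f_d:.1f} Hz, T_c = {t_c_us:.0f} us")
```

Under this assumed carrier, the coherence time shrinks in inverse proportion to speed, which is why pilot-based estimates go stale quickly at the upper end of the range.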
Where Pith is reading between the lines
- Minimal physical inputs like a single Doppler scalar may allow LLMs to generalize better across varying wireless conditions than purely data-driven approaches.
- Real-world validation would require testing on measured channels that include hardware impairments and complex scattering not captured in the simulations.
Load-bearing premise
That adding only the scalar maximum Doppler frequency supplies enough physical knowledge to generalize across real-world scattering environments and hardware impairments not present in the simulations.
What would settle it
A direct comparison of prediction NMSE on real measured high-mobility channels containing unmodeled effects such as hardware distortions or non-isotropic scattering would show whether the accuracy advantage disappears.
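For reference, a minimal sketch of the NMSE metric such a comparison would report, assuming the common definition (total squared prediction error divided by total power of the true channel). The paper's exact normalization is not given on this page, so the definition and the function names here are assumptions.

```python
import math


def nmse(true_coeffs, predicted_coeffs):
    """NMSE = sum |h - h_hat|^2 / sum |h|^2 over complex channel coefficients."""
    if len(true_coeffs) != len(predicted_coeffs):
        raise ValueError("true and predicted sequences must have equal length")
    error_energy = sum(abs(h - h_hat) ** 2
                       for h, h_hat in zip(true_coeffs, predicted_coeffs))
    signal_energy = sum(abs(h) ** 2 for h in true_coeffs)
    if signal_energy == 0:
        raise ValueError("true channel has zero energy")
    return error_energy / signal_energy


def nmse_db(true_coeffs, predicted_coeffs):
    """NMSE in decibels; undefined (raises) when the prediction is exact."""
    return 10.0 * math.log10(nmse(true_coeffs, predicted_coeffs))


# Illustrative values only: a prediction that is 10% too small in amplitude.
h_true = [1 + 1j, 2 + 0j, -0.5j]
h_pred = [0.9 * h for h in h_true]
print(nmse(h_true, h_pred), nmse_db(h_true, h_pred))
```

Because the metric is normalized by channel power, the same function can be applied unchanged to simulated and measured channels, which is what the proposed comparison needs.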
Original abstract
In high-mobility 6G scenarios, rapidly time-varying channels lead to very short coherence times, which makes conventional pilot-based channel state information (CSI) estimation approaches prone to outdated information or excessive pilot overhead. Therefore, channel prediction becomes essential in such dynamic wireless systems. To address this challenge, large language models (LLMs) are emerging learning frameworks that have recently attracted attention for CSI prediction due to their strong sequence modeling capability and ability to generalize across different environments. This paper proposes an LLM-based framework for channel prediction in high-mobility orthogonal time frequency space (OTFS) communication systems. In this work, we develop a physics-aware LLM-based predictor that learns the temporal evolution of OTFS channel coefficients from historical channel observations while incorporating mobility-related physical descriptors (e.g., maximum Doppler frequency) to achieve accurate prediction of future channel states in rapidly time-varying environments. The effectiveness of the proposed framework is evaluated through extensive simulations under user velocities ranging from 100 to 500 km/h. Numerical results show that the proposed method consistently achieves lower normalized mean square error (NMSE) compared with both classical deep learning predictors and LLM-based predictors without physical channel descriptors. These results demonstrate the advantage of integrating mobility-related channel knowledge with LLM-based sequence modeling for channel prediction in highly dynamic OTFS systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a physics-aware LLM-based framework for CSI prediction in high-mobility OTFS systems. It augments historical channel observations with mobility-related physical descriptors such as maximum Doppler frequency to model temporal channel evolution and reports lower NMSE than classical deep-learning predictors and non-physics LLM baselines across simulations with user velocities of 100–500 km/h.
Significance. If the NMSE advantage survives distribution shift, the work would usefully demonstrate how LLMs can be augmented with lightweight physical descriptors for channel prediction in 6G high-mobility scenarios, potentially reducing pilot overhead. The current evidence, however, is confined to simulations in which the supplied Doppler scalar exactly matches the channel-generation model, limiting the assessed significance.
major comments (2)
- [Numerical Results] Simulation Setup: The maximum Doppler frequency is provided as an explicit scalar input that is identical to the parameter controlling temporal correlation in the OTFS channel generator. This creates a risk that the reported NMSE gains reflect exploitation of simulator leakage rather than learned generalization across scattering environments or hardware impairments. A concrete test with mismatched Doppler values, altered power-delay profiles, or non-stationary channels is required to support the central claim of robust physics-aware prediction.
- [Numerical Results] Baseline Comparison: The paper states that the proposed method outperforms 'LLM-based predictors without physical channel descriptors,' yet the exact architecture, input formatting, and training protocol of these baselines are not specified in sufficient detail to confirm they are fairly matched in capacity and optimization. This detail is load-bearing for the claim that the performance gain is attributable to the physics descriptor rather than implementation differences.
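The mismatched-Doppler test requested in the first major comment can be illustrated with a toy model. The sketch below is a stand-in, not the paper's predictor or channel generator: it assumes a single-tone channel and a one-step predictor that simply rotates the last sample by the phase implied by the supplied Doppler value. The sampling step, the 463 Hz Doppler value, and all function names are hypothetical.

```python
import cmath
import math


def single_tone_channel(f_d_hz, step_s, n_samples):
    """Toy channel: one complex exponential rotating at the true Doppler."""
    return [cmath.exp(2j * math.pi * f_d_hz * step_s * k)
            for k in range(n_samples)]


def one_step_predict(last_sample, supplied_f_d_hz, step_s):
    """Advance the last observation by the phase the supplied Doppler implies."""
    return last_sample * cmath.exp(2j * math.pi * supplied_f_d_hz * step_s)


def mismatch_nmse(true_f_d_hz, relative_offset, step_s=1e-4, n_samples=200):
    """One-step NMSE when the descriptor is off by relative_offset (0.1 = 10%)."""
    h = single_tone_channel(true_f_d_hz, step_s, n_samples)
    supplied = true_f_d_hz * (1.0 + relative_offset)
    error_energy = 0.0
    signal_energy = 0.0
    for k in range(1, n_samples):
        prediction = one_step_predict(h[k - 1], supplied, step_s)
        error_energy += abs(h[k] - prediction) ** 2
        signal_energy += abs(h[k]) ** 2
    return error_energy / signal_energy


for offset in (0.0, 0.1, 0.2):
    print(f"offset {offset:.0%}: NMSE = {mismatch_nmse(463.0, offset):.3e}")
```

In this toy setting the error grows with the descriptor offset because the predictor trusts the descriptor completely. A learned predictor that also uses the observed history may degrade more gracefully, and measuring that difference is the purpose of the requested experiment.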
minor comments (1)
- [Abstract] The abstract and introduction would benefit from a brief statement of how the scalar Doppler descriptor is tokenized and injected into the LLM (e.g., as an additional embedding or prompt prefix) to clarify the 'physics-aware' mechanism.
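The two injection options named in the minor comment can be sketched as follows. Neither is confirmed as the paper's mechanism. The prompt wording, the normalization scale, and the zero-padded descriptor token are all assumptions made for illustration; in a real model a learned projection would typically map the scalar to an embedding.

```python
def doppler_prompt_prefix(f_d_hz: float) -> str:
    """Option 1 (assumed): describe the descriptor in a text prompt prefix."""
    return f"Maximum Doppler frequency: {f_d_hz:.1f} Hz."


def prepend_descriptor_token(csi_tokens, f_d_hz, f_d_scale_hz):
    """Option 2 (assumed): prepend one extra token carrying the scalar.

    csi_tokens is a list of equal-length float vectors (one per time step).
    The extra token holds the normalized Doppler in its first slot and zeros
    elsewhere, a simple stand-in for a learned embedding.
    """
    if not csi_tokens:
        raise ValueError("need at least one CSI token")
    dim = len(csi_tokens[0])
    descriptor_token = [f_d_hz / f_d_scale_hz] + [0.0] * (dim - 1)
    return [descriptor_token] + list(csi_tokens)


history = [[0.1, 0.2, 0.3, 0.4], [0.2, 0.1, 0.4, 0.3]]  # illustrative values
print(doppler_prompt_prefix(463.28))
print(prepend_descriptor_token(history, f_d_hz=500.0, f_d_scale_hz=1000.0))
```

The choice matters for the fairness question raised above: a prompt prefix changes only the input text, while an extra token changes the sequence length seen by the model, so the baseline should be matched accordingly.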
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and will revise the manuscript to strengthen the claims.
- Referee: The maximum Doppler frequency is provided as an explicit scalar input that is identical to the parameter controlling temporal correlation in the OTFS channel generator. This creates a risk that reported NMSE gains reflect exploitation of simulator leakage rather than learned generalization across scattering environments or hardware impairments. A concrete test with mismatched Doppler values, altered power-delay profiles, or non-stationary channels is required to support the central claim of robust physics-aware prediction.
  Authors: We acknowledge this valid concern regarding potential simulator leakage. The Doppler scalar is supplied as a lightweight physical descriptor to inform the model of expected temporal dynamics, but we agree that matched parameters alone do not fully demonstrate robustness. In the revision we will add new experiments using intentionally mismatched Doppler inputs (e.g., 10-20% offset from the true value), altered power-delay profiles drawn from a different distribution, and non-stationary channel realizations. These results will be reported to substantiate generalization beyond the original simulation setup.
  Revision: yes
- Referee: The paper states that the proposed method outperforms 'LLM-based predictors without physical channel descriptors,' yet the exact architecture, input formatting, and training protocol of these baselines are not specified in sufficient detail to confirm they are fairly matched in capacity and optimization. This detail is load-bearing for the claim that the performance gain is attributable to the physics descriptor rather than implementation differences.
  Authors: We agree that insufficient baseline detail weakens the attribution of gains to the physics descriptors. The revised manuscript will expand Section IV to specify the baseline LLM architectures (layer count, attention heads, hidden dimension), exact input formatting (tokenization of historical CSI sequences without Doppler or other descriptors), and training protocols (optimizer, learning-rate schedule, batch size, and number of epochs). This will confirm that baselines are capacity-matched and that observed improvements arise from the physics-aware augmentation.
  Revision: yes
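A capacity-matching check of the kind promised here could look like the sketch below. It uses a rough parameter count for standard transformer blocks (attention projections plus a feed-forward layer, ignoring biases, layer norms, and embeddings). The configuration fields mirror the quantities the authors list, but the class, the function names, and the 5% tolerance are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerConfig:
    layers: int
    heads: int
    hidden: int
    ffn_mult: int = 4  # feed-forward width as a multiple of hidden size


def approx_block_params(cfg: TransformerConfig) -> int:
    """Approximate weight count of the transformer blocks only."""
    if cfg.hidden % cfg.heads != 0:
        raise ValueError("hidden size must be divisible by the head count")
    attention = 4 * cfg.hidden ** 2                     # Q, K, V, output
    feed_forward = 2 * cfg.hidden * cfg.ffn_mult * cfg.hidden
    return cfg.layers * (attention + feed_forward)


def capacity_matched(a: TransformerConfig, b: TransformerConfig,
                     tolerance: float = 0.05) -> bool:
    """True if the two block parameter counts differ by at most tolerance."""
    params_a, params_b = approx_block_params(a), approx_block_params(b)
    return abs(params_a - params_b) <= tolerance * max(params_a, params_b)


proposed = TransformerConfig(layers=12, heads=12, hidden=768)
baseline = TransformerConfig(layers=12, heads=12, hidden=768)
print(approx_block_params(proposed), capacity_matched(proposed, baseline))
```

Note that the head count does not change this estimate, so two models can match in parameters while differing in attention structure. Reporting all three quantities, as the authors propose, covers that case.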
Circularity Check
No circularity: empirical NMSE comparison stands on independent simulation benchmarks
The paper reports an empirical result: an LLM predictor supplied with the scalar maximum Doppler frequency achieves lower NMSE than classical DL and plain-LLM baselines when tested on OTFS channels generated at 100–500 km/h. No derivation chain, equation, or self-citation is presented that reduces the reported NMSE value to a fitted parameter by construction. The Doppler scalar is an explicit, externally supplied input that matches the simulator’s generation parameter, but the task remains a genuine forward prediction of future channel coefficients; the performance delta is measured against held-out realizations and is not tautological. Because the central claim is a comparative numerical outcome rather than a closed-form identity or a uniqueness theorem imported from the authors’ prior work, the evaluation is self-contained against the stated simulation benchmarks and receives the default non-circularity finding.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: OTFS channel coefficients evolve according to a time-varying model governed by the maximum Doppler frequency.
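One standard instance of such a model, given here only as an illustration, is the Clarke/Jakes autocorrelation for isotropic scattering. Whether the paper's simulator uses this model is not stated on this page, so it is an assumption. Here h(t) is a unit-power channel coefficient, f_D is the maximum Doppler frequency, and J_0 is the zeroth-order Bessel function of the first kind.

```latex
R_h(\tau) = \mathbb{E}\!\left[ h(t)\, h^{*}(t+\tau) \right] = J_0\!\left( 2\pi f_D \tau \right)
```

Under this assumption a single scalar f_D fully determines the temporal correlation, which is why supplying it to the predictor is so informative in simulation and why non-isotropic scattering is a meaningful stress test.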
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Passage: "we develop a physics-aware LLM-based predictor that learns the temporal evolution of OTFS channel coefficients from historical channel observations while incorporating mobility-related physical descriptors (e.g., maximum Doppler frequency)"
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Passage: "the physical descriptor is represented by the maximum Doppler frequency, which reflects the mobility-induced temporal variation of the channel"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.