
arxiv: 2604.04028 · v1 · submitted 2026-04-05 · 📡 eess.SP

Enhancing 6G Wireless Intelligence: Do LLMs Work for CSI Prediction?

Pith reviewed 2026-05-13 17:02 UTC · model grok-4.3

classification 📡 eess.SP
keywords OTFS channel prediction · LLM-based predictors · 6G high-mobility · CSI estimation · maximum Doppler frequency · normalized mean square error · physics-aware models

The pith

An LLM for OTFS channel prediction performs better when given the maximum Doppler frequency as a physical descriptor.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether large language models can predict future channel state information in high-mobility OTFS systems. It introduces a version that adds the scalar maximum Doppler frequency to the input sequence. Simulations at speeds from 100 to 500 km/h show this physics-aware version yields lower normalized mean square error than both standard deep learning predictors and LLMs without the Doppler input. The result matters because short coherence times in fast-moving 6G scenarios make accurate prediction essential for avoiding outdated CSI or high pilot costs.
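As a rough illustration of the scalar descriptor involved, the maximum Doppler frequency follows from the user speed through the standard relation f_D = v · f_c / c. The sketch below evaluates it over the 100 to 500 km/h range. The carrier frequency is an assumption made here for illustration (this page does not state the paper's value), and `max_doppler_hz` is a hypothetical helper name. The printed 1/f_D is only an order-of-magnitude indicator of how quickly the channel decorrelates.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def max_doppler_hz(speed_kmh: float, carrier_hz: float) -> float:
    """Maximum Doppler shift f_D = v * f_c / c for a speed given in km/h."""
    speed_ms = speed_kmh / 3.6  # km/h -> m/s
    return speed_ms * carrier_hz / C_LIGHT


if __name__ == "__main__":
    # ASSUMPTION: illustrative carrier frequency, not taken from the paper.
    carrier_hz = 5.0e9
    for speed_kmh in (100, 200, 300, 400, 500):
        f_d = max_doppler_hz(speed_kmh, carrier_hz)
        # 1 / f_D is a coarse time scale for channel variation, in milliseconds.
        print(f"{speed_kmh} km/h: f_D = {f_d:.1f} Hz, 1/f_D = {1e3 / f_d:.3f} ms")
```

Because f_D grows linearly with speed, a fivefold increase in velocity shrinks this time scale by the same factor, which is why prediction becomes harder at the upper end of the range.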

Core claim

The proposed physics-aware LLM-based predictor learns the temporal evolution of OTFS channel coefficients from historical channel observations while incorporating mobility-related physical descriptors such as maximum Doppler frequency, achieving lower normalized mean square error than classical deep learning predictors and LLM-based predictors without physical descriptors across user velocities of 100 to 500 km/h.

What carries the argument

The physics-aware LLM predictor that combines historical OTFS channel observations with the scalar maximum Doppler frequency to model temporal channel evolution.
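This page does not say how the Doppler scalar is injected into the model, so the following is only one plausible arrangement: each past channel coefficient becomes a feature vector of its real and imaginary parts, with a normalized Doppler value appended at every time step. The function name, the feature layout, and the normalization constant are all hypothetical. Setting `include_doppler=False` produces the input a descriptor-free baseline would see.

```python
from typing import List, Sequence


def build_input_sequence(
    history: Sequence[complex],
    f_d_hz: float,
    f_d_scale_hz: float,
    include_doppler: bool = True,
) -> List[List[float]]:
    """Map past channel coefficients to per-step feature vectors.

    Each step is [Re(h), Im(h)] and, if requested, the Doppler value
    divided by a fixed scale so it sits in a similar numeric range.
    """
    if f_d_scale_hz <= 0:
        raise ValueError("f_d_scale_hz must be positive")
    doppler_feature = f_d_hz / f_d_scale_hz
    sequence = []
    for h in history:
        step = [h.real, h.imag]
        if include_doppler:
            # ASSUMPTION: the scalar is repeated at every step; a prompt
            # prefix or a separate embedding would be equally valid choices.
            step.append(doppler_feature)
        sequence.append(step)
    return sequence
```

Keeping the two variants behind one flag makes it easier to compare them with otherwise identical preprocessing, so that any difference in error can be attributed to the descriptor.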

If this is right

  • Accurate channel prediction reduces the need for frequent pilot transmissions in high-mobility OTFS links.
  • LLMs can serve as effective sequence models for wireless channel forecasting when supplied with basic mobility parameters.
  • Performance gains hold across a wide range of velocities up to 500 km/h in simulated environments.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Minimal physical inputs like a single Doppler scalar may allow LLMs to generalize better across varying wireless conditions than purely data-driven approaches.
  • Real-world validation would require testing on measured channels that include hardware impairments and complex scattering not captured in the simulations.

Load-bearing premise

That adding only the scalar maximum Doppler frequency supplies enough physical knowledge to generalize across real-world scattering environments and hardware impairments not present in the simulations.

What would settle it

A direct comparison of prediction NMSE on real measured high-mobility channels containing unmodeled effects such as hardware distortions or non-isotropic scattering would show whether the accuracy advantage disappears.
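For reference, NMSE is commonly computed as the squared prediction error divided by the energy of the true channel. The sketch below uses that usual definition; the paper's exact averaging convention (per sample, per frame, or over Monte Carlo runs) is not given on this page, so treat the pooling over all samples here as an assumption.

```python
import math
from typing import Sequence


def nmse(h_true: Sequence[complex], h_pred: Sequence[complex]) -> float:
    """Normalized mean square error: sum|h_pred - h_true|^2 / sum|h_true|^2."""
    if len(h_true) != len(h_pred):
        raise ValueError("h_true and h_pred must have the same length")
    reference_energy = sum(abs(t) ** 2 for t in h_true)
    if reference_energy == 0:
        raise ValueError("true channel has zero energy; NMSE is undefined")
    error_energy = sum(abs(p - t) ** 2 for t, p in zip(h_true, h_pred))
    return error_energy / reference_energy


def nmse_db(h_true: Sequence[complex], h_pred: Sequence[complex]) -> float:
    """NMSE in decibels; returns -inf for a perfect prediction."""
    value = nmse(h_true, h_pred)
    return float("-inf") if value == 0 else 10.0 * math.log10(value)
```

The same function can score simulated and measured channels alike, which is what a like-for-like comparison between the two would need.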

Figures

Figures reproduced from arXiv: 2604.04028 by Jürgen Jasperneite, Mohsen Kazemian.

Figure 1. NMSE performance versus user velocity for different … [caption truncated]
Figure 2. NMSE performance versus prediction horizon (number … [caption truncated]
original abstract

In high-mobility 6G scenarios, rapidly time-varying channels lead to very short coherence times, which makes conventional pilot-based channel state information (CSI) estimation approaches prone to outdated information or excessive pilot overhead. Therefore, channel prediction becomes essential in such dynamic wireless systems. To address this challenge, large language models (LLMs) are emerging learning frameworks that have recently attracted attention for CSI prediction due to their strong sequence modeling capability and ability to generalize across different environments. This paper proposes an LLM-based framework for channel prediction in high-mobility orthogonal time frequency space (OTFS) communication systems. In this work, we develop a physics-aware LLM-based predictor that learns the temporal evolution of OTFS channel coefficients from historical channel observations while incorporating mobility-related physical descriptors (e.g., maximum Doppler frequency) to achieve accurate prediction of future channel states in rapidly time-varying environments. The effectiveness of the proposed framework is evaluated through extensive simulations under user velocities ranging from 100 to 500 km/h. Numerical results show that the proposed method consistently achieves lower normalized mean square error (NMSE) compared with both classical deep learning predictors and LLM-based predictors without physical channel descriptors. These results demonstrate the advantage of integrating mobility-related channel knowledge with LLM-based sequence modeling for channel prediction in highly dynamic OTFS systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a physics-aware LLM-based framework for CSI prediction in high-mobility OTFS systems. It augments historical channel observations with mobility-related physical descriptors such as maximum Doppler frequency to model temporal channel evolution and reports lower NMSE than classical deep-learning predictors and non-physics LLM baselines across simulations with user velocities 100–500 km/h.

Significance. If the NMSE advantage survives distribution shift, the work would usefully demonstrate how LLMs can be augmented with lightweight physical descriptors for channel prediction in 6G high-mobility scenarios, potentially reducing pilot overhead. The current evidence, however, is confined to simulations in which the supplied Doppler scalar exactly matches the channel-generation model, limiting the assessed significance.

major comments (2)
  1. [Numerical Results] Simulation setup: The maximum Doppler frequency is provided as an explicit scalar input that is identical to the parameter controlling temporal correlation in the OTFS channel generator. This creates a risk that reported NMSE gains reflect exploitation of simulator leakage rather than learned generalization across scattering environments or hardware impairments. A concrete test with mismatched Doppler values, altered power-delay profiles, or non-stationary channels is required to support the central claim of robust physics-aware prediction.
  2. [Numerical Results] Baseline Comparison: The paper states that the proposed method outperforms 'LLM-based predictors without physical channel descriptors,' yet the exact architecture, input formatting, and training protocol of these baselines are not specified in sufficient detail to confirm they are fairly matched in capacity and optimization. This detail is load-bearing for the claim that the performance gain is attributable to the physics descriptor rather than implementation differences.
minor comments (1)
  1. [Abstract] The abstract and introduction would benefit from a brief statement of how the scalar Doppler descriptor is tokenized and injected into the LLM (e.g., as an additional embedding or prompt prefix) to clarify the 'physics-aware' mechanism.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will revise the manuscript to strengthen the claims.

point-by-point responses
  1. Referee: The maximum Doppler frequency is provided as an explicit scalar input that is identical to the parameter controlling temporal correlation in the OTFS channel generator. This creates a risk that reported NMSE gains reflect exploitation of simulator leakage rather than learned generalization across scattering environments or hardware impairments. A concrete test with mismatched Doppler values, altered power-delay profiles, or non-stationary channels is required to support the central claim of robust physics-aware prediction.

    Authors: We acknowledge this valid concern regarding potential simulator leakage. The Doppler scalar is supplied as a lightweight physical descriptor to inform the model of expected temporal dynamics, but we agree that matched parameters alone do not fully demonstrate robustness. In the revision we will add new experiments using intentionally mismatched Doppler inputs (e.g., 10-20% offset from the true value), altered power-delay profiles drawn from a different distribution, and non-stationary channel realizations. These results will be reported to substantiate generalization beyond the original simulation setup. revision: yes

  2. Referee: The paper states that the proposed method outperforms 'LLM-based predictors without physical channel descriptors,' yet the exact architecture, input formatting, and training protocol of these baselines are not specified in sufficient detail to confirm they are fairly matched in capacity and optimization. This detail is load-bearing for the claim that the performance gain is attributable to the physics descriptor rather than implementation differences.

    Authors: We agree that insufficient baseline detail weakens the attribution of gains to the physics descriptors. The revised manuscript will expand Section IV to specify the baseline LLM architectures (layer count, attention heads, hidden dimension), exact input formatting (tokenization of historical CSI sequences without Doppler or other descriptors), and training protocols (optimizer, learning-rate schedule, batch size, and number of epochs). This will confirm that baselines are capacity-matched and that observed improvements arise from the physics-aware augmentation. revision: yes
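The mismatched-Doppler experiment described in the first response can be set up with very little machinery. The sketch below only generates the perturbed descriptor values: the channel would still be simulated with the true Doppler frequency while the predictor is fed the offset value. The 10% and 20% offsets come from the response above; the function names, the random sign convention, and the seed are assumptions, and the channel generator and predictor are deliberately left out.

```python
import random
from typing import List, Sequence, Tuple


def mismatched_doppler(
    f_d_true_hz: float, rel_offset: float, rng: random.Random
) -> float:
    """Return the true Doppler value shifted by +/- rel_offset (random sign)."""
    sign = rng.choice((-1.0, 1.0))
    return f_d_true_hz * (1.0 + sign * rel_offset)


def mismatch_sweep(
    f_d_true_hz: float,
    offsets: Sequence[float] = (0.0, 0.10, 0.20),
    seed: int = 0,
) -> List[Tuple[float, float]]:
    """Pair each relative offset with the descriptor value fed to the predictor."""
    rng = random.Random(seed)  # fixed seed keeps the sweep reproducible
    return [(off, mismatched_doppler(f_d_true_hz, off, rng)) for off in offsets]


if __name__ == "__main__":
    for offset, f_d_input in mismatch_sweep(1000.0):
        print(f"offset {offset:.0%}: predictor sees f_D = {f_d_input:.1f} Hz")
```

If the NMSE advantage over the descriptor-free baseline persists as the offset grows, that would suggest the model uses the scalar as a soft hint rather than relying on its exact agreement with the simulator.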

Circularity Check

0 steps flagged

No circularity: empirical NMSE comparison stands on independent simulation benchmarks

full rationale

The paper reports an empirical result: an LLM predictor supplied with the scalar maximum Doppler frequency achieves lower NMSE than classical DL and plain-LLM baselines when tested on OTFS channels generated at 100–500 km/h. No derivation chain, equation, or self-citation is presented that reduces the reported NMSE value to a fitted parameter by construction. The Doppler scalar is an explicit, externally supplied input that matches the simulator’s generation parameter, but the task remains a genuine forward prediction of future channel coefficients; the performance delta is measured against held-out realizations and is not tautological. Because the central claim is a comparative numerical outcome rather than a closed-form identity or a uniqueness theorem imported from the authors’ prior work, the evaluation is self-contained against the stated simulation benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on standard assumptions of wide-sense stationary uncorrelated scattering channels and perfect knowledge of maximum Doppler frequency; no new entities are postulated and no free parameters are fitted beyond ordinary training.

axioms (1)
  • domain assumption OTFS channel coefficients evolve according to a time-varying model governed by maximum Doppler frequency
    Invoked when the physics descriptor is added to the LLM input
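To make the axiom concrete, one widely used model of this kind is the Clarke/Jakes isotropic-scattering model, in which the temporal autocorrelation of a fading tap is J0(2π f_D τ). Whether the paper uses exactly this model is not stated on this page, so the sketch below is an assumption offered only to show how a single Doppler scalar can fix the temporal correlation. The Bessel function is evaluated from its integral form to avoid external dependencies.

```python
import math


def bessel_j0(x: float, n_points: int = 200) -> float:
    """J0(x) from (1/pi) * integral_0^pi cos(x * sin(theta)) d(theta).

    Uses the midpoint rule, which converges quickly here because the
    integrand is smooth and periodic over the integration interval.
    """
    step = math.pi / n_points
    total = sum(
        math.cos(x * math.sin((k + 0.5) * step)) for k in range(n_points)
    )
    return total / n_points


def jakes_autocorrelation(f_d_hz: float, lag_s: float) -> float:
    """Normalized temporal autocorrelation J0(2*pi*f_D*tau) under Clarke/Jakes."""
    return bessel_j0(2.0 * math.pi * f_d_hz * lag_s)
```

Under this model, a predictor that knows f_D in effect knows the whole correlation function of the simulated channel, which is the leakage concern raised in the referee report.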

pith-pipeline@v0.9.0 · 5532 in / 1131 out tokens · 29755 ms · 2026-05-13T17:02:08.655954+00:00 · methodology


