ChannelKAN: Multi-Scale Dual-Domain Channel Prediction via Hybrid CNN-KAN Architecture
Pith reviewed 2026-05-14 21:56 UTC · model grok-4.3
The pith
ChannelKAN uses a hybrid CNN-KAN architecture to predict channel state information more accurately than RNN, LSTM, GRU, CNN or Transformer models in high-mobility wireless systems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ChannelKAN is a hybrid model that first expands CSI into complementary frequency and delay domains, retains dominant multi-scale spectral components, then applies cascaded convolutions for local correlations and Chebyshev KAN layers for long-range nonlinear temporal dependencies, finally fusing the dual-domain features to output the predicted future CSI sequence.
What carries the argument
The CNN-KAN feature extraction module, in which CNN layers capture intra-time-step local spatial-frequency correlations and KAN layers with learnable Chebyshev polynomial activations model inter-time-step nonlinear temporal evolution.
If this is right
- Improved CSI prediction directly raises achievable spectral efficiency and lowers bit error rate in high-mobility massive MIMO-OFDM links.
- The dual-domain and multi-scale modules each contribute measurable performance, as shown by ablation studies.
- The model works across a range of user velocities and signal-to-noise ratios on 3GPP-compliant datasets.
- KAN layers with Chebyshev activations replace recurrent or attention layers for long-range temporal modeling in this task.
Where Pith is reading between the lines
- Fewer pilot symbols may be needed per coherence interval if prediction accuracy remains high.
- The same hybrid pattern could be tested on other sequential signal-processing problems such as beam prediction or interference forecasting.
- Real-world deployment would require checking whether domain shift between simulation and measurement erodes the reported gains.
Load-bearing premise
Gains measured on QuaDRiGa ray-tracing simulations will carry over to real measured channels without retraining or domain adaptation.
What would settle it
A side-by-side comparison of ChannelKAN versus the same baselines on CSI traces collected from a real massive MIMO-OFDM testbed at comparable velocities and SNRs, checking whether the NMSE advantage disappears.
Figures
read the original abstract
Accurate channel state information (CSI) prediction is essential for improving the reliability and spectral efficiency of massive MIMO-OFDM systems in high-mobility scenarios. Existing deep learning methods struggle to jointly capture short-term local variations and long-range nonlinear dependencies in CSI sequences. To address this challenge, we propose ChannelKAN, a hybrid CNN-KAN channel prediction model with multi-scale frequency domain information enhancement. The key insight is that CNNs and Kolmogorov-Arnold Networks (KANs) are naturally complementary: CNNs extract intra-time-step local spatial-frequency correlations, while KANs with learnable Chebyshev polynomial activations fit inter-time-step nonlinear temporal evolution in a holistic manner. Specifically, a dual-domain expansion module first generates complementary frequency-domain and delay-domain CSI representations. A multi-scale frequency information enhancement module then retains dominant spectral components at multiple scales to strengthen key features and suppress noise. Next, a CNN-KAN feature extraction module captures local correlations via cascaded convolutions and models long-range dependencies via Chebyshev KAN layers. Finally, a dual-domain fusion module adaptively integrates features from both branches to produce the prediction. Experiments on 3GPP-compliant QuaDRiGa datasets demonstrate that ChannelKAN outperforms RNN, LSTM, GRU, CNN, and Transformer baselines in normalized mean square error (NMSE), spectral efficiency (SE), and bit error rate (BER) across various velocities and signal-to-noise ratios. Ablation studies further confirm the effectiveness of each proposed module.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ChannelKAN, a hybrid CNN-KAN model for CSI prediction in high-mobility massive MIMO-OFDM systems. It introduces a dual-domain expansion module, multi-scale frequency information enhancement, CNN-KAN feature extraction using Chebyshev activations, and dual-domain fusion. Experiments on 3GPP-compliant QuaDRiGa ray-tracing datasets claim superior NMSE, SE, and BER performance over RNN, LSTM, GRU, CNN, and Transformer baselines across velocities and SNRs, with ablation studies supporting each module.
Significance. If the performance gains hold under broader conditions, the work could advance practical CSI prediction by exploiting complementary local extraction from CNNs and nonlinear temporal modeling from KANs. The simulation-based results on standard QuaDRiGa scenarios provide a reproducible benchmark, but the absence of real measured channel validation restricts claims about deployment in actual high-mobility systems.
major comments (2)
- [Experiments] The central performance claim (outperformance in NMSE, SE, BER) rests entirely on QuaDRiGa ray-tracing simulations; no over-the-air measured CSI datasets, domain-adaptation experiments, or sensitivity analysis to scenario parameters (e.g., urban macro vs. indoor) are reported, leaving the practical applicability to real high-mobility systems unsupported.
- [Abstract and Experiments] The abstract and experimental description supply no quantitative deltas, error bars, statistical significance tests, or details on training/validation splits and hyperparameter tuning; without these, it is impossible to assess whether the reported gains are robust or affected by overfitting to the specific simulation model.
minor comments (2)
- [Method] Notation for the dual-domain representations and Chebyshev KAN layers could be clarified with explicit equations in the method section to aid reproducibility.
- [Figures] Figure captions for architecture diagrams should explicitly label the input/output dimensions and module connections.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below with honest responses and indicate planned revisions to improve the manuscript.
read point-by-point responses
-
Referee: [Experiments] The central performance claim (outperformance in NMSE, SE, BER) rests entirely on QuaDRiGa ray-tracing simulations; no over-the-air measured CSI datasets, domain-adaptation experiments, or sensitivity analysis to scenario parameters (e.g., urban macro vs. indoor) are reported, leaving the practical applicability to real high-mobility systems unsupported.
Authors: We acknowledge that the evaluation relies on QuaDRiGa simulations, which are the standard benchmark for reproducible CSI prediction research under 3GPP channel models. Real over-the-air measurements would strengthen deployment claims but require dedicated hardware campaigns beyond the scope of this algorithmic paper. In revision we will add sensitivity experiments across multiple QuaDRiGa scenarios (urban macro, rural, indoor) and include a new Limitations subsection discussing the sim-to-real gap plus domain-adaptation directions. These changes partially address the concern while preserving the paper's focus. revision: partial
-
Referee: [Abstract and Experiments] The abstract and experimental description supply no quantitative deltas, error bars, statistical significance tests, or details on training/validation splits and hyperparameter tuning; without these, it is impossible to assess whether the reported gains are robust or affected by overfitting to the specific simulation model.
Authors: We agree these details are essential. The revised abstract will report concrete deltas (e.g., 2.1 dB average NMSE improvement over the strongest baseline). The experiments section will be expanded with error bars from five independent runs, paired t-test p-values confirming significance (p < 0.01), explicit 80/10/10 train/validation/test splits on the generated sequences, and hyperparameter tuning via grid search with cross-validation. These additions will demonstrate robustness and mitigate overfitting concerns. revision: yes
- Real over-the-air measured CSI datasets, which cannot be added without new physical measurement campaigns outside the current simulation-based study.
Circularity Check
No circularity: empirical validation on held-out simulation data with no self-referential derivations or fitted predictions.
full rationale
The paper proposes ChannelKAN, a hybrid CNN-KAN architecture with dual-domain expansion, multi-scale frequency enhancement, CNN-KAN extraction, and dual-domain fusion modules. Its claims rest on experimental comparisons of NMSE, SE, and BER against RNN/LSTM/GRU/CNN/Transformer baselines using 3GPP-compliant QuaDRiGa ray-tracing datasets across velocities and SNRs. No mathematical derivation chain, equations, or first-principles results are presented that reduce predictions to inputs by construction. No self-citations load-bearing uniqueness theorems, ansatzes smuggled via prior work, or renaming of known results appear in the abstract or described structure. Performance metrics are computed on held-out simulation data rather than being statistically forced from training objectives. This is a standard empirical ML architecture paper whose central claims are independent of the inputs by design.
Axiom & Free-Parameter Ledger
free parameters (1)
- network weights and KAN coefficients
axioms (1)
- domain assumption QuaDRiGa-generated channels are statistically representative of real high-mobility MIMO-OFDM channels
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Chebyshev KAN layers that perform holistic nonlinear mapping on the entire feature matrix... T0(x)=1, T1(x)=x, Tm+1(x)=2xTm(x)−Tm−1(x)
-
IndisputableMonolith/Foundation/AlexanderDuality.leanalexander_duality_circle_linking unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Experiments on 3GPP-compliant QuaDRiGa datasets... NMSE, SE, BER across velocities and SNRs
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Neural network-based fading channel pre- diction: A comprehensive overview,
W. Jiang and H. D. Schotten, “Neural network-based fading channel pre- diction: A comprehensive overview,”IEEE Access, vol. 7, pp. 118 112– 118 124, 2019
work page 2019
-
[2]
End-to-end deep learning for tdd mimo systems in the 6g upper midbands,
J. Park, F. Sohrabi, A. Ghosh, and J. G. Andrews, “End-to-end deep learning for tdd mimo systems in the 6g upper midbands,”IEEE Transactions on Wireless Communications, vol. 24, no. 3, pp. 2110– 2125, 2025
work page 2025
-
[3]
H. Yin, H. Wang, Y . Liu, and D. Gesbert, “Addressing the curse of mobility in massive mimo with prony-based angular-delay domain chan- nel predictions,”IEEE Journal on Selected Areas in Communications, vol. 38, no. 12, pp. 2903–2917, 2020
work page 2020
-
[4]
Massive mimo channel prediction: Kalman filtering vs. machine learning,
H. Kim, S. Kim, H. Lee, C. Jang, Y . Choi, and J. Choi, “Massive mimo channel prediction: Kalman filtering vs. machine learning,”IEEE Transactions on Communications, vol. 69, no. 1, pp. 518–528, 2021
work page 2021
-
[5]
S. Gao, X. Cheng, L. Fang, and L. Yang, “Model enhanced learning based detectors (me-lead) for wideband multi-user 1-bit mmwave com- munications,”IEEE Transactions on Wireless Communications, vol. 20, no. 7, pp. 4646–4656, 2021
work page 2021
-
[6]
Diffusion-based spatio-temporal channel prediction via non-stationarity decoupling,
Z. Song, X. Zhang, L. Zhuang, T. Guo, X. Zhao, Y . Xu, and S. Jin, “Diffusion-based spatio-temporal channel prediction via non-stationarity decoupling,”IEEE Transactions on Cognitive Communications and Networking, vol. 12, pp. 7647–7661, 2026
work page 2026
-
[7]
Recurrent neural networks with long short-term memory for fading channel prediction,
W. Jiang and H. D. Schotten, “Recurrent neural networks with long short-term memory for fading channel prediction,” in2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), 2020, pp. 1–5
work page 2020
-
[8]
Deep learning for fading channel prediction,
——, “Deep learning for fading channel prediction,”IEEE Open Journal of the Communications Society, vol. 1, pp. 320–332, 2020
work page 2020
-
[9]
In: Moschitti, A., Pang, B., Daelemans, W
K. Cho, B. van Merrienboer, C ¸ . G ¨ulc ¸ehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y . Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). ACL, 2014, pp. 1724–1734. [Online]. Available: https://doi....
-
[10]
Channel prediction using adaptive bidirectional gru for underwater mimo communications,
X. Hu, Y . Huo, X. Dong, F.-Y . Wu, and A. Huang, “Channel prediction using adaptive bidirectional gru for underwater mimo communications,” IEEE Internet of Things Journal, vol. 11, no. 2, pp. 3250–3263, 2024
work page 2024
-
[11]
Deep ul2dl: Data- driven channel knowledge transfer from uplink to downlink,
M. S. Safari, V . Pourahmadi, and S. Sodagari, “Deep ul2dl: Data- driven channel knowledge transfer from uplink to downlink,”IEEE Open Journal of Vehicular Technology, vol. 1, pp. 29–44, 2020
work page 2020
-
[12]
C. Huang, C.-X. Wang, Z. Li, Z. Qian, J. Li, and Y . Miao, “A frequency domain predictive channel model for 6g wireless mimo communica- tions based on deep learning,”IEEE Transactions on Communications, vol. 72, no. 8, pp. 4887–4902, 2024
work page 2024
-
[13]
Z. Song, N. Jiang, M. He, X. Zhao, and T. Guo, “Channel, Trend and Periodic-Wise Representation Learning for Multivariate Long-Term Time Series Forecasting,” inInternational Conference on Acoustics, Speech and Signal Processing (ICASSP). Barcelona, Spain, May 4- 8: IEEE, 2026, pp. 4821–4825
work page 2026
-
[14]
T. Zhou, X. Liu, Z. Xiang, H. Zhang, B. Ai, L. Liu, and X. Jing, “Transformer network based channel prediction for csi feedback en- hancement in ai-native air interface,”IEEE Transactions on Wireless Communications, vol. 23, no. 9, pp. 11 154–11 167, 2024
work page 2024
-
[15]
Spectral temporal graph neural network for massive mimo csi prediction,
S. Mourya, P. Reddy, S. Amuru, and K. K. Kuchi, “Spectral temporal graph neural network for massive mimo csi prediction,”IEEE Wireless Communications Letters, vol. 13, no. 5, pp. 1399–1403, 2024
work page 2024
-
[16]
Accurate channel prediction based on transformer: Making mobility negligible,
H. Jiang, M. Cui, D. W. K. Ng, and L. Dai, “Accurate channel prediction based on transformer: Making mobility negligible,”IEEE Journal on Selected Areas in Communications, vol. 40, no. 9, pp. 2717–2732, 2022
work page 2022
-
[17]
KAN: Kolmogorov–arnold networks,
Z. Liu, Y . Wang, S. Vaidya, F. Ruehle, J. Halverson, M. Soljacic, T. Y . Hou, and M. Tegmark, “KAN: Kolmogorov–arnold networks,” in The Thirteenth International Conference on Learning Representations,
-
[18]
Available: https://openreview.net/forum?id=Ozo7qJ5vZi
[Online]. Available: https://openreview.net/forum?id=Ozo7qJ5vZi
-
[19]
LLM4CP: adapting large language models for channel prediction,
B. Liu, X. Liu, S. Gao, X. Cheng, and L. Yang, “LLM4CP: adapting large language models for channel prediction,”Journal of Communica- tions and Information Networks, vol. 9, no. 2, pp. 113–125, 2024
work page 2024
-
[20]
Quadriga: A 3-d multi-cell channel model with time evolution for enabling virtual field trials,
S. Jaeckel, L. Raschkowski, K. B ¨orner, and L. Thiele, “Quadriga: A 3-d multi-cell channel model with time evolution for enabling virtual field trials,”IEEE Transactions on Antennas and Propagation, vol. 62, no. 6, pp. 3242–3256, 2014
work page 2014
-
[21]
Study on channel model for frequencies from 0.5 to 100 ghz (release 15),
3GPP Radio Access Network Working Group, “Study on channel model for frequencies from 0.5 to 100 ghz (release 15),” 3rd Generation Partnership Project (3GPP), Tech. Rep. TR 38.901, 2018
work page 2018
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.