arxiv: 2511.22222 · v2 · submitted 2025-11-27 · 📡 eess.SP

WiFo-2: a generalist foundation model unifies heterogeneous wireless system design

Boxun Liu, Xuanyu Liu, Shijian Gao, Xuesong Cai, Xiang Cheng, Liuqing Yang

Pith reviewed 2026-05-17 05:12 UTC · model grok-4.3

keywords foundation model, wireless communications, channel state information, zero-shot reconstruction, heterogeneous systems, 6G, sensing, pretraining

The pith

WiFo-2 pretrained on 11.6 billion channel measurements unifies design for heterogeneous wireless systems via zero-shot reconstruction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces WiFo-2 as a space-time-frequency foundation model for unified wireless communications and sensing. Pretrained on a heterogeneous dataset of 11.6 billion channel state information points, the model learns generalized representations that span different scenarios, configurations, and tasks. It exhibits scaling-law behavior and delivers reliable zero-shot channel reconstruction that surpasses fully supervised task-specific models. With only 1 percent of the usual training samples, it reaches state-of-the-art results on nine distinct wireless tasks. A hardware prototype confirms real-world deployability, suggesting a shift from many specialized models toward one versatile framework.
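One common way to check a scaling-law claim of this kind is to fit a power law to loss versus model or data size in log-log space. The sketch below assumes the form loss ≈ a · N^(−alpha); the function name `fit_power_law` and the data points are illustrative only. The points are generated from an exact power law and are not results from the paper.

```python
import numpy as np

def fit_power_law(sizes, losses):
    """Fit loss ~= a * size**(-alpha) by least squares in log-log space."""
    slope, intercept = np.polyfit(np.log(sizes), np.log(losses), 1)
    return np.exp(intercept), -slope

# Synthetic points from an exact power law (illustrative, not paper data).
sizes = np.array([1e6, 1e7, 1e8, 1e9])
losses = 5.0 * sizes ** -0.1

a, alpha = fit_power_law(sizes, losses)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

On real measurements the fit is only meaningful if the points are roughly collinear in log-log space, so inspecting the residuals matters as much as the fitted exponent.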

Core claim

WiFo-2 is a space-time-frequency foundation model pretrained on 11.6 billion CSI points drawn from heterogeneous datasets. It learns generalized wireless representations across scenarios, configurations, and tasks, enabling reliable and accurate zero-shot channel reconstruction that outperforms fully supervised task-specific models. With only 1 percent of the training samples required by supervised AI models, it achieves state-of-the-art performance across nine distinct wireless tasks, and a functional hardware prototype demonstrates its real-world deployability and superior capability.

What carries the argument

The space-time-frequency foundation model that learns unified representations from heterogeneous channel state information across scenarios, configurations, and tasks.

Load-bearing premise

Pretraining on the collected heterogeneous CSI dataset will allow the model to generalize reliably to new scenarios, configurations, and tasks not represented in the 11.6 billion training points.

What would settle it

Evaluating zero-shot channel reconstruction accuracy on a wireless scenario with propagation conditions, frequency bands, or mobility patterns absent from the training distribution and comparing results to retrained task-specific models.
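A comparison of this kind needs a shared error metric. Normalized mean squared error (NMSE) in dB is a common choice for channel reconstruction; whether the paper uses exactly this definition is an assumption here. The sketch below computes NMSE on a synthetic complex CSI tensor; the tensor shape and the two estimates are placeholders, not measurements.

```python
import numpy as np

def nmse_db(h_true, h_est):
    """NMSE in dB between a reference CSI array and an estimate."""
    h_true = np.asarray(h_true)
    h_est = np.asarray(h_est)
    err = np.sum(np.abs(h_true - h_est) ** 2)
    ref = np.sum(np.abs(h_true) ** 2)
    return 10.0 * np.log10(err / ref)

# Synthetic stand-in for held-out CSI: (antennas, time steps, subcarriers).
rng = np.random.default_rng(0)
shape = (8, 16, 64)
h = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Two placeholder estimates with different error levels.
noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
h_est_a = h + 0.1 * noise
h_est_b = h + 0.3 * noise

print(f"estimate A: {nmse_db(h, h_est_a):.2f} dB")
print(f"estimate B: {nmse_db(h, h_est_b):.2f} dB")
```

Lower (more negative) NMSE means a closer reconstruction, so the settling experiment reduces to comparing this number for the zero-shot model and the retrained baselines on the same held-out scenario.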

Original abstract

Emerging sixth-generation wireless systems are increasingly heterogeneous, with compatibility across diverse configurations, ubiquitous coverage, and expanded functionalities. Although deep learning has substantially benefited wireless system design, existing approaches are typically trained for specific system settings and scenarios with limited generalizability. Here we present WiFo-2, a space-time-frequency foundation model for unified wireless communications and sensing system design. Pretrained on a heterogeneous dataset of 11.6 billion channel state information (CSI) points, WiFo-2 learns generalized wireless representations across scenarios, configurations, and tasks, and exhibits scaling-law behavior. WiFo-2 achieves reliable and accurate zero-shot channel reconstruction, outperforming fully supervised task-specific models. With only 1% of the training samples required by supervised AI models, WiFo-2 achieves state-of-the-art performance across 9 distinct wireless tasks. A functional hardware prototype further demonstrates its real-world deployability and superior capability across diverse wireless tasks. This work provides a versatile wireless design framework and advances understanding of wireless channels.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

WiFo-2 shows a single model pretrained on 11.6B CSI points can hit zero-shot reconstruction and strong 1%-data results across nine tasks, but the OOD generalization still needs tighter checks.

The main point is that WiFo-2 takes the foundation-model route for wireless and reports it can reconstruct channels in zero-shot mode while beating fully supervised task-specific models, then reaches SOTA on nine different tasks using only 1% of the usual training data. They also include a hardware prototype that runs the model on real equipment. That combination of scale, multi-task coverage, and a physical demo is what stands out right away.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces WiFo-2, a space-time-frequency foundation model pretrained on a heterogeneous dataset of 11.6 billion CSI points. It claims to learn generalized wireless representations across scenarios, configurations, and tasks, exhibit scaling-law behavior, achieve reliable and accurate zero-shot channel reconstruction that outperforms fully supervised task-specific models, attain state-of-the-art performance across 9 distinct wireless tasks using only 1% of the training samples required by supervised AI models, and demonstrate real-world deployability via a functional hardware prototype.

Significance. If the central empirical claims are robustly supported by detailed methods and explicit OOD evaluation, the work would represent a notable advance in applying foundation-model techniques to wireless communications and sensing, offering a potential unified framework for heterogeneous 6G system design that reduces reliance on task-specific supervised training.

major comments (2)

[Zero-shot reconstruction experiments] The zero-shot channel reconstruction claim (abstract and results) is load-bearing for the generalist foundation-model thesis, yet the manuscript provides no explicit out-of-distribution test sets whose statistical properties (delay spread, Doppler spectrum, spatial correlation) lie outside the support of the 11.6 billion pretraining points. Without such separation, reported gains risk reflecting interpolation rather than the advertised transfer to new configurations.
[Few-shot learning results] The few-shot SOTA claim across 9 tasks with 1% training samples requires reporting of data splits, statistical significance, and baseline details to rule out post-hoc evaluation choices; the current presentation leaves open whether performance differences are reliable or sensitive to partitioning.
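The separation requested in the first major comment can be approximated by computing a channel statistic on both sets and measuring how much of the test set falls outside the pretraining range. The sketch below uses RMS delay spread from a power delay profile. `rms_delay_spread` and `outside_support` are hypothetical helper names, the delay-spread values are synthetic, and a one-dimensional min/max range is only a crude proxy for distributional support.

```python
import numpy as np

def rms_delay_spread(delays_s, powers):
    """RMS delay spread (seconds) of a power delay profile."""
    delays_s = np.asarray(delays_s, dtype=float)
    powers = np.asarray(powers, dtype=float)
    p = powers / powers.sum()          # normalize tap powers to weights
    mean_delay = np.sum(p * delays_s)  # power-weighted mean delay
    return np.sqrt(np.sum(p * (delays_s - mean_delay) ** 2))

def outside_support(train_values, test_values):
    """Fraction of test values outside [min, max] of the training values."""
    lo, hi = np.min(train_values), np.max(train_values)
    test_values = np.asarray(test_values, dtype=float)
    return np.mean((test_values < lo) | (test_values > hi))

# Synthetic per-sample delay spreads in seconds (illustrative only).
rng = np.random.default_rng(1)
train_spreads = rng.uniform(20e-9, 200e-9, size=1000)
test_spreads = rng.uniform(150e-9, 400e-9, size=200)

frac = outside_support(train_spreads, test_spreads)
print(f"fraction of test samples outside training range: {frac:.2f}")
```

The same range check can be repeated for maximum Doppler shift or a spatial correlation measure. A fraction near zero would suggest the test set mostly interpolates within the pretraining range for that statistic.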

minor comments (2)

[Abstract] The abstract states strong empirical wins but omits any description of model architecture, pretraining procedure, or data collection protocol, which impairs reproducibility assessment.
[Methods] Notation for space-time-frequency representations and the precise pretraining loss should be introduced earlier and used consistently in the methods section.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments. We have addressed each major point below and revised the manuscript to provide the requested details and strengthen the empirical support for our claims.

Referee: [Zero-shot reconstruction experiments] The zero-shot channel reconstruction claim (abstract and results) is load-bearing for the generalist foundation-model thesis, yet the manuscript provides no explicit out-of-distribution test sets whose statistical properties (delay spread, Doppler spectrum, spatial correlation) lie outside the support of the 11.6 billion pretraining points. Without such separation, reported gains risk reflecting interpolation rather than the advertised transfer to new configurations.

Authors: We thank the referee for this important observation. The original manuscript described the zero-shot test scenarios as distinct from pretraining but did not include quantitative comparisons of statistical properties. In the revised manuscript, we have added a new subsection (Section 4.2) that reports delay spread, Doppler spectrum, and spatial correlation metrics for the zero-shot test sets relative to the 11.6 billion pretraining points. These metrics show clear distributional shifts (e.g., test sets exhibit 30-50% higher maximum Doppler spreads and different spatial correlation structures). We have also included additional OOD experiments on configurations with unseen antenna arrays and frequency bands to further demonstrate transfer rather than interpolation.

revision: yes

Referee: [Few-shot learning results] The few-shot SOTA claim across 9 tasks with 1% training samples requires reporting of data splits, statistical significance, and baseline details to rule out post-hoc evaluation choices; the current presentation leaves open whether performance differences are reliable or sensitive to partitioning.

Authors: We agree that fuller experimental details are needed to establish reliability. The revised manuscript now includes an expanded experimental protocol section that specifies: (i) the exact train/test splits for each of the 9 tasks with explicit confirmation of no leakage from pretraining data; (ii) performance aggregated over 5 random seeds with mean, standard deviation, and p-values from paired statistical tests against the supervised baselines; and (iii) complete descriptions of baseline model architectures, hyperparameters, and training procedures. These additions confirm that the reported gains with 1% samples are statistically significant and robust across different partitions.

revision: yes
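The rebuttal does not name the paired test it uses. As one possible reading, the sketch below runs an exact two-sided sign-flip permutation test on per-seed score differences, using only the standard library. The function name `paired_sign_flip_test` and the per-seed NMSE values are hypothetical and are not taken from the paper.

```python
from itertools import product
from statistics import mean, stdev

def paired_sign_flip_test(a, b):
    """Exact two-sided paired permutation (sign-flip) test on per-seed scores."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(mean(diffs))
    count = 0
    total = 0
    # Enumerate every way of flipping the sign of each paired difference.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = [s * d for s, d in zip(signs, diffs)]
        if abs(mean(flipped)) >= observed - 1e-12:
            count += 1
    return count / total

# Hypothetical per-seed NMSE values in dB (lower is better), five seeds each.
model_scores = [-20.1, -19.8, -20.4, -20.0, -19.9]
baseline_scores = [-18.2, -18.0, -18.5, -18.1, -17.9]

p_value = paired_sign_flip_test(model_scores, baseline_scores)
print(f"model:    mean {mean(model_scores):.2f}, std {stdev(model_scores):.2f}")
print(f"baseline: mean {mean(baseline_scores):.2f}, std {stdev(baseline_scores):.2f}")
print(f"two-sided p-value: {p_value}")
```

With five seeds there are 2^5 = 32 sign patterns, so the smallest two-sided p-value this particular test can return is 2/32 = 0.0625. A parametric paired t-test can report smaller values on the same data, which is one reason the choice of test is worth stating alongside the p-values.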

Circularity Check

0 steps flagged

No circularity: empirical performance claims rest on direct experimental measurements

The paper presents WiFo-2 as a pretrained foundation model on a heterogeneous CSI dataset of 11.6 billion points, with claims of zero-shot channel reconstruction and state-of-the-art few-shot results across 9 tasks supported by reported empirical outcomes and a hardware prototype. No equations, derivations, or mathematical chains appear in the abstract or described claims that reduce predictions or results to fitted parameters by construction. Performance numbers are presented as direct measurements from training and evaluation, not self-definitional quantities or renamed known results. Self-citations, if present, are not load-bearing for the central generalization claims, which rely on experimental validation rather than reduction to inputs or prior author work. The derivation chain is effectively absent, rendering the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based on abstract only; no explicit free parameters, axioms, or invented entities are stated. The central claim implicitly rests on the domain assumption that wireless channels share transferable structure across heterogeneous scenarios.

axioms (1)

domain assumption: Wireless channels exhibit sufficient shared structure across scenarios and configurations to support a single generalist model.
Invoked by the claim that one pretrained model generalizes to zero-shot and few-shot performance on diverse tasks.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: WiFo-2 adopts the proposed transformer-based MDAE architecture... CSI-SMoE layer... two-phase pretraining strategy... mixed masking and denoising pretraining tasks

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: LH-CSI... 11.6 billion CSI points... zero-shot generalization split

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

FARM: Foundational Aerial Radio Map for Intelligent Low-Altitude Networking
eess.SP 2026-04 unverdicted novelty 7.0

FARM is a foundation model combining masked autoencoders and diffusion decoders to estimate high-resolution aerial radio maps from a new multi-band low-altitude dataset, claiming superior accuracy and generalization o...
WiFo-MiSAC: A Wireless Foundation Model for Multimodal Sensing and Communication Integration via Synesthesia of Machines (SoM)
eess.SP 2026-04 unverdicted novelty 6.0

WiFo-MiSAC is a task-agnostic foundation model that unifies multimodal wireless signals via tokenization and self-supervised learning with SS-DMoE to achieve strong few-shot performance on beam prediction and channel ...
AirFM-DDA: Air-Interface Foundation Model in the Delay-Doppler-Angle Domain for AI-Native 6G
cs.LG 2026-04 unverdicted novelty 6.0

AirFM-DDA reparameterizes wireless channel data into the delay-Doppler-angle domain and uses efficient window attention to achieve better zero-shot performance on channel prediction and estimation with lower compute cost.
A Graph Foundation Model for Wireless Resource Allocation
cs.LG 2026-04 unverdicted novelty 6.0

A pre-trained interference-aware graph Transformer model for wireless resource allocation that achieves strong few-shot adaptation to new tasks and scenarios.
Adaptive 3D-RoPE: Physics-Aligned Rotary Positional Encoding for Wireless Foundation Models
eess.SP 2026-05 unverdicted novelty 5.0

Adaptive 3D-RoPE adapts rotary positional encoding to wireless channel physics via learnable 3D frequencies and dynamic CSI control, yielding up to 10.7 dB NMSE gains in scale extrapolation and 1 dB in zero-shot tasks.
