Deep Learning for CSI Feedback Based on Superimposed Coding

Bin Cai; Chaojin Qing; Chuan Huang; Jiafan Wang; Qingyao Yang

arXiv: 1907.11836 · v1 · pith:YACI6J3Vnew · submitted 2019-07-27 · 💻 cs.NI · cs.IT · cs.LG · math.IT


Pith reviewed 2026-05-24 15:13 UTC · model grok-4.3

classification 💻 cs.NI cs.IT cs.LG math.IT

keywords: CSI feedback, superimposed coding, deep learning, massive MIMO, multi-task neural network, MMSE, FDD


The pith

A multi-task neural network trained at one SNR and power coefficient improves downlink CSI estimation from superimposed signals while maintaining uplink data detection across varying conditions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper combines deep learning with superimposed coding (SC) to reduce the uplink bandwidth used for CSI feedback in massive MIMO FDD systems. It proposes a multi-task NN at the base station that unfolds two MMSE iterations to jointly recover the downlink CSI and the uplink user data sequences (UL-US). Trained subnet-by-subnet at a fixed signal-to-noise ratio (SNR) and power proportional coefficient (PPC), this network is reported to outperform the standalone SC method in CSI accuracy, with comparable or better UL-US detection, even when SNR and PPC change.

Core claim

The authors unfold two iterations of MMSE criterion-based interference reduction into a multi-task neural network and train it subnet-by-subnet, so that the network recovers both downlink CSI and UL-US from the superimposed signal. Trained at one specific SNR and PPC, it is claimed to consistently improve downlink CSI estimation, with similar or better UL-US detection, under varying SNR and PPC compared with the standalone SC-based CSI scheme.

What carries the argument

The multi-task neural network that unfolds two MMSE iterations for interference reduction, allowing joint recovery of CSI and user data.
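The interplay between superposition and two rounds of cancellation can be illustrated with a minimal sketch. It assumes a real-valued toy model that is not taken from the paper: an ideal noiseless uplink, BPSK user symbols, Walsh-Hadamard spreading, and plain despreading with hard decisions standing in for the MMSE steps and the learned subnets. All function names, and the parameter `rho` used here for the PPC, are hypothetical.

```python
import math

def hadamard_rows(n_rows, length):
    """Rows 1..n_rows of a Sylvester Hadamard matrix (length must be a power of 2)."""
    return [[-1.0 if bin(r & c).count("1") % 2 else 1.0 for c in range(length)]
            for r in range(1, n_rows + 1)]

def superimpose(h, d, Q, rho):
    """Spread the CSI vector h with Q and add it on top of the data symbols d.
    rho is the assumed share of transmit power given to the CSI part."""
    n, m = len(h), len(d)
    a = math.sqrt(rho / n)      # amplitude of the spread CSI (assumed scaling)
    b = math.sqrt(1.0 - rho)    # amplitude of the user data
    spread = [sum(h[i] * Q[i][k] for i in range(n)) for k in range(m)]
    return [a * spread[k] + b * d[k] for k in range(m)], a, b

def despread(r, Q, a):
    """Correlate with each spreading row; a simplified stand-in for the MMSE estimate."""
    m = len(r)
    return [sum(r[k] * row[k] for k in range(m)) / (a * m) for row in Q]

def recover(y, Q, a, b, iterations=2):
    """Alternate CSI estimation and data detection, cancelling the other part each time."""
    n, m = len(Q), len(y)
    d_hat = [0.0] * m           # no data estimate before the first pass
    history = []
    for _ in range(iterations):
        # Estimate the CSI after removing the current data estimate.
        h_hat = despread([y[k] - b * d_hat[k] for k in range(m)], Q, a)
        # Remove the reconstructed CSI part, then hard-decide the BPSK symbols.
        resid = [y[k] - a * sum(h_hat[i] * Q[i][k] for i in range(n)) for k in range(m)]
        d_hat = [1.0 if v >= 0 else -1.0 for v in resid]
        history.append((h_hat, d_hat))
    return history

def nmse(h, h_hat):
    """Normalized squared error of one CSI estimate."""
    return sum((x - y) ** 2 for x, y in zip(h, h_hat)) / sum(x * x for x in h)

if __name__ == "__main__":
    Q = hadamard_rows(2, 8)
    h = [0.6, -1.1]                                      # illustrative CSI values
    d = [1.0, 1.0, 1.0, -1.0, 1.0, -1.0, -1.0, 1.0]      # illustrative BPSK symbols
    y, a, b = superimpose(h, d, Q, rho=0.2)
    for step, (h_hat, d_hat) in enumerate(recover(y, Q, a, b), start=1):
        print(step, nmse(h, h_hat), d_hat == d)
```

In this noiseless toy setting, the first CSI estimate carries interference from the data symbols; once the symbols are detected and subtracted, the second estimate no longer does. The paper's network replaces each of these four steps with a trained subnet, which this sketch does not attempt to reproduce.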

If this is right

Reduces the occupation of uplink bandwidth resources for CSI feedback in massive MIMO.
Improves estimation accuracy of downlink CSI without sacrificing uplink user data detection.
Enables the use of superimposed coding with deep learning to handle varying channel conditions without retraining.
Facilitates parameter tuning and faster convergence through subnet-by-subnet training.
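The subnet-by-subnet idea mentioned in the last point can be sketched as a training loop that fits one stage, freezes it, and feeds its outputs to the next stage. This is only a toy under stated assumptions: each "subnet" is a single scalar weight trained by full-batch gradient descent, and the names `train_scalar` and `train_chain` are hypothetical. The paper's subnets are neural networks, and its actual training setup is not described on this page.

```python
def train_scalar(inputs, targets, weight=0.0, lr=0.1, steps=200):
    """Full-batch gradient descent on mean((weight * x - t)^2) for one scalar stage."""
    n = len(inputs)
    for _ in range(steps):
        grad = sum(2.0 * x * (weight * x - t) for x, t in zip(inputs, targets)) / n
        weight -= lr * grad
    return weight

def train_chain(observations, stage_targets, lr=0.1, steps=200):
    """Train stages one after another; earlier stages stay frozen once fitted."""
    weights = []
    inputs = list(observations)
    for targets in stage_targets:
        w = train_scalar(inputs, targets, lr=lr, steps=steps)
        weights.append(w)                    # frozen from here on
        inputs = [w * x for x in inputs]     # outputs become the next stage's inputs
    return weights

if __name__ == "__main__":
    xs = [1.0, -0.5, 0.8, -1.0]              # illustrative observations
    stage1 = [0.5 * x for x in xs]           # illustrative target of the first stage
    stage2 = [1.5 * x for x in xs]           # illustrative target of the second stage
    print(train_chain(xs, [stage1, stage2]))
```

Because each stage has its own loss and a fixed input distribution, its learning rate and step count can be tuned in isolation, which is one plausible reading of why the approach eases parameter tuning.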

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This approach could reduce the need for frequent retraining in dynamic wireless environments.
Similar unfolding techniques might apply to other interference cancellation problems in communications.
The method suggests that learned interference reduction can generalize better than traditional iterative methods across parameter ranges.

Load-bearing premise

That a network unfolding exactly two MMSE iterations trained at one SNR and PPC will generalize to other SNR and PPC values without retraining.

What would settle it

Testing the multi-task NN on a range of SNRs and PPCs different from the training values and checking if CSI estimation error increases or UL-US detection worsens compared to the standalone SC scheme.
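Such a test could be organized as a small comparison harness. The sketch below is an assumption about how one might tabulate it, not the authors' evaluation code: `nmse` and `ber` are the usual metric definitions, and `find_regressions` (a hypothetical helper) flags every (SNR, PPC) operating point where the proposed receiver does worse than the SC baseline. The `tolerance` argument is one possible way to encode "similar" UL-US detection.

```python
def nmse(h, h_hat):
    """Normalized squared error between a true and an estimated CSI vector."""
    return sum((x - y) ** 2 for x, y in zip(h, h_hat)) / sum(x * x for x in h)

def ber(sent, detected):
    """Fraction of symbols detected incorrectly."""
    return sum(1 for s, r in zip(sent, detected) if s != r) / len(sent)

def find_regressions(proposed, baseline, tolerance=0.0):
    """Return operating points where the proposed receiver is worse than the baseline.

    Both arguments map (snr_db, ppc) -> (nmse, ber); lower is better for both metrics.
    Each flagged entry is (point, worse_csi, worse_data).
    """
    flagged = []
    for point in sorted(baseline):
        if point not in proposed:
            continue                         # only compare points measured for both
        p_nmse, p_ber = proposed[point]
        b_nmse, b_ber = baseline[point]
        worse_csi = p_nmse > b_nmse
        worse_data = p_ber > b_ber + tolerance
        if worse_csi or worse_data:
            flagged.append((point, worse_csi, worse_data))
    return flagged
```

An empty result over a grid that excludes the training point would be consistent with the generalization claim; any flagged point would identify where it fails.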

Figures

Figures reproduced from arXiv: 1907.11836 by Bin Cai, Chaojin Qing, Chuan Huang, Jiafan Wang, Qingyao Yang.

**Figure 2.** Network function summary. For ease of description, the four subnets are denoted CSI-NET1, DET-NET1, CSI-NET2, and DET-NET2. CSI-NETi corresponds to the MMSE estimation of downlink CSI (i.e., (3) in the SC-baseline), where i = 1, 2 denotes the first and second iteration, respectively. DET-NET1 and DET-NET2 respectively det…

**Figure 3.** The NMSE of each model (i.e., N = 16, N = 32, and N = 64) outperforms the SC-baseline, especially at high SNR. Although SNR = 5 dB is adopted in the training phase, the three trained network models work well over the entire SNR span from 0 dB to 14 dB. Thus, the designed and trained subnets (i.e., CSI-NET1 and CSI-NET2) have a good generalization ability for …

**Figure 10.** BER versus SNR, where N = 16, M = 512. [PITH_FULL_IMAGE:figures/full_fig_p008_10.png]

**Figure 5.** [PITH_FULL_IMAGE:figures/full_fig_p009_5.png]

**Figure 6.** [PITH_FULL_IMAGE:figures/full_fig_p009_6.png]

**Figure 9.** [PITH_FULL_IMAGE:figures/full_fig_p010_9.png]

Original abstract

Massive multiple-input multiple-output (MIMO) with frequency division duplex (FDD) mode is a promising approach to increasing system capacity and link robustness for the fifth generation (5G) wireless cellular systems. The premise of these advantages is the accurate downlink channel state information (CSI) fed back from user equipment. However, conventional feedback methods have difficulties in reducing feedback overhead due to significant amount of base station (BS) antennas in massive MIMO systems. Recently, deep learning (DL)-based CSI feedback conquers many difficulties, yet still shows insufficiency to decrease the occupation of uplink bandwidth resources. In this paper, to solve this issue, we combine DL and superimposed coding (SC) for CSI feedback, in which the downlink CSI is spread and then superimposed on uplink user data sequences (UL-US) toward the BS. Then, a multi-task neural network (NN) architecture is proposed at BS to recover the downlink CSI and UL-US by unfolding two iterations of the minimum mean-squared error (MMSE) criterion-based interference reduction. In addition, for a network training, a subnet-by-subnet approach is exploited to facilitate the parameter tuning and expedite the convergence rate. Compared with standalone SC-based CSI scheme, our multi-task NN, trained in a specific signal-to-noise ratio (SNR) and power proportional coefficient (PPC), consistently improves the estimation of downlink CSI with similar or better UL-US detection under SNR and PPC varying.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper combines superimposed coding with a two-iteration unfolded-MMSE multi-task NN for joint CSI and uplink data recovery in FDD massive MIMO, but the generalization claim across SNR/PPC rests on an unverified empirical assertion with no supporting numbers or mechanism shown.


The paper's main move is to superimpose downlink CSI feedback onto uplink user sequences via superimposed coding, then recover both at the base station with a multi-task network that unfolds exactly two MMSE iterations for interference reduction. Training happens subnet by subnet to ease tuning. This joint recovery setup looks like the concrete new element relative to the DL-CSI and SC literature they cite. The subnet training is a practical detail that could help convergence in practice.

The goal of cutting uplink feedback overhead while keeping data detection intact is a real systems concern in FDD massive MIMO. If the performance holds, it would matter for capacity in 5G deployments.

The abstract states that the network, trained at one SNR and power proportional coefficient, still improves CSI estimation with comparable or better data detection when those parameters vary. That is the central claim. Nothing in the description supplies an explicit reason the learned canceller should stay effective away from the training point: no input normalization, no parameter embedding, no extra regularization mentioned. The abstract also gives no numerical results, no error bars, no dataset size or channel model details, and no ablation on the two-iteration choice. All quantitative support is missing.

This work is aimed at researchers who already follow DL applications to the physical layer in wireless. A reader looking for new architectures in CSI feedback might pick up the joint-recovery idea, but would need the full experiments to judge whether the robustness actually materializes. I would send it for peer review. The topic is relevant, the architecture is specific enough to evaluate, and referees can check the missing numbers and test the generalization point directly.

Referee Report

2 major / 0 minor

Summary. The paper proposes combining deep learning with superimposed coding for downlink CSI feedback in massive MIMO FDD systems. Downlink CSI is spread and superimposed onto uplink user data sequences; at the BS a multi-task NN recovers both by unfolding exactly two MMSE iterations, trained subnet-by-subnet at one fixed SNR/PPC pair. The central claim is that this yields consistently better downlink CSI NMSE and comparable or superior UL-US detection when SNR and PPC deviate from the training values.

Significance. If the empirical robustness result holds, the approach would demonstrate a practical route to lowering uplink bandwidth consumption for CSI feedback while preserving data detection performance. The subnet-by-subnet training procedure is a concrete implementation detail that could aid reproducibility. The significance remains provisional because the generalization across operating points rests on an unverified architectural assumption rather than an explicit invariance mechanism.

major comments (2)

[Abstract] The claim that the multi-task NN 'consistently improves' downlink CSI estimation under SNR and PPC variation is asserted without any numerical results, error bars, dataset description, or ablation studies; this empirical assertion is load-bearing for the central contribution.
[Abstract] The architecture unfolds exactly two MMSE iterations and is trained at one fixed SNR/PPC pair via subnet-by-subnet training, yet no input normalization, PPC/SNR embedding, or regularization is described that would render the learned canceller independent of the operating point; the generalization claim therefore reduces to an unverified robustness assumption.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive comments. We address the two major comments on the abstract point by point below.


Referee: [Abstract] The claim that the multi-task NN 'consistently improves' downlink CSI estimation under SNR and PPC variation is asserted without any numerical results, error bars, dataset description, or ablation studies; this empirical assertion is load-bearing for the central contribution.

Authors: The abstract summarizes the main empirical findings of the work. The supporting evidence, including NMSE curves for downlink CSI under SNR and PPC sweeps (with direct comparisons to standalone SC), simulation parameters, channel dataset generation details, and performance of the multi-task architecture versus single-task baselines, is contained in Sections IV and V together with the associated figures. These results quantify the consistent improvement and include the operating-point variation tests. We are willing to revise the abstract to include a brief parenthetical reference to these sections if the editor deems it necessary for clarity.
Revision: partial

Referee: [Abstract] The architecture unfolds exactly two MMSE iterations and is trained at one fixed SNR/PPC pair via subnet-by-subnet training, yet no input normalization, PPC/SNR embedding, or regularization is described that would render the learned canceller independent of the operating point; the generalization claim therefore reduces to an unverified robustness assumption.

Authors: The architecture deliberately unfolds a fixed number of MMSE iterations and employs subnet-by-subnet training at a single operating point to obtain stable convergence. No explicit SNR/PPC embedding or additional regularization for invariance is introduced. Nevertheless, the manuscript reports extensive cross-validation experiments (detailed in the results section) in which the same trained network is evaluated at SNR and PPC values different from the training point; these experiments show that downlink CSI NMSE remains superior to standalone SC while UL-US detection stays comparable. The generalization claim is therefore grounded in the reported empirical behavior rather than an architectural invariance guarantee. We do not claim theoretical independence from the operating point.
Revision: no

Circularity Check

0 steps flagged

No significant circularity; empirical DL architecture with no self-referential derivation


The paper proposes an empirical multi-task NN obtained by unfolding two MMSE iterations for joint CSI and UL-US recovery under superimposed coding, trained subnet-by-subnet at one SNR/PPC pair. The abstract and description contain no equations, uniqueness theorems, or self-citations that reduce the claimed generalization or NMSE improvement to a fitted input by construction, a renamed known result, or a load-bearing self-citation chain. Performance claims rest on simulation comparisons rather than a closed mathematical derivation that loops back to its own inputs; the architecture is presented as a trainable approximator, not a self-defining identity.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim rests on the standard MMSE interference-reduction criterion and the assumption that a two-iteration unfolding suffices; no new physical constants or ad-hoc entities are introduced.

free parameters (2)

power proportional coefficient (PPC)
Used both for signal superposition and for training the network; its specific value is chosen for training and affects generalization claims.
training SNR
Network is trained at one specific SNR; performance under varying SNR is asserted but the training value itself is a design choice.

axioms (1)

domain assumption: MMSE criterion provides a suitable interference-reduction step that can be unfolded into neural-network layers
Invoked when the architecture is described as 'unfolding two iterations of the minimum mean-squared error (MMSE) criterion-based interference reduction'.
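For orientation, the kind of step being unfolded can be written for an assumed scalar toy model. The symbols below (a, σ_h², σ_w²) are illustrative and are not the paper's notation or its equation (3); h and w are taken to be zero-mean and uncorrelated.

```latex
y = a\,h + w, \qquad \mathbb{E}[h^2] = \sigma_h^2, \quad \mathbb{E}[w^2] = \sigma_w^2,
\qquad
\hat{h}_{\mathrm{LMMSE}} = \frac{\mathbb{E}[h\,y]}{\mathbb{E}[y^2]}\, y
= \frac{a\,\sigma_h^2}{a^2\sigma_h^2 + \sigma_w^2}\, y
```

The weight depends on both the signal amplitude a and the noise variance, which is why a canceller fitted at one SNR and PPC is not automatically matched to another operating point.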


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relation: unclear
Paper passage: "a multi-task neural network (NN) architecture is proposed at BS to recover the downlink CSI and UL-US by unfolding two iterations of the minimum mean-squared error (MMSE) criterion-based interference reduction"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · relation: unclear
Paper passage: "subnet-by-subnet approach is exploited to facilitate the parameter tuning and expedite the convergence rate"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] [1]

Spatial domain management and massive MIMO coordination in 5G SDN,

S. Sun, B. Rong, R. Q. Hu, and Y . Qian, “Spatial domain management and massive MIMO coordination in 5G SDN,” IEEE Access, vol. 3, pp. 0 2 4 6 8 10 12 14SNR (dB) 10-4 10-3 10-2 10-1 BER SC-baseline (;=0.15)Proposed (;=0.15)SC-baseline (;=0.10)Proposed (;=0.10)SC-baseline (;=0.05)Proposed (;=0.05) FIGURE 10. BER versus SNR, where N = 16 ,M = 512 . 2238–225...

work page 2015

[2] [2]

An efﬁcient CSI feedback scheme for dual-polarized massive MIMO

F. Zheng, Y . Chen, Q. Zhan, J. Zhang, “An efﬁcient CSI feedback scheme for dual-polarized massive MIMO”, IEEE Access , vol. 6, pp. 23420– 23430, Mar. 2018

work page 2018

[3] [3]

Deep learning for massive MIMO CSI feedback

C. Wen, W. Shih, and S. Jin, “Deep learning for massive MIMO CSI feedback”, IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 748–751, Oct. 2018

work page 2018

[4] [4]

Deep learning-based CSI feed- back approach for time-varying massive MIMO channels

T. Wang, C. Wen, S. Jin, and G. Y . Li, “Deep learning-based CSI feed- back approach for time-varying massive MIMO channels”, IEEE Wireless Commun. Lett., to be published. DOI: 10.1109/LWC.2018.2874264

work page doi:10.1109/lwc.2018.2874264 2018

[5] [5]

MIMO channel information feedback using deep recurrent network,

C. Lu, W. Xu, H. Shen, J. Zhu, and K. Wang, “MIMO channel information feedback using deep recurrent network,” IEEE Commun. Lett., vol. 23, no. 1, pp. 188–191, Jan. 2019

work page 2019

[6] [6]

Deep Autoencoder based CSI Feedback with Feedback Errors and Feedback Delay in FDD Massive MIMO Systems,

Y . Jang, G. Kong, M. Jung, S. Choi, and I. Kim, “Deep Autoencoder based CSI Feedback with Feedback Errors and Feedback Delay in FDD Massive MIMO Systems,” IEEE Wireless Commun. Lett ., to be published. DOI: 10.1109/LWC.2019.2895039

work page doi:10.1109/lwc.2019.2895039 2019

[7] [7]

Deep UL2DL: Channel knowledge transfer from uplink to downlink,

M. Safari and V . Pourahmadi, “Deep UL2DL: Channel knowledge transfer from uplink to downlink,” arXiv preprint arXiv: 1812.07518, 2018

work page arXiv 2018

[8] [8]

Enabling FDD Massive MIMO through Deep Learning-based Channel Prediction

M. Arnold, S. Dörner, S. Cammerer, S. Yan, J. Hoydis, and S. Brink, “Enabling FDD Massive MIMO through Deep Learning-based Channel Prediction,” arXiv preprint arXiv:1901.03664, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1901

[9] [9]

Deep-learning-based millimeter- wave massive MIMO for hybrid precoding,

H. Huang, Y . Song, J. Yang, and G. Gui, “Deep-learning-based millimeter- wave massive MIMO for hybrid precoding,” IEEE Trans. Veh. Technol., vol. 68, no. 3, pp. 3027–3032, Mar. 2019

work page 2019

[10] [10]

Enhanced CSI acquisition for FDD multi-user massive MIMO systems,

F. Zhang, S. Sun, Q. Gao, W. Tang, “Enhanced CSI acquisition for FDD multi-user massive MIMO systems,” IEEE Access , vol. 6, pp. 23034– 23042, Apr. 2018

work page 2018

[11] [11]

Massive-MIMO Enabled FDD Wireless Backhaul Small-Cell Relay Networks: AF Protocol Based Designs With Low Channel Estima- tion and Feedback Complexity,

C. Song, “Massive-MIMO Enabled FDD Wireless Backhaul Small-Cell Relay Networks: AF Protocol Based Designs With Low Channel Estima- tion and Feedback Complexity,” IEEE Access., vol. 6,pp. 31050–31064, Jun. 2018

work page 2018

[12] [12]

Compressive sensing-based differential channel feedback for massive MIMO,

W. Shen, L. Dai, Y . Shi, X. Zhu, Z. Wang, “Compressive sensing-based differential channel feedback for massive MIMO,” Electron Lett., vol. 51, no. 22, pp. 1824–1826, Oct. 2015

work page 2015

[13] [13]

Spatially common sparsity based adaptive channel estimation and feedback for FDD massive MIMO,

Z. Gao, L. Dai, Z. Wang, S. Chen, “Spatially common sparsity based adaptive channel estimation and feedback for FDD massive MIMO,”IEEE Trans. Signal Process., vol. 63, no. 23, pp. 6169–6183, Dec. 2015

work page 2015

[14] P. Kuo, H. Kung, and P. Ting, “Compressive sensing based channel feedback protocols for spatially-correlated massive antenna arrays,” in Proc. IEEE Int. Conf. Wireless Commun. Networking (WCNC), Shanghai, China, Apr. 2012, pp. 492–497.

[15] P. Cheng and Z. Chen, “Multidimensional compressive sensing based analog CSI feedback for massive MIMO-OFDM systems,” in Proc. Veh. Technol. Conf. (VTC)-Fall 2014, Vancouver, Canada, Sept. 2014, pp. 1–6.

[16] X. Rao and V. Lau, “Distributed compressive CSIT estimation and feedback for FDD multi-user massive MIMO systems,” IEEE Trans. Signal Process., vol. 62, no. 12, pp. 3261–3271, Jun. 2014.

[17] Y. Wang, M. Liu, J. Yang, and G. Gui, “Data-Driven Deep Learning for Automatic Modulation Recognition in Cognitive Radios,” IEEE Trans. Veh. Technol., vol. 68, no. 4, pp. 4074–4077, Apr. 2019.

[18] T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Trans. Cogn. Commun. Netw., vol. 3, no. 4, pp. 563–575, Dec. 2017.

[19] G. Gui, Y. Wang, and H. Huang, “Deep learning based physical layer wireless communication techniques: Opportunities and challenges,” Journal of Communications, vol. 40, no. 2, pp. 19–23, Feb. 2019.

[20] Z. Qin, H. Ye, G. Y. Li, and B. Juang, “Deep learning in physical layer communications,” arXiv preprint arXiv:1807.11713, 2018.

[21] G. Gui, H. Huang, Y. Song, and H. Sari, “Deep learning for an effective non-orthogonal multiple access scheme,” IEEE Trans. Veh. Technol., vol. 67, no. 9, pp. 8440–8450, Sept. 2018.

[22] D. Xu, Y. Huang, and L. Yang, “Feedback of Downlink Channel State Information Based on Superimposed Coding,” IEEE Commun. Lett., vol. 11, no. 3, pp. 240–242, Mar. 2007.

[23] J. Hershey, J. Roux, and F. Weninger, “Deep unfolding: Model-based inspiration of novel deep architectures,” arXiv preprint arXiv:1409.2574, 2014.

[24] N. Samuel, T. Diskin, and A. Wiesel, “Learning to detect,” IEEE Trans. Signal Process., vol. 67, no. 10, pp. 2554–2564, May 2019.

[25] S. Takabe, M. Imanishi, T. Wadayama, and K. Hayashi, “Trainable Projected Gradient Detector for Massive Overloaded MIMO Channels: Data-driven Tuning Approach,” arXiv preprint arXiv:1812.10044, 2018.

[26] A. Elad, D. Haviv, Y. Blau, and T. Michaeli, “The effectiveness of layer-by-layer training using the information bottleneck principle,” submitted to ICLR 2019, 2019. [Online]. Available: https://openreview.net/pdf?id=r1Nb5i05tX

[27] R. Caruana, “Multitask Learning,” Machine Learning, vol. 28, no. 1, pp. 41–75, 1997.

[28] X. Gao, S. Jin, C. Wen, and G. Y. Li, “ComNet: Combination of deep learning and expert knowledge in OFDM receivers,” IEEE Commun. Lett., pp. 2627–2630, Dec. 2018.

[29] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in Proc. 32nd Int. Conf. Mach. Learn., 2015, pp. 448–456.

[30] P. Ramachandran, B. Zoph, and Q. Le, “Searching for activation functions,” arXiv preprint arXiv:1710.05941, 2017.

[31] S. Eger, P. Youssef, and I. Gurevych, “Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks,” arXiv preprint arXiv:1901.02671, 2019.

[32] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proc. 13th Int. Conf. Artif. Intell. Statist., 2010, vol. 9, pp. 249–256.

[33] O. Sener and V. Koltun, “Multi-task learning as multi-objective optimization,” in Proc. Adv. Neural Inf. Process. Syst., 2018, pp. 525–536.

[34] Z. Chen, V. Badrinarayanan, C. Lee, and A. Rabinovich, “Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks,” in Proc. Int. Conf. Mach. Learn., 2018, pp. 793–802.

[35] D. Arpit and Y. Bengio, “The Benefits of Over-parameterization at Initialization in Deep ReLU Networks,” arXiv preprint arXiv:1901.03611, 2019.

[36] J. Zhang, C. Wen, S. Jin, and G. Y. Li, “Artificial Intelligence-aided Receiver for A CP-Free OFDM System: Design, Simulation, and Experimental Test,” arXiv preprint arXiv:1903.04766, 2019.

[37] H. Ye, G. Y. Li, and B. Juang, “Power of deep learning for channel estimation and signal detection in OFDM systems,” IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114–117, Feb. 2018.

[38] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.

[39] V. Raj and S. Kalyani, “Backpropagating through the air: Deep learning at physical layer without channel models,” IEEE Commun. Lett., vol. 22, no. 11, pp. 2278–2281, Nov. 2018.

[40] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA: MIT Press, 2016.

CHAOJIN QING (M’15) received the B.S. degree in communication engineering from Chengdu University of Information Technology, Chengdu, China, in 2001, the M.S. and Ph.D. degrees in communications and information systems from the University of Electronic Science and Te...
