arxiv: 1907.09225 · v1 · pith:53EPD7PYnew · submitted 2019-07-22 · 💻 cs.IT · eess.SP · math.IT

Deep Learning Assisted Sum-Product Detection Algorithm for Faster-than-Nyquist Signaling

Pith reviewed 2026-05-24 18:14 UTC · model grok-4.3

classification 💻 cs.IT · eess.SP · math.IT
keywords faster-than-Nyquist signaling · sum-product algorithm · deep learning · intersymbol interference · detection algorithm · Turbo equalization · factor graph

The pith

A neural network attached to the factor graph lets faster-than-Nyquist detectors handle residual intersymbol interference and reach lower bit error rates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a modified sum-product detection algorithm for faster-than-Nyquist signaling that inserts a neural network as an additional function node on the factor graph. This node is trained to capture the interference components that a conventional detector leaves out when it models only a limited number of taps. The update rules are adjusted so the overall detector remains compatible with iterative Turbo equalization. A lightweight convolutional network is used, which requires only a small training set. The resulting detector is reported to improve error performance by as much as 2.5 dB over the standard algorithm under the same interference conditions.

Core claim

The deep learning assisted sum-product detection algorithm concatenates a neural network to the variable nodes of the conventional factor graph so that the network functions as an extra node that mitigates residual intersymbol interference; the message-passing rules are revised to support Turbo equalization, and simulations show a gain of up to 2.5 dB at the same bit error rate.

What carries the argument

Neural network function node inserted into the factor graph of the FTN system, which processes the unmodeled residual ISI while the rest of the graph performs standard sum-product updates.
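
To make the role of the extra node concrete, here is a minimal Python sketch of a binary variable-node update in the log-likelihood-ratio (LLR) domain, where a learned node contributes one more incoming message. This is not the paper's revised update rule, which this summary does not spell out. The function name `variable_node_update` and the argument `nn_llr` are hypothetical, and the additive combination is simply the standard sum-product rule for binary variables applied to one additional neighbor.

```python
def variable_node_update(prior_llr, factor_llrs, nn_llr=0.0):
    """Return (belief, outgoing) for one binary variable node.

    prior_llr   : a priori LLR of the symbol (e.g. fed back by a decoder)
    factor_llrs : LLR messages arriving from the conventional ISI factor nodes
    nn_llr      : LLR message from the hypothetical learned function node
    """
    # In the LLR domain the product of incoming messages becomes a sum.
    belief = prior_llr + sum(factor_llrs) + nn_llr
    # The message returned to factor j excludes what factor j itself sent.
    outgoing = [belief - m for m in factor_llrs]
    return belief, outgoing


belief, outgoing = variable_node_update(0.5, [1.0, -0.25, 2.0], nn_llr=0.75)
print(belief)    # 4.0
print(outgoing)  # [3.0, 4.25, 2.0]
```

Setting `nn_llr=0.0` recovers the conventional update, which is one way to read the claim that the rest of the graph keeps performing standard sum-product updates.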

If this is right

  • The detector converges to better approximations of the a posteriori probabilities given the received sequence.
  • The algorithm remains compatible with iterative Turbo equalization after the update-rule modification.
  • Only a small number of training batches is needed for the simplified convolutional network to deliver the reported gain.
  • The same performance advantage appears across different ISI responses when the network is trained accordingly.
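
The first two points rely on the usual Turbo-equalization bookkeeping, which can be sketched in a few lines of Python. The code converts a detector's a posteriori LLR into a probability and forms the extrinsic LLR that would be handed to the channel decoder. The function names and the sign convention (a positive LLR favors bit 0) are assumptions made for illustration; the paper's own message definitions are not reproduced here.

```python
import math


def llr_to_prob0(llr):
    """P(bit = 0 | y), for llr = log(P(bit=0|y) / P(bit=1|y))."""
    # Numerically stable logistic function.
    if llr >= 0:
        return 1.0 / (1.0 + math.exp(-llr))
    e = math.exp(llr)
    return e / (1.0 + e)


def extrinsic_llr(posterior_llr, prior_llr):
    """Extrinsic information: what the detector adds beyond the prior it was given."""
    return posterior_llr - prior_llr


print(llr_to_prob0(0.0))        # 0.5
print(extrinsic_llr(3.0, 1.0))  # 2.0
```

Passing only the extrinsic part between detector and decoder keeps each block from being fed back its own earlier output, which is the property any modified update rule has to preserve to stay compatible with iterative equalization.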

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same insertion technique could be tried on other message-passing detectors whose graphs truncate long interference tails.
  • Hardware implementations might trade some neural-network arithmetic for simpler channel models with fewer taps.
  • The approach suggests a general pattern: attach a small learned correction node wherever a factor graph deliberately omits part of the true channel memory.

Load-bearing premise

The neural network can be trained to accurately represent and mitigate the residual intersymbol interference that remains after the conventional detector models only a limited number of ISI taps.
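
How much interference a truncated model leaves out can be estimated from the pulse shape alone. The Python sketch below assumes a root-raised-cosine pulse with matched filtering, so that the ISI taps are samples of a raised-cosine pulse at multiples of τT, and it borrows τ = 0.6 and α = 0.3 from the Figure 5 caption. The function names, the horizon `k_max`, the modeled tap counts 2 and 6, and the energy-based residual measure are illustrative choices, not definitions taken from the paper.

```python
import math


def raised_cosine(t, alpha):
    """Raised-cosine pulse with symbol period T = 1 and roll-off alpha."""
    if t == 0.0:
        return 1.0
    sinc = math.sin(math.pi * t) / (math.pi * t)
    denom = 1.0 - (2.0 * alpha * t) ** 2
    if abs(denom) < 1e-12:
        # Limit value at t = +/- 1/(2*alpha), where the formula is 0/0.
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(math.pi * alpha * t) / denom


def isi_taps(tau, alpha, k_max):
    """One-sided taps g[k] = rc(k * tau) for k = 0..k_max."""
    return [raised_cosine(k * tau, alpha) for k in range(k_max + 1)]


def residual_energy(taps, modeled):
    """Energy in the taps beyond index `modeled`, counting both sides by symmetry."""
    return 2.0 * sum(g * g for g in taps[modeled + 1:])


taps = isi_taps(tau=0.6, alpha=0.3, k_max=50)
for modeled in (2, 6):
    print(modeled, residual_energy(taps, modeled))
```

Under these assumptions the unmodeled energy shrinks as more taps are modeled but does not vanish, which is the gap the learned node is supposed to cover.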

What would settle it

A direct comparison test in which the DL-SPA and the conventional sum-product algorithm are run on identical FTN signals with the same limited ISI tap model; if the bit error rate curves show no improvement or a loss for the DL-SPA, the central performance claim is falsified.
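
A gain "at the same bit error rate" is read off as the horizontal distance between two BER curves. The Python sketch below shows one way to compute that distance by interpolating linearly in log10(BER). The function names are hypothetical, and the two curves are placeholders constructed to differ by exactly 2 dB; they are not data from the paper.

```python
import math


def snr_at_ber(curve, target_ber):
    """Interpolate the SNR (dB) at which a BER curve crosses target_ber.

    curve: list of (snr_db, ber) pairs sorted by increasing snr_db,
           with BER decreasing along the list.
    Interpolation is linear in log10(BER).
    """
    t = math.log10(target_ber)
    for (s0, b0), (s1, b1) in zip(curve, curve[1:]):
        l0, l1 = math.log10(b0), math.log10(b1)
        if l1 <= t <= l0:
            if l0 == l1:
                return s0
            return s0 + (s1 - s0) * (l0 - t) / (l0 - l1)
    raise ValueError("target BER is outside the measured range")


def gain_db(baseline, proposed, target_ber):
    """SNR saving of `proposed` over `baseline` at the same BER."""
    return snr_at_ber(baseline, target_ber) - snr_at_ber(proposed, target_ber)


# Placeholder curves, NOT the paper's data: the baseline is the proposed curve shifted by 2 dB.
proposed = [(2, 1e-1), (4, 1e-2), (6, 1e-3), (8, 1e-4)]
baseline = [(4, 1e-1), (6, 1e-2), (8, 1e-3), (10, 1e-4)]
print(gain_db(baseline, proposed, 1e-3))
```

A non-positive result from `gain_db` on real measurements, at the BER levels of interest, would correspond to the falsifying outcome described above.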

Figures

Figures reproduced from arXiv: 1907.09225 by Bryan Liu, Jinhong Yuan, Shuangyang Li, Yixuan Xie.

Figure 2: Message updating in the factor graph of SPDA.
Figure 3: Deep learning assisted sum-product algorithm with …
Figure 4: Simplified CNN function node Φ(x_1, ..., x_N).
Excerpt from the surrounding text: … filters. The filters convolve and stride over the input. The max-pooling layer performs downsampling to reduce the spatial size of the convolved features. A dense layer is appended after the max-pooling layer to provide possibly nonlinear function [14]. In [15], a pure CNN based detection algorithm was proposed, where both max-pooling layers and dense layer are…
Figure 5: The BER of FTN signaling with τ = 0.6, α = 0.3, CC(7,5). Axes: Eb/N0 (dB), ticks from 2 to 9, versus BER (bit error rate), ticks from 10^-6 to 10^0. Curves: DLSPA(15, 2), SPDA(15, 2), SPDA(15, 6), BCJR(15, 2), BCJR(15, 3).
Figure 6: The BER of FTN signaling with τ = 0.5, α = 0.3, CC(7,5).
Figure 7: Normalized average loss for every $10^{3}$ batches of training samples and the absolute change of loss for every $5 \times 10^{3}$ batches.
Excerpt from the surrounding text: … as the average loss from $(a-1) \times 5 \times 10^{3}$ to $a \times 5 \times 10^{3}$ batches ($\xi^{0}_{\mathrm{avg}} = 1$) and $\xi^{a}_{\mathrm{cg}} = |(\xi^{a}_{\mathrm{avg}} - \xi^{a-1}_{\mathrm{avg}})/\xi^{a-1}_{\mathrm{avg}}|$ as the percentage of the absolute change on the average loss ($\xi^{0}_{\mathrm{cg}} = 0$), where $a \in \mathbb{Z}$. Define that a stable performance of the training is reached after $a \times 5 \times 10^{3}$ …
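
The two quantities defined in the Figure 7 caption, the windowed average loss and its relative change from one window to the next, can be computed as in the Python sketch below. The function name and the toy loss trace are made up for illustration. The excerpt does not say how the loss is normalized, and the stability criterion is cut off in the source, so neither is implemented here.

```python
def windowed_loss_stats(batch_losses, window=5000):
    """Return (xi_avg, xi_cg) lists indexed by a = 0, 1, 2, ...

    xi_avg[a] : mean loss over batches (a-1)*window .. a*window, with xi_avg[0] = 1
    xi_cg[a]  : |(xi_avg[a] - xi_avg[a-1]) / xi_avg[a-1]|, with xi_cg[0] = 0
    """
    xi_avg, xi_cg = [1.0], [0.0]
    n_windows = len(batch_losses) // window  # a trailing partial window is ignored
    for a in range(1, n_windows + 1):
        chunk = batch_losses[(a - 1) * window : a * window]
        avg = sum(chunk) / window
        xi_cg.append(abs((avg - xi_avg[-1]) / xi_avg[-1]))
        xi_avg.append(avg)
    return xi_avg, xi_cg


# Toy loss trace (made up): two windows of 4 batches each.
xi_avg, xi_cg = windowed_loss_stats(
    [0.5, 0.5, 0.5, 0.5, 0.25, 0.25, 0.25, 0.25], window=4
)
print(xi_avg)  # [1.0, 0.5, 0.25]
print(xi_cg)   # [0.0, 0.5, 0.5]
```

The default `window=5000` mirrors the 5 × 10^3 batch spacing in the caption; a window whose average loss is exactly zero would make the next relative change undefined, which this sketch does not guard against.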
Original abstract

A deep learning assisted sum-product detection algorithm (DL-SPA) for faster-than-Nyquist (FTN) signaling is proposed in this paper. The proposed detection algorithm concatenates a neural network to the variable nodes of the conventional factor graph of the FTN system to help the detector converge to the a posterior probabilities based on the received sequence. More specifically, the neural network performs as a function node in the modified factor graph to deal with the residual intersymbol interference (ISI) that is not modeled by the conventional detector with a limited number of ISI taps. We modify the updating rule in the conventional sum-product algorithm so that the neural network assisted detector can be complemented to a Turbo equalization. Furthermore, a simplified convolutional neural network is employed as the neural network function node to enhance the detector's performance and the neural network needs a small number of batches to be trained. Simulation results have shown that the proposed DL-SPA achieves a performance gain up to 2.5 dB with the same bit error rate compared to the conventional sum-product detection algorithm under the same ISI responses.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes a deep learning assisted sum-product detection algorithm (DL-SPA) for faster-than-Nyquist (FTN) signaling. It augments the conventional factor graph by concatenating a simplified convolutional neural network as an additional function node at the variable nodes to mitigate residual intersymbol interference (ISI) beyond the limited taps modeled by the standard detector. The sum-product update rules are modified to support Turbo-style iterations, and simulations are reported to show up to 2.5 dB gain in bit error rate performance compared to the conventional sum-product algorithm under identical ISI responses.

Significance. If the performance gain is reproducible and the neural-network augmentation is shown to be consistent with the underlying message-passing semantics, the work would provide a concrete example of hybrid model-based and data-driven detection that could improve practical FTN receivers in bandwidth-constrained channels. The emphasis on a small training set and simplified CNN architecture is a practical strength that distinguishes it from purely black-box approaches.

major comments (3)
  1. [Abstract / Simulation Results] Abstract and Simulation Results section: the central performance claim of a 2.5 dB gain at the same BER is presented without any description of simulation parameters (SNR range, acceleration factor, ISI length, training/test sequence lengths, number of Monte-Carlo trials), exact baseline implementations, error bars, or statistical significance testing. This absence prevents assessment of whether the reported gain is robust or reproducible.
  2. [Neural Network Function Node / Modified SPA Update Rule] Section describing the neural-network function node and modified update rule: no derivation or analysis is supplied showing that the CNN output preserves the marginalization semantics of the sum-product algorithm or that the learned mapping remains consistent when the detector is evaluated at SNRs or FTN acceleration factors different from those used in training. Without such analysis the 2.5 dB gain cannot be attributed to successful modeling of residual ISI rather than a heuristic correction tuned to the training distribution.
  3. [Training Procedure] Training procedure description: the claim that the network “needs a small number of batches to be trained” is stated without reporting the actual batch size, loss function, optimizer, or any ablation showing that performance degrades when the number of batches is further reduced. This detail is load-bearing for the practicality argument.
minor comments (2)
  1. [Abstract] The phrase “a posterior probabilities” in the abstract should read “a posteriori probabilities.”
  2. [Modified Factor Graph] Notation for the modified message updates (e.g., the precise form of the Turbo-style iteration) is introduced without an accompanying equation or pseudocode block, making the algorithmic change difficult to reproduce from the text alone.
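
The error bars requested in the first major comment could be produced with a binomial confidence interval on each simulated BER point. The Python sketch below uses the Wilson score interval; the function name is hypothetical, the example counts are invented, and the independence assumption is a simplification, since errors on ISI channels tend to occur in bursts.

```python
import math


def ber_wilson_interval(bit_errors, bits, z=1.96):
    """Wilson score interval for a BER estimated from `bit_errors` out of `bits`.

    z = 1.96 corresponds to roughly 95% coverage. Assumes independent bit
    errors, which is optimistic when errors are correlated.
    """
    p = bit_errors / bits
    denom = 1.0 + z * z / bits
    center = (p + z * z / (2.0 * bits)) / denom
    half = z * math.sqrt(p * (1.0 - p) / bits + z * z / (4.0 * bits * bits)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Invented example: 10 bit errors observed in 100,000 simulated bits.
low, high = ber_wilson_interval(bit_errors=10, bits=100_000)
print(low, high)
```

With only a handful of observed errors the interval spans a large fraction of a decade, so the number of simulated bits per point matters as much as the point estimate when comparing two curves.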

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thorough review and valuable feedback. We agree that additional details on simulations, justification for the neural-network augmentation, and training specifics are needed to strengthen the manuscript. We will revise accordingly, providing the requested information and clarifications while preserving the core contribution of the hybrid DL-SPA approach. Point-by-point responses follow.

  1. Referee: [Abstract / Simulation Results] Abstract and Simulation Results section: the central performance claim of a 2.5 dB gain at the same BER is presented without any description of simulation parameters (SNR range, acceleration factor, ISI length, training/test sequence lengths, number of Monte-Carlo trials), exact baseline implementations, error bars, or statistical significance testing. This absence prevents assessment of whether the reported gain is robust or reproducible.

    Authors: We agree that the simulation setup was insufficiently documented. In the revised manuscript we will add a new subsection (or expanded table) in the Simulation Results section that explicitly lists: SNR range (e.g., 0–10 dB), FTN acceleration factor, ISI tap length used by the baseline SPA, training and test sequence lengths, number of Monte-Carlo trials (at least 10^5 bits per point), exact baseline implementations (including the conventional SPA with the same ISI truncation), and either error bars or a statement on statistical significance. This will allow readers to reproduce and assess the robustness of the reported 2.5 dB gain.
    Revision: yes

  2. Referee: [Neural Network Function Node / Modified SPA Update Rule] Section describing the neural-network function node and modified update rule: no derivation or analysis is supplied showing that the CNN output preserves the marginalization semantics of the sum-product algorithm or that the learned mapping remains consistent when the detector is evaluated at SNRs or FTN acceleration factors different from those used in training. Without such analysis the 2.5 dB gain cannot be attributed to successful modeling of residual ISI rather than a heuristic correction tuned to the training distribution.

    Authors: We acknowledge that the manuscript provides no formal derivation proving that the CNN-augmented messages preserve exact sum-product marginalization semantics. The modification is heuristic: the CNN output is treated as an additional soft evidence term that is combined with the conventional messages via the modified update rule to enable turbo-style iterations. We will revise the relevant section to (i) explicitly state that the approach is an empirical hybrid rather than a provably consistent message-passing algorithm, (ii) supply a brief justification based on the factor-graph interpretation (the CNN acts as an unmodeled residual-ISI function node), and (iii) add a short discussion of generalization, including any available results or caveats when SNR or acceleration factor differs from the training distribution. If space permits, we will also include a small ablation on out-of-distribution performance.
    Revision: partial

  3. Referee: [Training Procedure] Training procedure description: the claim that the network “needs a small number of batches to be trained” is stated without reporting the actual batch size, loss function, optimizer, or any ablation showing that performance degrades when the number of batches is further reduced. This detail is load-bearing for the practicality argument.

    Authors: We agree that the training details were omitted. In the revised manuscript we will expand the Training Procedure subsection to report: batch size, loss function (cross-entropy on bit probabilities), optimizer (Adam with learning rate …), number of epochs/batches actually used, and the hardware/training time. If the original experiments contain the data, we will also add a brief ablation curve showing BER versus number of training batches to support the “small number” claim; otherwise we will qualify the statement accordingly.
    Revision: yes
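
The loss named in the third simulated response, cross-entropy on bit probabilities, can be evaluated directly on detector LLRs. The Python sketch below is a generic formulation and not taken from the paper; the function names and the sign convention (a positive LLR favors bit 0) are assumptions.

```python
import math


def bce_from_llr(llr, bit):
    """Binary cross-entropy for one bit, given llr = log(P(bit=0)/P(bit=1)).

    Equals -log P(bit), computed as softplus(-s * llr) with s = +1 for bit 0
    and s = -1 for bit 1, in a form that avoids overflow for large |llr|.
    """
    s = 1.0 if bit == 0 else -1.0
    x = -s * llr
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mean_bce(llrs, bits):
    """Average cross-entropy over a block of bits."""
    return sum(bce_from_llr(l, b) for l, b in zip(llrs, bits)) / len(bits)


print(bce_from_llr(0.0, 0))                   # log(2): an uninformative output
print(mean_bce([4.0, -4.0, 0.5], [0, 1, 1]))  # dominated by the weak, wrong third LLR
```

Working in the LLR domain avoids taking the logarithm of probabilities that have saturated to 0 or 1, which is a practical concern once the detector becomes confident at high SNR.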

Circularity Check

0 steps flagged

No circularity; performance claims rest on external simulation benchmarks


The paper introduces a concatenated CNN as an additional function node in a modified factor graph for FTN detection and reports up to 2.5 dB BER gain via Monte-Carlo simulations. No derivation chain is presented that reduces a claimed result to a fitted parameter, self-citation, or ansatz by construction; the update-rule modification and training procedure are described at the algorithmic level without equations that equate outputs to inputs. The result is therefore self-contained against the reported simulation evidence.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The claim depends on the neural network successfully learning residual ISI effects and on the modified update rules preserving detector functionality; these are domain assumptions without independent verification in the abstract.

axioms (1)
  • Domain assumption: The sum-product algorithm can be modified to incorporate an additional neural network function node while maintaining convergence properties for turbo equalization.
    The paper states that the updating rule is modified so the NN-assisted detector can be complemented to a Turbo equalization.
invented entities (1)
  • Neural network function node (no independent evidence)
    purpose: To deal with residual intersymbol interference not modeled by the conventional detector with a limited number of ISI taps
    The NN is introduced as a new component in the modified factor graph to enhance performance.

pith-pipeline@v0.9.0 · 5730 in / 1192 out tokens · 51454 ms · 2026-05-24T18:14:34.266819+00:00 · methodology

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages · 4 internal anchors

  1. [1]

    Faster-than-Nyquist signaling,

    J. E. Mazo, “Faster-than-Nyquist signaling,” Bell Syst. Tech. J., vol. 54, no. 8, pp. 1451–1462, Oct 1975

  2. [2]

    Faster-than-Nyquist signaling,

    J. B. Anderson, F. Rusek, and V. Öwall, “Faster-than-Nyquist signaling,” Proc. of the IEEE, vol. 101, no. 8, pp. 1817–1830, 2013

  3. [3]

    Improving the spectral efficiency of nonlinear satellite systems through time-frequency packing and advanced receiver processing,

    A. Piemontese, A. Modenini, G. Colavolpe, and N. S. Alagha, “Improving the spectral efficiency of nonlinear satellite systems through time-frequency packing and advanced receiver processing,” IEEE Trans. on Commun., vol. 61, no. 8, pp. 3404–3412, August 2013

  4. [4]

    Faster-than-Nyquist Signaling for 5G Communication

    F. Luo and C. Zhang, Faster-than-Nyquist Signaling for 5G Communication. IEEE, 2016. [Online]. Available: https://ieeexplore.ieee.org/document/7572754

  5. [5]

    Reduced-complexity equalization for faster-than-Nyquist signaling: New methods based on Ungerboeck observation model,

    S. Li, B. Bai, J. Zhou, P. Chen, and Z. Yu, “Reduced-complexity equalization for faster-than-Nyquist signaling: New methods based on Ungerboeck observation model,” IEEE Trans. on Commun., vol. 66, no. 3, pp. 1190–1204, March 2018

  6. [6]

    Adaptive maximum-likelihood receiver for carrier-modulated data-transmission systems,

    G. Ungerboeck, “Adaptive maximum-likelihood receiver for carrier-modulated data-transmission systems,” IEEE Trans. Commun., vol. 22, no. 5, pp. 624–636, May 1974

  7. [7]

    SISO detection over linear channels with linear complexity in the number of interferers,

    G. Colavolpe, D. Fertonani, and A. Piemontese, “SISO detection over linear channels with linear complexity in the number of interferers,” IEEE J. Sel. Topics Signal Process., vol. 5, no. 8, pp. 1475–1485, Dec 2011

  8. [8]

    OFDM-Autoencoder for End-to-End Learning of Communications Systems

    A. Felix, S. Cammerer, S. Dörner, J. Hoydis, and S. ten Brink, “OFDM-autoencoder for end-to-end learning of communications systems,” CoRR, vol. abs/1803.05815, 2018. [Online]. Available: http://arxiv.org/abs/1803.05815

  9. [9]

    Performance enhancement of ACO-OFDM-based VLC systems using a hybrid autoencoder scheme,

    L. Hao, D. Wang, W. Cheng, J. Li, and A. Ma, “Performance enhancement of ACO-OFDM-based VLC systems using a hybrid autoencoder scheme,” Optics Commun., vol. 442, pp. 110–116, 2019. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0030401819302032

  10. [10]

    Deep learning methods for improved decoding of linear codes,

    E. Nachmani, E. Marciano, L. Lugosch, W. J. Gross, D. Burshtein, and Y. Be’ery, “Deep learning methods for improved decoding of linear codes,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 119–131, Feb 2018

  11. [11]

    Learned belief-propagation decoding with simple scaling and SNR adaptation,

    M. Lian, F. Carpi, C. Häger, and H. D. Pfister, “Learned belief-propagation decoding with simple scaling and SNR adaptation,” CoRR, vol. abs/1901.08621, 2019. [Online]. Available: http://arxiv.org/abs/1901.08621

  12. [12]

    On Deep Learning-Based Channel Decoding

    T. Gruber, S. Cammerer, J. Hoydis, and S. ten Brink, “On deep learning-based channel decoding,” CoRR, vol. abs/1701.07738, 2017. [Online]. Available: http://arxiv.org/abs/1701.07738

  13. [13]

    On MAP symbol detection for ISI channels using the Ungerboeck observation model,

    G. Colavolpe and A. Barbieri, “On MAP symbol detection for ISI channels using the Ungerboeck observation model,” IEEE Commun. Lett., vol. 9, no. 8, pp. 720–722, Aug 2005

  14. [14]

    Backpropagation applied to handwritten zip code recognition,

    Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, “Backpropagation applied to handwritten zip code recognition,” Neural Computation, vol. 1, no. 4, pp. 541–551,

  15. [15]

    Available: https://doi.org/10.1162/neco.1989.1.4.541

    [Online]. Available: https://doi.org/10.1162/neco.1989.1.4.541

  16. [16]

    CNN-Based Signal Detection for Banded Linear Systems

    C. Fan, X. Yuan, and Y. A. Zhang, “CNN-based signal detection for banded linear systems,” CoRR, vol. abs/1809.03682, 2018. [Online]. Available: http://arxiv.org/abs/1809.03682

  17. [17]

    Rectified linear units improve restricted Boltzmann machines,

    V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proc. of the 27th Int. Conf. on Machine Learning, ser. ICML’10. USA: Omnipress, 2010, pp. 807–814. [Online]. Available: http://dl.acm.org/citation.cfm?id=3104322.3104425

  18. [18]

    Deep Learning for Decoding of Linear Codes - A Syndrome-Based Approach

    A. Bennatan, Y. Choukroun, and P. Kisilev, “Deep learning for decoding of linear codes - A syndrome-based approach,” arXiv e-prints, p. arXiv:1802.04741, Feb 2018