arXiv: 1907.08535 · v1 · pith:WEMMAUCSnew · submitted 2019-07-19 · 💻 cs.IT · eess.SP · math.IT · stat.ML

End-to-end Learning for GMI Optimized Geometric Constellation Shape

Pith reviewed 2026-05-24 18:46 UTC · model grok-4.3

classification 💻 cs.IT · eess.SP · math.IT · stat.ML
keywords geometric constellation shaping · autoencoder · generalized mutual information · bit mapping · end-to-end learning · transceiver impairments · QAM

The pith

End-to-end autoencoder training jointly optimizes geometric constellation shapes and bit mappings to raise GMI by up to 0.2 bits per QAM symbol.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that an autoencoder can be trained to discover both the positions of constellation points and the assignment of bits to those points. Training occurs through a channel model that includes transceiver impairments, with the objective of maximizing generalized mutual information. Reported gains reach 0.2 bits per QAM symbol across multiple rates. The resulting constellations remain compatible with ordinary binary forward-error-correction codes.

Core claim

By casting constellation design as the end-to-end training of a neural network that includes the transmitter, channel, and receiver, the method learns point locations and bit mappings that achieve higher GMI than conventional geometric shaping while remaining usable with standard binary FEC.

What carries the argument

The autoencoder that parameterizes both the constellation geometry and the bit-to-symbol mapping, trained end-to-end to maximize GMI through a differentiable impairment model.
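
As a rough illustration of the quantity being maximized, the sketch below estimates GMI for a fixed, labeled constellation by Monte Carlo simulation. It is not the paper's training code: it assumes a plain AWGN channel with no transceiver impairments, uniform bits, and an exact demapper, and the names `gmi_awgn` and `qam16` are hypothetical. An autoencoder in the spirit of the paper would presumably treat the point locations as trainable parameters and differentiate through an estimate of this kind.

```python
import numpy as np

# Per-axis 4-PAM bit labels for levels -3, -1, +1, +3 (illustrative choices).
GRAY = np.array([[0, 0], [0, 1], [1, 1], [1, 0]], dtype=float)
NATURAL = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)


def qam16(pam_bits):
    """Square 16-QAM points and 4-bit labels built from a per-axis labeling."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    labels = np.hstack([np.repeat(pam_bits, 4, axis=0), np.tile(pam_bits, (4, 1))])
    return points, labels


def gmi_awgn(points, labels, snr_db, n_sym=20000, seed=0):
    """Monte Carlo GMI estimate in bits/symbol (assumes AWGN, uniform bits)."""
    rng = np.random.default_rng(seed)
    points = points / np.sqrt(np.mean(np.abs(points) ** 2))  # unit average power
    m = labels.shape[1]
    n0 = 10.0 ** (-snr_db / 10.0)  # noise variance when Es = 1
    idx = rng.integers(0, len(points), n_sym)
    noise = np.sqrt(n0 / 2) * (rng.standard_normal(n_sym)
                               + 1j * rng.standard_normal(n_sym))
    y = points[idx] + noise
    # Symbol posteriors under a uniform prior, shape (n_sym, M).
    logp = -np.abs(y[:, None] - points[None, :]) ** 2 / n0
    logp -= logp.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(logp)
    post /= post.sum(axis=1, keepdims=True)
    # Posterior probability that each bit equals 1, shape (n_sym, m).
    p1 = post @ labels
    p_tx = np.where(labels[idx] == 1, p1, 1.0 - p1)
    # Per-bit cross-entropy; GMI = m minus the sum over bit positions.
    ce = -np.mean(np.log2(np.clip(p_tx, 1e-12, 1.0)), axis=0)
    return m - ce.sum()


if __name__ == "__main__":
    for name, lab in (("Gray", GRAY), ("natural", NATURAL)):
        pts, bits = qam16(lab)
        print(name, round(gmi_awgn(pts, bits, snr_db=10.0), 3))
```

Comparing a Gray labeling with a natural binary labeling on the same points shows why the bit mapping matters for GMI independently of the geometry.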

If this is right

  • The same binary FEC can be retained while increasing achievable rate.
  • The approach applies across a range of data rates without redesigning the outer code.
  • Gains persist when transceiver impairments are included in the training loop.
  • No change to the receiver architecture beyond the demapper is required.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the learned shapes transfer to hardware, they could narrow the gap between practical links and theoretical limits without new coding schemes.
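  • The same training loop could be re-targeted to other differentiable objectives, such as symbol-wise mutual information or a smooth surrogate for bit-error rate (the hard-decision bit-error rate itself is not differentiable).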
  • Including nonlinear fiber effects in the training model might produce constellations suited to long-haul optical systems.

Load-bearing premise

The simulation model of transceiver impairments used during autoencoder training is sufficiently representative of real hardware that the learned constellations and mappings will deliver the reported GMI gain when deployed.
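
One inexpensive way to probe this premise in simulation is to evaluate a constellation under a channel that differs from the one the demapper assumes. The sketch below does this for a Gray-labeled 16-QAM (a stand-in, since the learned points are not reproduced here) with an uncompensated phase offset and an SNR mismatch. Function names and parameter values are illustrative assumptions, not taken from the paper. With a mismatched demapper the returned value is an achievable-rate estimate rather than the true GMI, and it can fall well below the matched case.

```python
import numpy as np


def gray_16qam():
    """Gray-labeled square 16-QAM used here only as a stand-in constellation."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    bits = np.array([[0, 0], [0, 1], [1, 1], [1, 0]], dtype=float)
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    labels = np.hstack([np.repeat(bits, 4, axis=0), np.tile(bits, (4, 1))])
    return points, labels


def mismatched_rate(points, labels, true_snr_db, assumed_snr_db,
                    phase_rad=0.0, n_sym=20000, seed=0):
    """m minus bitwise cross-entropy when the demapper assumes the wrong channel."""
    rng = np.random.default_rng(seed)
    points = points / np.sqrt(np.mean(np.abs(points) ** 2))
    m = labels.shape[1]
    n0_true = 10.0 ** (-true_snr_db / 10.0)    # actual noise variance
    n0_rx = 10.0 ** (-assumed_snr_db / 10.0)   # variance the demapper believes
    idx = rng.integers(0, len(points), n_sym)
    noise = np.sqrt(n0_true / 2) * (rng.standard_normal(n_sym)
                                    + 1j * rng.standard_normal(n_sym))
    # The true channel applies a phase offset the demapper does not know about.
    y = points[idx] * np.exp(1j * phase_rad) + noise
    logp = -np.abs(y[:, None] - points[None, :]) ** 2 / n0_rx
    logp -= logp.max(axis=1, keepdims=True)
    post = np.exp(logp)
    post /= post.sum(axis=1, keepdims=True)
    p1 = post @ labels
    p_tx = np.where(labels[idx] == 1, p1, 1.0 - p1)
    ce = -np.mean(np.log2(np.clip(p_tx, 1e-12, 1.0)), axis=0)
    return m - ce.sum()  # may become negative under severe mismatch


if __name__ == "__main__":
    pts, labs = gray_16qam()
    print("matched      :", mismatched_rate(pts, labs, 15.0, 15.0))
    print("phase offset :", mismatched_rate(pts, labs, 15.0, 15.0, phase_rad=0.2))
    print("SNR mismatch :", mismatched_rate(pts, labs, 15.0, 20.0))
```

Sweeping `phase_rad` or the SNR gap and plotting the resulting rate would give a first-order picture of how sensitive a given constellation is to model error.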

What would settle it

Deploy the learned constellation points and mappings on actual transceiver hardware, measure the realized GMI, and check whether the 0.2 bits per QAM symbol improvement over conventional BICM still appears.

Figures

Figures reproduced from arXiv: 1907.08535 by Darko Zibar, Metodi P. Yankov, Rasmus T. Jones.

Figure 1: Schematic autoencoder model applied directly on bits.
Original abstract

Autoencoder-based geometric shaping is proposed that includes optimizing bit mappings. Up to 0.2 bits/QAM symbol gain in GMI is achieved for a variety of data rates and in the presence of transceiver impairments. The gains can be harvested with standard binary FEC at no cost w.r.t. conventional BICM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes an autoencoder-based end-to-end learning framework for jointly optimizing geometric constellation shapes and bit mappings to maximize generalized mutual information (GMI) under transceiver impairments. It reports empirical gains of up to 0.2 bits per QAM symbol across a range of data rates, which can be realized using standard binary FEC without additional overhead relative to conventional BICM.

Significance. If the reported GMI gains prove robust, the work provides a practical route to improved spectral efficiency in impaired channels by learning constellations and labelings that remain compatible with existing binary FEC. The joint optimization via autoencoders is a clear methodological strength when accompanied by reproducible training procedures and baseline comparisons.

major comments (2)
  1. [Abstract and simulation section] Abstract and § on simulation setup: the central claim of 0.2 bit/QAM GMI improvement 'in the presence of transceiver impairments' rests on the training impairment model (nonlinearities, noise statistics, etc.) being representative of hardware; the manuscript supplies no validation of this model against measured hardware statistics or sensitivity analysis showing that the learned points remain GMI-optimal when the model is perturbed.
  2. [Results] Results section: the abstract states empirical gains but the provided description supplies no training details, validation procedure, baseline comparisons, or error bars, preventing assessment of whether the 0.2 bit margin is robust or an artifact of the simulation setup.
minor comments (1)
  1. Notation for GMI and the autoencoder loss should be defined explicitly on first use to aid readers unfamiliar with the intersection of machine learning and information theory.
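
For orientation, one common way to write these two quantities is sketched below. The symbols are assumptions chosen here rather than notation taken from the paper: m is the number of bits per symbol, B_k the k-th transmitted bit, Y the channel output, and q the demapper's estimated bit posterior. The expressions assume uniform, independent bits and a bit-wise demapper.

```latex
\mathrm{GMI} \approx \sum_{k=1}^{m} I(B_k; Y)
  = m - \sum_{k=1}^{m} H(B_k \mid Y),
\qquad
\mathcal{L} = -\sum_{k=1}^{m} \mathbb{E}\left[ \log_2 q(B_k \mid Y) \right]
  \ge \sum_{k=1}^{m} H(B_k \mid Y).
```

Under these assumptions, m minus the cross-entropy loss is a lower bound on the GMI, so minimizing the bit-wise cross-entropy maximizes that bound.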

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below and indicate the revisions we will make.

  1. Referee: [Abstract and simulation section] Abstract and § on simulation setup: the central claim of 0.2 bit/QAM GMI improvement 'in the presence of transceiver impairments' rests on the training impairment model (nonlinearities, noise statistics, etc.) being representative of hardware; the manuscript supplies no validation of this model against measured hardware statistics or sensitivity analysis showing that the learned points remain GMI-optimal when the model is perturbed.

    Authors: Our impairment model follows standard transceiver models from the literature (nonlinearities, phase noise, and AWGN). We agree that a sensitivity analysis would strengthen the claims and will add one in the revised version to show that the learned points remain near-optimal under moderate perturbations to model parameters. Direct validation against measured hardware statistics is outside the scope of this simulation study, as the work focuses on the end-to-end learning methodology rather than a specific hardware campaign. revision: partial

  2. Referee: [Results] Results section: the abstract states empirical gains but the provided description supplies no training details, validation procedure, baseline comparisons, or error bars, preventing assessment of whether the 0.2 bit margin is robust or an artifact of the simulation setup.

    Authors: We will expand the results section to explicitly detail the training procedure (optimizer, learning rate schedule, number of epochs), the validation split used, the exact baseline constellations and labelings compared, and error bars computed over multiple independent training runs with different random seeds to demonstrate that the reported GMI gains are robust. revision: yes
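
The error bars promised in the second response could be computed along the following lines. This is a minimal sketch assuming one GMI value per independent training run, paired with a baseline value from the same evaluation. The function name is hypothetical and the numbers in the usage example are placeholders for illustration only, not results from the paper.

```python
import numpy as np


def summarize_gain(gmi_learned, gmi_baseline):
    """Mean GMI gain and its standard error over paired runs."""
    gains = (np.asarray(gmi_learned, dtype=float)
             - np.asarray(gmi_baseline, dtype=float))
    mean = gains.mean()
    # Sample standard deviation (ddof=1) divided by sqrt(number of runs).
    sem = gains.std(ddof=1) / np.sqrt(len(gains))
    return mean, sem


if __name__ == "__main__":
    # Placeholder values, NOT taken from the paper.
    learned = [3.10, 3.14, 3.12]
    baseline = [3.00, 3.01, 2.99]
    mean_gain, sem_gain = summarize_gain(learned, baseline)
    print(f"GMI gain: {mean_gain:.3f} +/- {sem_gain:.3f} bits/symbol")
```

Reporting the gain as a paired difference per run, rather than differencing two separate averages, keeps evaluation noise that is common to both systems from inflating the error bar.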

Circularity Check

0 steps flagged

No significant circularity; empirical simulation gains are measured outcomes

The paper uses an autoencoder to learn constellation points and bit mappings that maximize GMI under a chosen impairment model, then reports measured GMI improvements versus conventional BICM in simulation. No load-bearing equations, fitted parameters renamed as predictions, or self-citation chains reduce the reported gains to inputs by construction. The central result is an empirical comparison of two separately simulated systems, which remains falsifiable against external hardware or alternative models and does not collapse into a definitional identity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only the abstract is available; the ledger is therefore minimal and provisional.

axioms (1)
  • domain assumption: The channel and impairment model used for training is representative of the target deployment scenario.
    Required for any learned constellation to transfer from simulation to hardware.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages · 1 internal anchor

  1. [1]

    Conventional bit-interleaved coded modulation (BICM) is penalized with geometric shaping due to the non-Gray labeling. We show that the proposed autoencoder arrives at a Gray-like code, which does not exhibit this problem

  2. [2]

    The implementation penalty is higher for geometric shaping than rectangular QAM. We show that in the operating regions of interest and with the application of modulation-format independent digital signal processing (DSP) chain, the penalty is the same

  3. [3]

    Iterative demapping or non-binary FEC are required for geometric shaping schemes. We show that the proposed labelings do not have this requirement because they are GMI optimized. [Figure residue: constellation plots with In-phase and Quadrature axes and 8-bit point labels (00001000, 00001001, ...); truncated in extraction]

  4. [4]

    frequency offset between transmitter laser and local oscillator of 50 MHz; 3) ADC sampling frequency of 80 GSa/s; 4) ADC resolution of 6 bits, modelled with a uniform quantization step. A family of three geometric shapes optimized as in Section 2 are evaluated. The shapes are optimized for transmission at 2, 5 and 10 spans. The optimal of the three ...

  5. [5]

    Yankov, M. P., et al. "Constellation shaping for WDM systems using 256QAM/1024QAM with probabilistic optimization." Journal of Lightwave Technology 34.22 (2016): 5146-5156

  6. [6]

    Buchali, F., et al. "Rate adaptation and reach increase by probabilistically shaped 64-QAM: An experimental demonstration." Journal of Lightwave Technology 34.7 (2016): 1599-1609

  7. [7]

    Böcherer, G., Steiner, F., and Schulte, P. "Bandwidth efficient and rate-matched low-density parity-check coded modulation." IEEE Transactions on Communications 63.12 (2015): 4651-4665

  8. [8]

    Lin, C., et al. "Capacity achieving nonbinary LDPC coded non-uniform shaping modulation for adaptive optical communications." Optics Express 24.16 (2016): 18095-18104

  9. [9]

    Lotz, T. H., et al. "Coded PDM-OFDM transmission with shaped 256-iterative-polar-modulation achieving 11.15-b/s/Hz intrachannel spectral efficiency and 800-km reach." Journal of Lightwave Technology 31.4 (2013): 538-545

  10. [10]

    Schulte, P., and Böcherer, G. "Constant composition distribution matching." IEEE Transactions on Information Theory 62.1 (2016): 430-434

  11. [11]

    Yoshida, T., Karlsson, M., and Agrell, E. "Hierarchical distribution matching for probabilistically shaped coded modulation." Journal of Lightwave Technology (2019)

  12. [12]

    Dar, R., et al. "Properties of nonlinear noise in long, dispersion-uncompensated fiber links." Optics Express 21.22 (2013): 25685-25699

  13. [13]

    Hu, H., et al. "Ultrahigh-Spectral-Efficiency WDM/SDM Transmission Using PDM-1024-QAM Probabilistic Shaping With Adaptive Rate." Journal of Lightwave Technology 36.6 (2018): 1304-1308

  14. [14]

    Zhang, S., et al. "Design and performance evaluation of a GMI-optimized 32QAM." 2017 European Conference on Optical Communication (ECOC). IEEE, 2017

  15. [15]

    Chen, B., et al. "Increasing achievable information rates via geometric shaping." 2018 European Conference on Optical Communication (ECOC). IEEE, 2018

  16. [16]

    Jones, R. T., et al. "Deep learning of geometric constellation shaping including fiber nonlinearities." 2018 European Conference on Optical Communication (ECOC). IEEE, 2018

  17. [17]

    "LTE 3GPP TS 36.212: Evolved Universal Terrestrial Radio Access (E-UTRA); Multiplexing and channel coding", 2013

  18. [18]

    R. T. Jones, https://github.com/rassibassi/claude, 2019