End-to-end Learning for GMI Optimized Geometric Constellation Shape
Pith reviewed 2026-05-24 18:46 UTC · model grok-4.3
The pith
End-to-end autoencoder training jointly optimizes geometric constellation shapes and bit mappings to raise GMI by up to 0.2 bits per QAM symbol.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By casting constellation design as the end-to-end training of a neural network that includes the transmitter, channel, and receiver, the method learns point locations and bit mappings that achieve higher GMI than conventional geometric shaping while remaining usable with standard binary FEC.
What carries the argument
The autoencoder that parameterizes both the constellation geometry and the bit-to-symbol mapping, trained end-to-end to maximize GMI through a differentiable impairment model.
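The quantity being maximized can be made concrete with a small Monte Carlo estimate. The sketch below is not the paper's autoencoder: it assumes a plain AWGN channel with no transceiver impairments, an exact-posterior bit demapper in place of a learned one, and a fixed Gray-labeled 16-QAM as the input. The function names `gray_16qam` and `estimate_gmi` are hypothetical and chosen for illustration only.

```python
import numpy as np

def gray_16qam():
    """Return unit-energy 16-QAM points and a Gray bit labeling (16 x 4)."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    gray2 = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])  # Gray code per axis
    points, labels = [], []
    for i, re in enumerate(levels):
        for q, im in enumerate(levels):
            points.append(re + 1j * im)
            labels.append(np.concatenate([gray2[i], gray2[q]]))
    points = np.array(points)
    points /= np.sqrt(np.mean(np.abs(points) ** 2))  # normalize average energy to 1
    return points, np.array(labels)

def estimate_gmi(points, labels, snr_db, n_symbols=20000, seed=0):
    """Monte Carlo GMI (bits/symbol) for uniform symbols over AWGN (assumed channel)."""
    rng = np.random.default_rng(seed)
    m = labels.shape[1]
    es = np.mean(np.abs(points) ** 2)
    n0 = es / 10 ** (snr_db / 10)
    idx = rng.integers(0, len(points), n_symbols)
    noise = np.sqrt(n0 / 2) * (rng.standard_normal(n_symbols)
                               + 1j * rng.standard_normal(n_symbols))
    y = points[idx] + noise
    # Log-likelihood of each candidate point, up to a constant; shape (n_symbols, M).
    loglik = -np.abs(y[:, None] - points[None, :]) ** 2 / n0
    loglik -= loglik.max(axis=1, keepdims=True)  # numerical stability
    lik = np.exp(loglik)
    total = lik.sum(axis=1)
    tx_bits = labels[idx]
    gmi = float(m)
    for k in range(m):
        # Posterior probability that bit k equals the transmitted bit.
        match = labels[None, :, k] == tx_bits[:, k, None]
        p_correct = (lik * match).sum(axis=1) / total
        gmi += np.mean(np.log2(p_correct))  # adds 1 - H(B_k | Y) in total per bit
    return gmi

if __name__ == "__main__":
    pts, lab = gray_16qam()
    for snr in (5, 10, 15):
        print(snr, "dB:", round(estimate_gmi(pts, lab, snr), 3))
```

Passing a different `points` array or a permuted `labels` array to `estimate_gmi` shows how both the geometry and the bit mapping enter the metric, which is why the approach treats them as joint training parameters.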
If this is right
- The same binary FEC can be retained while increasing achievable rate.
- The approach applies across a range of data rates without redesigning the outer code.
- Gains persist when transceiver impairments are included in the training loop.
- No change to the receiver architecture beyond the demapper is required.
Where Pith is reading between the lines
- If the learned shapes transfer to hardware, they could narrow the gap between practical links and theoretical limits without new coding schemes.
- The same training loop could be re-targeted to other differentiable metrics such as bit-error rate or mutual information.
- Including nonlinear fiber effects in the training model might produce constellations suited to long-haul optical systems.
Load-bearing premise
The simulation model of transceiver impairments used during autoencoder training is sufficiently representative of real hardware that the learned constellations and mappings will deliver the reported GMI gain when deployed.
What would settle it
Deploy the learned constellation points and mappings on actual transceiver hardware, measure the realized GMI, and check whether the 0.2 bits per QAM symbol improvement over conventional BICM still appears.
Original abstract
Autoencoder-based geometric shaping is proposed that includes optimizing bit mappings. Up to 0.2 bits/QAM symbol gain in GMI is achieved for a variety of data rates and in the presence of transceiver impairments. The gains can be harvested with standard binary FEC at no cost w.r.t. conventional BICM.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an autoencoder-based end-to-end learning framework for jointly optimizing geometric constellation shapes and bit mappings to maximize generalized mutual information (GMI) under transceiver impairments. It reports empirical gains of up to 0.2 bits per QAM symbol across a range of data rates, which can be realized using standard binary FEC without additional overhead relative to conventional BICM.
Significance. If the reported GMI gains prove robust, the work provides a practical route to improved spectral efficiency in impaired channels by learning constellations and labelings that remain compatible with existing binary FEC. The joint optimization via autoencoders is a clear methodological strength when accompanied by reproducible training procedures and baseline comparisons.
major comments (2)
- [Abstract and simulation section] The central claim of a GMI improvement of up to 0.2 bits per QAM symbol 'in the presence of transceiver impairments' rests on the training impairment model (nonlinearities, noise statistics, etc.) being representative of hardware. The manuscript supplies neither a validation of this model against measured hardware statistics nor a sensitivity analysis showing that the learned points remain GMI-optimal when the model is perturbed.
- [Results] The abstract states empirical gains, but the manuscript as described supplies no training details, validation procedure, baseline comparisons, or error bars. This prevents assessment of whether the 0.2-bit margin is robust or an artifact of the simulation setup.
minor comments (1)
- Notation for GMI and the autoencoder loss should be defined explicitly on first use to aid readers unfamiliar with the intersection of machine learning and information theory.
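As an illustration of what such a definition could look like, the block below gives one common formulation. It assumes uniformly distributed bits, m bits per symbol, and a bit-wise demapper that outputs log-likelihood ratios; the symbols B_k, Y, L_{k,n}, and the cross-entropy loss \mathcal{L}_{\mathrm{CE}} are assumptions made here, and the paper's own notation may differ.

```latex
\begin{align}
\mathrm{GMI} &= \sum_{k=1}^{m} I(B_k; Y)
  = m + \sum_{k=1}^{m} \mathbb{E}\!\left[\log_2 P(B_k \mid Y)\right] \\
&\approx m - \frac{1}{N} \sum_{n=1}^{N} \sum_{k=1}^{m}
  \log_2\!\left(1 + e^{-(1 - 2 b_{k,n})\, L_{k,n}}\right),
\qquad
L_{k,n} = \ln \frac{P(b_k = 0 \mid y_n)}{P(b_k = 1 \mid y_n)}, \\
\mathcal{L}_{\mathrm{CE}} &= -\frac{1}{N} \sum_{n=1}^{N} \sum_{k=1}^{m}
  \log_2 q(b_{k,n} \mid y_n),
\qquad
\widehat{\mathrm{GMI}} = m - \mathcal{L}_{\mathrm{CE}}.
\end{align}
```

Under these assumptions, minimizing the per-bit cross-entropy of a demapper q is the same as maximizing the GMI estimate, which is the usual link between the machine-learning loss and the information-theoretic metric.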
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major comment below and indicate the revisions we will make.
- Referee: [Abstract and simulation section] The central claim of a GMI improvement of up to 0.2 bits per QAM symbol 'in the presence of transceiver impairments' rests on the training impairment model (nonlinearities, noise statistics, etc.) being representative of hardware. The manuscript supplies neither a validation of this model against measured hardware statistics nor a sensitivity analysis showing that the learned points remain GMI-optimal when the model is perturbed.
  Authors: Our impairment model follows standard transceiver models from the literature (nonlinearities, phase noise, and AWGN). We agree that a sensitivity analysis would strengthen the claims and will add one in the revised version to show that the learned points remain near-optimal under moderate perturbations to model parameters. Direct validation against measured hardware statistics is outside the scope of this simulation study, as the work focuses on the end-to-end learning methodology rather than a specific hardware campaign.
  Revision: partial
- Referee: [Results] The abstract states empirical gains, but the manuscript as described supplies no training details, validation procedure, baseline comparisons, or error bars. This prevents assessment of whether the 0.2-bit margin is robust or an artifact of the simulation setup.
  Authors: We will expand the results section to explicitly detail the training procedure (optimizer, learning rate schedule, number of epochs), the validation split used, the exact baseline constellations and labelings compared, and error bars computed over multiple independent training runs with different random seeds to demonstrate that the reported GMI gains are robust.
  Revision: yes
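The error bars promised in the second response could be computed along the following lines. This is only a sketch: the helper names `paired_gains` and `summarize_runs` are hypothetical, and the numbers in the usage example are arbitrary placeholders for checking the arithmetic, not results from the paper.

```python
import math
import statistics

def paired_gains(learned, baseline):
    """Per-seed GMI gain: learned constellation minus baseline, same seed order."""
    if len(learned) != len(baseline):
        raise ValueError("learned and baseline must have one value per seed")
    return [l - b for l, b in zip(learned, baseline)]

def summarize_runs(values):
    """Mean, sample standard deviation, and standard error over independent runs."""
    if len(values) < 2:
        raise ValueError("need at least two runs to estimate spread")
    mean = statistics.fmean(values)
    std = statistics.stdev(values)           # sample std (n - 1 in the denominator)
    sem = std / math.sqrt(len(values))       # standard error of the mean
    return {"n": len(values), "mean": mean, "std": std, "sem": sem}

if __name__ == "__main__":
    # Placeholder inputs only; replace with measured per-seed GMI values.
    gains = paired_gains([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
    print(summarize_runs(gains))
```

Reporting the mean gain together with its standard error over seeds lets a reader judge whether a margin of the claimed size exceeds the run-to-run variation of training.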
Circularity Check
No significant circularity; empirical simulation gains are measured outcomes
The paper uses an autoencoder to learn constellation points and bit mappings that maximize GMI under a chosen impairment model, then reports measured GMI improvements versus conventional BICM in simulation. No load-bearing equations, fitted parameters renamed as predictions, or self-citation chains reduce the reported gains to inputs by construction. The central result is an empirical comparison of two separately simulated systems, which remains falsifiable against external hardware or alternative models and does not collapse into a definitional identity.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the channel and impairment model used for training is representative of the target deployment scenario.
Reference graph
Works this paper leans on
- [1] Conventional bit-interleaved coded modulation (BICM) is penalized with geometric shaping due to the non-Gray labeling. We show that the proposed autoencoder arrives at a Gray-like code, which does not exhibit this problem.
- [2] The implementation penalty is higher for geometric shaping than rectangular QAM. We show that in the operating regions of interest and with the application of modulation-format independent digital signal processing (DSP) chain, the penalty is the same.
- [3] Iterative demapping or non-binary FEC are required for geometric shaping schemes. We show that the proposed labelings do not have this requirement because they are GMI optimized. (arXiv:1907.08535v1 [cs.IT], 19 Jul 2019)
  [Figure: constellation diagrams with In-phase and Quadrature axes; points carry 8-bit labels (00001000, 00001001, ...), truncated in extraction.]
- [4] frequency offset between transmitter laser and local oscillator of 50 MHz; 3) ADC sampling frequency of 80 GSa/s; 4) ADC resolution of 6 bits, modelled with a uniform quantization step. A family of three geometric shapes optimized as in Section 2 are evaluated. The shapes are optimized for transmission at 2, 5 and 10 spans. The optimal of the three ...
- [5] Yankov, M. P., et al. "Constellation shaping for WDM systems using 256QAM/1024QAM with probabilistic optimization." Journal of Lightwave Technology 34.22 (2016): 5146-5156.
- [6] Buchali, F., et al. "Rate adaptation and reach increase by probabilistically shaped 64-QAM: An experimental demonstration." Journal of Lightwave Technology 34.7 (2016): 1599-1609.
- [7] Böcherer, G., Steiner, F., and Schulte, P. "Bandwidth efficient and rate-matched low-density parity-check coded modulation." IEEE Transactions on Communications 63.12 (2015): 4651-4665.
- [8] Lin, C., et al. "Capacity achieving nonbinary LDPC coded non-uniform shaping modulation for adaptive optical communications." Optics Express 24.16 (2016): 18095-18104.
- [9] Lotz, T. H., et al. "Coded PDM-OFDM transmission with shaped 256-iterative-polar-modulation achieving 11.15-b/s/Hz intrachannel spectral efficiency and 800-km reach." Journal of Lightwave Technology 31.4 (2013): 538-545.
- [10] Schulte, P., and Böcherer, G. "Constant composition distribution matching." IEEE Transactions on Information Theory 62.1 (2016): 430-434.
- [11] Yoshida, T., Karlsson, M., and Agrell, E. "Hierarchical distribution matching for probabilistically shaped coded modulation." Journal of Lightwave Technology (2019).
- [12] Dar, R., et al. "Properties of nonlinear noise in long, dispersion-uncompensated fiber links." Opt. Exp. 21.22 (2013): 25685-25699.
- [13] Hu, H., et al. "Ultrahigh-Spectral-Efficiency WDM/SDM Transmission Using PDM-1024-QAM Probabilistic Shaping With Adaptive Rate." Journal of Lightwave Technology 36.6 (2018): 1304-1308.
- [14] Zhang, S., et al. "Design and performance evaluation of a GMI-optimized 32QAM." 2017 European Conference on Optical Communication (ECOC). IEEE, 2017.
- [15] Chen, B., et al. "Increasing achievable information rates via geometric shaping." 2018 European Conference on Optical Communication (ECOC). IEEE, 2018.
- [16] Jones, R. T., et al. "Deep learning of geometric constellation shaping including fiber nonlinearities." 2018 European Conference on Optical Communication (ECOC). IEEE, 2018.
- [17] LTE 3GPP TS 36.212: "Evolved Universal Terrestrial Radio Access (E-UTRA); Multiplexing and channel coding", 2013.
- [18] R. T. Jones, https://github.com/rassibassi/claude, 2019.