VCSEL-based PAM-4 transmission system emulator: A data-driven deep learning perspective

Adonis Bogris; Charis Mesaritakis; Dimitris Kalavrouziotis; Nikos Argyris; Paraskevas Bakopoulos; Stavros Deligiannidis; Stefanos Dris

arxiv: 2605.18917 · v1 · pith:S4LTIKHVnew · submitted 2026-05-18 · 📡 eess.SP · physics.optics

Stavros Deligiannidis, Nikos Argyris, Stefanos Dris, Dimitris Kalavrouziotis, Paraskevas Bakopoulos, Charis Mesaritakis, Adonis Bogris

Pith reviewed 2026-05-20 08:41 UTC · model grok-4.3

keywords: VCSEL · PAM-4 · Bi-LSTM · transfer learning · optical interconnects · data-driven emulator · deep learning · system emulation

The pith

Bi-LSTM networks trained on experimental waveforms emulate VCSEL PAM-4 links and extend to new regimes via transfer learning with 20-fold less computation while keeping error below 0.04.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a data-driven emulator for high-speed VCSEL-based PAM-4 optical interconnects that learns end-to-end system behavior directly from measured input-output waveforms instead of solving rate-equation models. Bidirectional LSTM networks capture the nonlinear dynamics and noise, then transfer learning with weight interpolation adapts the same network to unseen operating conditions without full retraining. This yields accurate emulation in far less time than starting from scratch each time. A sympathetic reader would care because conventional physical models are slow and require tricky parameter fitting, so a fast, data-driven alternative could accelerate the design of short-reach optical links used in data centers.

Core claim

A bidirectional LSTM network trained on experimental waveforms learns the mapping from electrical drive signals to optical output in a VCSEL-based PAM-4 system. Transfer learning combined with weight interpolation then extends the trained model to new operating regimes, achieving a twenty-fold reduction in computation time relative to independent training while holding normalized mean squared error below 0.04. The resulting emulator serves as a rapid, accurate substitute for conventional rate-equation simulations in the design and optimization of short-reach optical links.
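The accuracy figure in this claim is a normalized mean squared error. A minimal sketch of one common definition follows, assuming the normalization is the power of the measured waveform; the text visible here does not state which normalization the authors use, and the function name `nmse` and the sample values are purely illustrative.

```python
def nmse(predicted, measured):
    """Normalized MSE: error power divided by measured-signal power (assumed convention)."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("waveforms must be non-empty and of equal length")
    error_power = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    signal_power = sum(m ** 2 for m in measured)
    if signal_power == 0:
        raise ValueError("measured waveform has zero power")
    return error_power / signal_power


# Hypothetical four-sample waveforms, for illustration only (not data from the paper).
measured = [1.0, -1.0, 3.0, -3.0]
predicted = [1.1, -0.9, 2.9, -3.1]
score = nmse(predicted, measured)
```

With this convention a score of 0 means a perfect match, and the paper's bound corresponds to `score < 0.04`.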

What carries the argument

Bidirectional Long Short-Term Memory (Bi-LSTM) recurrent network that processes time-series waveforms to learn end-to-end system dynamics, extended by transfer learning and weight interpolation for new operating points.
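For readers unfamiliar with the architecture, the standard LSTM cell and its bidirectional combination can be written as below. This is the textbook formulation, not a transcription of the paper's network: the symbols ($x_t$ for the drive-signal sample, $\hat{y}_t$ for the emulated optical sample) and the linear read-out layer are assumptions.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t) \\[4pt]
\overrightarrow{h}_t &= \mathrm{LSTM}_{\rightarrow}\bigl(x_t, \overrightarrow{h}_{t-1}\bigr), \qquad
\overleftarrow{h}_t = \mathrm{LSTM}_{\leftarrow}\bigl(x_t, \overleftarrow{h}_{t+1}\bigr) \\
\hat{y}_t &= W_y \bigl[\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\bigr] + b_y
\end{aligned}
```

The backward pass lets each output sample depend on later input samples as well as earlier ones, which is why the model operates on recorded waveforms rather than sample by sample in real time.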

If this is right

The emulator replaces computationally intensive rate-equation models that need difficult parameter tuning.
Design and optimization of short-reach optical links becomes faster because new conditions can be tested without full retraining.
End-to-end nonlinear dynamics and noise are captured directly from measured data rather than assumed physical equations.
Normalized mean squared error remains below 0.04 across the adapted regimes.
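The idea that a new condition can be reached without full retraining can be illustrated with a deliberately tiny warm-start example. The sketch below fits a one-parameter gain model by gradient descent, once from zero and once from a value learned at a nearby operating point. Everything in it (the model, the gains, the learning rate) is a hypothetical stand-in, and it does not reproduce the paper's Bi-LSTM or its 20-fold figure.

```python
def fit_gain(xs, ys, w_init, lr=0.01, tol=1e-6, max_steps=10_000):
    """Fit y = w * x by gradient descent on the MSE; return (w, steps used)."""
    w = w_init
    n = len(xs)
    for step in range(max_steps):
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
        if loss < tol:
            return w, step
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w, max_steps


xs = [1.0, 2.0, 3.0]
source_gain, target_gain = 2.0, 2.2  # hypothetical source and target regimes
target_ys = [target_gain * x for x in xs]

# Independent training starts from zero; transfer starts from the source solution.
w_scratch, scratch_steps = fit_gain(xs, target_ys, w_init=0.0)
w_warm, warm_steps = fit_gain(xs, target_ys, w_init=source_gain)
```

Both runs reach the same target gain, but the warm start needs fewer updates because it begins closer to the solution. The size of that saving depends on how far the target regime sits from the source regime.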

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same waveform-based training and interpolation approach could be tested on other laser types or modulation formats such as PAM-8 or coherent links.
Integration of the emulator into real-time monitoring hardware might allow online adaptation when link conditions drift.
Weight interpolation could lower the volume of new experimental data needed when only a few parameters change.

Load-bearing premise

The experimental waveforms used for initial training already contain representative samples of all relevant nonlinear dynamics and noise sources that will appear in the new operating regimes.

What would settle it

Running the transferred model on a new operating regime and measuring normalized mean squared error above 0.04 or observing computation time savings far below 20-fold would falsify the claim of reliable and efficient extension.
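This test can be phrased as a small check per target regime. The sketch below is one way to encode it. The function name, the timing arguments, and the strict `min_speedup=20.0` default are assumptions; the text above only says that savings "far below" 20-fold would count against the claim, so a looser threshold may be more appropriate in practice.

```python
def claim_survives(nmse_value, t_independent_s, t_transfer_s,
                   nmse_limit=0.04, min_speedup=20.0):
    """Return (ok, speedup) for one target regime.

    t_independent_s: time to train a fresh model for the regime (seconds).
    t_transfer_s:    time to adapt the existing model to the regime (seconds).
    """
    if t_transfer_s <= 0:
        raise ValueError("transfer time must be positive")
    speedup = t_independent_s / t_transfer_s
    ok = nmse_value < nmse_limit and speedup >= min_speedup
    return ok, speedup


# Hypothetical measurements, for illustration only.
passing = claim_survives(nmse_value=0.03, t_independent_s=600.0, t_transfer_s=30.0)
failing = claim_survives(nmse_value=0.05, t_independent_s=600.0, t_transfer_s=30.0)
```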

Original abstract

We demonstrate a data-driven framework for emulating high-speed VCSEL-based 4-level Pulse Amplitude Modulation (PAM-4) optical interconnects using bidirectional Long Short-Term Memory (Bi-LSTM) networks. Unlike conventional rate-equation models, which are computationally intensive and often require difficult parameter tuning, our approach utilizes experimental waveforms to learn the end-to-end system dynamics. By employing transfer learning and weight interpolation, we extend the model to new operating regimes with a 20-fold reduction in computation time compared to independent training, while maintaining normalized mean squared error below 0.04. This emulator provides a rapid, accurate tool for the design and optimization of short-reach optical links.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Bi-LSTM emulator with transfer learning delivers usable speedup for VCSEL PAM-4 modeling, but the generalization claim rests on thin evidence for truly new regimes.

Hi,

The main point is that this paper shows a Bi-LSTM trained on real VCSEL PAM-4 waveforms can reproduce the link behavior with NMSE below 0.04, and that transfer learning with weight interpolation adapts it to new conditions with about 20 times less computation than training a fresh model for each condition. That combination is the concrete advance here. The authors train directly on experimental traces instead of tuning physical parameters, which removes a common headache in short-reach link design. The reported numbers are specific enough that a reader can see what the method achieves on this hardware. The approach is straightforward and stays grounded in measured data rather than in fitted rate equations.

The soft spot is the transfer-learning step. The abstract claims the interpolated model works on new operating regimes, yet there is no quantitative map of how far those regimes sit from the training set in bias current, temperature, or drive level, and no error figures on held-out experimental traces from those regimes. If the test conditions stay close to the original data, the 20-fold claim looks narrower than presented. Validation details on splits, baseline comparisons, and overfitting checks are also missing from what is visible, so the central performance numbers cannot be fully stress-tested yet.

This work is for engineers who simulate short-reach optical interconnects and want faster design loops than full physical models allow. A reader focused on ML tools for photonics hardware would get practical value from the reported speed-accuracy trade-off. It deserves a serious referee because the claims are testable and the method is reproducible from experimental waveforms.

I would send it to peer review and ask the authors to add explicit distance metrics between training and target regimes, plus error on fresh experimental data from those regimes.
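One simple form the requested distance metric could take is a Euclidean distance between operating points, with each axis scaled by the range it covers in the training set. The sketch below is only an illustration of that idea: the parameter names, units, and numbers are hypothetical, and the paper may warrant a different metric.

```python
import math


def regime_distance(source, target, spans):
    """Euclidean distance between operating points, each axis divided by its span."""
    return math.sqrt(sum(((target[k] - source[k]) / spans[k]) ** 2 for k in spans))


# Hypothetical operating points and training-set spans (not values from the paper).
source = {"bias_mA": 6.0, "temp_C": 25.0, "drive_mVpp": 400.0}
target = {"bias_mA": 8.0, "temp_C": 45.0, "drive_mVpp": 400.0}
spans = {"bias_mA": 4.0, "temp_C": 40.0, "drive_mVpp": 200.0}

d = regime_distance(source, target, spans)
```

Reporting NMSE against such a distance would show whether accuracy degrades gradually or abruptly as the target regime moves away from the training data.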

Referee Report

2 major / 2 minor

Summary. The manuscript presents a data-driven Bi-LSTM emulator for VCSEL-based PAM-4 optical interconnects trained directly on experimental waveforms. It claims that transfer learning combined with weight interpolation allows extension to new operating regimes, delivering a 20-fold reduction in computation time versus independent training from scratch while keeping normalized mean squared error below 0.04, thereby providing a fast alternative to conventional rate-equation models that require intensive parameter tuning.

Significance. If the generalization performance is substantiated, the work offers a practical, low-compute tool for rapid design and optimization of short-reach optical links. The explicit use of experimental data avoids the parameter-fitting difficulties of physics-based models and demonstrates a concrete engineering benefit (20-fold speedup) that could be adopted in link budgeting and transceiver development workflows.

major comments (2)

[Abstract / transfer-learning results] Abstract and transfer-learning results section: the central claim that weight interpolation extends the model to new regimes while preserving NMSE < 0.04 rests on the untested assumption that the collected experimental waveforms are representative of the full range of nonlinear dynamics and noise statistics in the target conditions. No quantitative description is given of the parameter shifts (bias current, temperature, drive amplitude, or fiber length) between training and target regimes, nor are NMSE values reported on truly held-out experimental traces from those regimes.
[Experimental validation] Validation and experimental setup: the manuscript does not report details on validation splits, baseline comparisons against rate-equation solvers or other ML emulators, overfitting diagnostics, or statistical significance tests for the reported NMSE values. These omissions prevent assessment of whether the 20-fold speedup claim is robust or merely an artifact of narrow test conditions.
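On the validation-split point, one common safeguard is to split by acquisition rather than by window, so that overlapping windows cut from the same recorded trace never land on both sides of a split. The sketch below assumes each training window carries a hypothetical `trace_id` identifying the recording it came from; the fractions and the seed are arbitrary choices.

```python
import random


def split_by_trace(trace_ids, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Assign whole traces (not individual windows) to train/val/test sets."""
    ids = sorted(set(trace_ids))
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    n_val = max(1, round(len(ids) * val_fraction))
    test_ids = set(ids[:n_test])
    val_ids = set(ids[n_test:n_test + n_val])
    train_ids = set(ids[n_test + n_val:])
    return train_ids, val_ids, test_ids


# Hypothetical dataset: 10 recorded traces, 3 windows cut from each.
window_trace_ids = [t for t in range(10) for _ in range(3)]
train_ids, val_ids, test_ids = split_by_trace(window_trace_ids)
```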

minor comments (2)

[Methods] Notation for the Bi-LSTM architecture and interpolation procedure could be clarified with an explicit equation or diagram showing how weights are interpolated between source and target models.
[Introduction] A short discussion of related deep-learning emulators for optical links (e.g., prior CNN or RNN approaches) would help situate the novelty of the transfer-learning + interpolation strategy.
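The first minor comment asks how weights are interpolated between models. The simplest candidate is element-wise linear interpolation, $\theta(\alpha) = (1-\alpha)\,\theta_A + \alpha\,\theta_B$, sketched below on plain dictionaries of parameter lists. This is an assumed form, not the authors' confirmed procedure, and the parameter names are hypothetical.

```python
def interpolate_weights(weights_a, weights_b, alpha):
    """Blend two parameter sets as (1 - alpha) * A + alpha * B, tensor by tensor."""
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must share the same parameter names")
    blended = {}
    for name, values_a in weights_a.items():
        values_b = weights_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch in {name!r}")
        blended[name] = [(1 - alpha) * a + alpha * b
                         for a, b in zip(values_a, values_b)]
    return blended


# Hypothetical flattened parameters of two models trained at different operating points.
model_a = {"lstm.kernel": [0.0, 1.0], "dense.bias": [2.0]}
model_b = {"lstm.kernel": [1.0, 3.0], "dense.bias": [4.0]}
midpoint = interpolate_weights(model_a, model_b, alpha=0.5)
```

Linear blending is only meaningful when both models share an architecture and were trained from a common starting point, which is the situation transfer learning creates.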

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our results. We address each major comment below and will revise the manuscript accordingly to provide the requested details and strengthen the validation.

Referee: [Abstract / transfer-learning results] Abstract and transfer-learning results section: the central claim that weight interpolation extends the model to new regimes while preserving NMSE < 0.04 rests on the untested assumption that the collected experimental waveforms are representative of the full range of nonlinear dynamics and noise statistics in the target conditions. No quantitative description is given of the parameter shifts (bias current, temperature, drive amplitude, or fiber length) between training and target regimes, nor are NMSE values reported on truly held-out experimental traces from those regimes.

Authors: We agree that explicit quantification of the operating-point shifts and direct evaluation on held-out target-regime traces would strengthen the transfer-learning claim. In the revised manuscript we will insert a table that lists the exact changes in bias current, temperature, drive amplitude, and fiber length between the source and target regimes. We will also add NMSE results computed on additional, previously unused experimental waveforms recorded under the target conditions, confirming that the interpolated model continues to satisfy NMSE < 0.04 on these independent traces. revision: yes
Referee: [Experimental validation] Validation and experimental setup: the manuscript does not report details on validation splits, baseline comparisons against rate-equation solvers or other ML emulators, overfitting diagnostics, or statistical significance tests for the reported NMSE values. These omissions prevent assessment of whether the 20-fold speedup claim is robust or merely an artifact of narrow test conditions.

Authors: We acknowledge that these methodological details are necessary for a complete assessment. The revised version will specify the training/validation/test split ratios and the procedure used to ensure the splits contain independent measurements. We will add a direct runtime and accuracy comparison against a conventional rate-equation solver, so that the reported 20-fold reduction, which is measured relative to independent training, can be placed in context. Training and validation loss curves will be included to document the absence of overfitting, and we will report mean NMSE together with standard deviation across five independent training runs with different random seeds to establish statistical reliability of the results. revision: yes
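The promised mean and standard deviation across seeds could be computed as in the sketch below, using Python's standard `statistics` module. The five NMSE values are made-up placeholders that show the calculation only; they are not results from the paper.

```python
import statistics


def summarize_runs(nmse_per_seed):
    """Return (mean, sample standard deviation) of NMSE over independent runs."""
    if len(nmse_per_seed) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    return statistics.mean(nmse_per_seed), statistics.stdev(nmse_per_seed)


# Placeholder values, one per random seed (NOT measured results).
placeholder_nmse = [0.030, 0.032, 0.028, 0.031, 0.029]
mean_nmse, std_nmse = summarize_runs(placeholder_nmse)
```

`statistics.stdev` uses the sample (n - 1) estimator, which is the usual choice for a handful of runs.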

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained data-driven learning

The paper trains Bi-LSTM networks directly on external experimental waveforms collected from the VCSEL-PAM-4 link to learn end-to-end dynamics, then applies standard transfer learning and weight interpolation to extend to new regimes. No step reduces a claimed prediction or result to its own inputs by construction, self-definition, or a load-bearing self-citation chain; the outputs are learned from independent measurements rather than tautological fits or renamed empirical patterns. The central claims rest on empirical training and ML techniques without invoking uniqueness theorems or ansatzes from prior author work.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that measured input-output waveform pairs contain enough information to learn the full nonlinear dynamics without explicit physical equations, plus the practical availability of representative experimental data for both initial training and transfer.

free parameters (1)

Bi-LSTM network weights and hyperparameters
All network parameters are fitted to the experimental training waveforms; their specific values are not reported in the abstract.

axioms (1)

domain assumption: End-to-end system dynamics of a VCSEL-PAM-4 link can be adequately captured by a recurrent neural network trained on input-output waveform pairs.
This premise justifies replacing rate-equation models with the data-driven Bi-LSTM approach.
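Training a recurrent network on input-output waveform pairs usually means cutting the two aligned recordings into fixed-length windows. The sketch below shows one such scheme, assuming a sequence-to-sequence setup in which each drive window is paired with the optical samples at the same indices. The window length, stride, and sample values are hypothetical, and the paper may instead predict only the central sample of each window.

```python
def make_windows(drive, optical, window, stride=1):
    """Pair each window of the drive signal with the matching optical samples."""
    if len(drive) != len(optical):
        raise ValueError("drive and optical waveforms must be time-aligned")
    pairs = []
    for start in range(0, len(drive) - window + 1, stride):
        stop = start + window
        pairs.append((drive[start:stop], optical[start:stop]))
    return pairs


# Hypothetical PAM-4 drive levels and measured optical samples (illustrative only).
drive = [0, 1, 2, 3, 3, 1, 0, 2]
optical = [0.1, 0.9, 2.1, 2.8, 3.0, 1.2, 0.2, 1.9]
pairs = make_windows(drive, optical, window=4, stride=2)
```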

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

[1] W. Ma, Z. Liu, Z. A. Kudyshev, A. Boltasseva, W. Cai, and Y. Liu, "Deep learning for the design of photonic structures," Nat. Photonics 15, 77–90 (2021)
[2] M. Srinivasan, J. Song, A. Grabowski, K. Szczerba, H. K. Iversen, M. N. Schmidt, D. Zibar, J. Schroder, A. Larsson, C. Hager, and H. Wymeersch, "End-to-End Learning for VCSEL-Based Optical Interconnects: State-of-the-Art, Challenges, and Opportunities," Journal of Lightwave Technology 41, 3261–3277 (2023)
[3] J. Dellunde, M. C. Torrent, J. M. Sancho, and K. A. Shore, "Statistics of transverse mode turn-on dynamics in VCSELs," IEEE J. Quantum Electron. 33, 1197–1204 (2002)
[4] I. Khan, L. Tunesi, M. U. Masood, E. Ghillino, V. Curri, A. Carena, and P. Bardella, "Machine Learning-based Model for Defining Circuit-level Parameters of VCSEL," in 2022 International Conference on Software, Telecommunications and Computer Networks (SoftCOM) (IEEE, 2022), pp. 1–6
[5] J. Jiang, M. Chen, and J. A. Fan, "Deep neural networks for the evaluation and design of photonic devices," Nat. Rev. Mater. 6, 679–700 (2021)
[6] Q. Zhang, S. Jia, T. Zhang, and J. Yu, "Accurate deep learning based method for real-time directly modulated laser modeling," Opt. Express 33, 2360 (2025)
[7] S. Hernandez, C. Peucheret, F. Da Ros, and D. Zibar, "Experimental End-to-End Optimization of Directly Modulated Laser-based IM/DD Transmission," Journal of Lightwave Technology (2025)
[8] R. Schatz and M. Peeters, "Modeling spatial hole burning and mode competition in index-guided VCSELs," in VCSELs and Optical Interconnects (2003), Vol. 4942, pp. 158–169
[9] A. Larsson, P. Westbergh, J. Gustavsson, Å. Haglund, and B. Kögel, "High-speed VCSELs for short reach communication," Semicond. Sci. Technol. 26, 14017 (2011)
[10] S. Deligiannidis, N. Argyris, S. Dris, D. Kalavrouziotis, P. Bakopoulos, C. Mesaritakis, and A. Bogris, "Deep-Learning–based VCSEL transmitter emulator," in European Quantum Electronics Conference (2023), p. ej_3_4
[11] M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," IEEE Transactions on Signal Processing 45, 2673–2681 (1997)
[12] S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory," Neural Comput. 9, 1735–1780 (1997)
[13] L. Appeltant, M. C. Soriano, G. Van Der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, "Information processing using a single dynamical node as complex system," Nat. Commun. 2, 1–6 (2011)
[14] S. Deligiannidis, C. Mesaritakis, and A. Bogris, "Performance and Complexity Analysis of Bi-Directional Recurrent Neural Network Models Versus Volterra Nonlinear Equalizers in Digital Coherent Systems," Journal of Lightwave Technology 39, 5791–5798 (2021)
[15] Y.-L. Tsai, C.-Y. Hsu, C.-M. Yu, and P.-Y. Chen, "Formalizing generalization and adversarial robustness of neural networks to weight perturbations," in Proceedings of the 35th International Conference on Neural Information Processing Systems, NIPS ’21 (Curran Associates Inc., 2021)
