arxiv: 2604.18035 · v1 · submitted 2026-04-20 · 💻 cs.LG

Variational Autoencoder Domain Adaptation for Cross-System Generalization in ML-Based SOP Monitoring

Pith reviewed 2026-05-10 04:37 UTC · model grok-4.3

classification 💻 cs.LG
keywords variational autoencoder · domain adaptation · optical fiber monitoring · state of polarization · cross-system generalization · machine learning · physical layer threats · unsupervised pretraining

The pith

A variational autoencoder learns shared representations that let threat detectors trained on one optical system work on another.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Optical fiber systems differ in operating wavelength, fiber properties, and network architecture, so machine learning models trained to detect physical-layer threats on one system often fail on another. The paper trains a variational autoencoder on unlabeled data pooled from two such systems so that the latent representation emphasizes event signatures common to both. The encoder is then frozen, and a classifier trained on labels from a single system is applied to the other system. This yields cross-system accuracies of 95.3% when moving from the O-band testbed to the C-band metro ring and 73.5% in the reverse direction, reported as gains of 83.4% and 51% over a baseline deep neural network trained on one system alone, while intra-system performance is preserved. The result matters because it addresses the practical barrier of collecting and labeling new data for every distinct optical network.

Core claim

Training a VAE on combined unlabeled data from a 21 km O-band dark-fiber testbed and a 63.4 km C-band live metro ring produces an encoder that captures event signatures common to both while suppressing system-specific differences; freezing this encoder and training a classifier on one system's labels then enables effective cross-system generalization in SOP monitoring.
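The order of operations in this claim (fit a shared encoder on pooled unlabeled data, freeze it, train a classifier on one system, evaluate on both) can be sketched as follows. This is only an illustration of the protocol, not the paper's model: the encoder here is a stand-in (pooled feature standardization) rather than a VAE, the classifier is nearest-centroid, the data are synthetic, and all function names are hypothetical. Reading the paper's "four evaluation scenarios" as the four train/test pairings is also an assumption.

```python
import random
import statistics

def fit_encoder(pooled):
    """Stage 1 (stand-in for VAE pretraining): fit on pooled, unlabeled samples."""
    columns = list(zip(*pooled))
    mean = [statistics.fmean(c) for c in columns]
    std = [statistics.pstdev(c) or 1.0 for c in columns]
    # The returned closure is never updated again, mirroring a frozen encoder.
    return lambda x: tuple((v - m) / s for v, m, s in zip(x, mean, std))

def fit_classifier(encode, samples, labels):
    """Stage 2: nearest-centroid classifier on encoded samples from ONE system."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(encode(x))
    centroids = {y: [statistics.fmean(c) for c in zip(*zs)] for y, zs in by_class.items()}

    def predict(x):
        z = encode(x)
        return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(z, centroids[y])))
    return predict

def accuracy(predict, samples, labels):
    return sum(predict(x) == y for x, y in zip(samples, labels)) / len(labels)

def make_system(rng, offset, n=60):
    """Synthetic placeholder data: three event classes plus a system-specific offset."""
    centres = {"rlx": 0.0, "eav": 3.0, "sbd": 6.0}
    xs, ys = [], []
    for i in range(n):
        label = ("rlx", "eav", "sbd")[i % 3]
        c = centres[label]
        xs.append((c + offset + rng.gauss(0, 0.3), 0.5 * c + rng.gauss(0, 0.3)))
        ys.append(label)
    return xs, ys

rng = random.Random(0)
train = {"System 1": make_system(rng, 0.0), "System 2": make_system(rng, 0.5)}
test = {"System 1": make_system(rng, 0.0), "System 2": make_system(rng, 0.5)}

# Labels are deliberately ignored when fitting the shared encoder.
pooled_unlabeled = train["System 1"][0] + train["System 2"][0]
encode = fit_encoder(pooled_unlabeled)

results = {}
for train_name, (xs, ys) in train.items():
    predict = fit_classifier(encode, xs, ys)  # labels from a single system only
    for test_name, (test_xs, test_ys) in test.items():
        results[(train_name, test_name)] = accuracy(predict, test_xs, test_ys)

for (train_name, test_name), acc in results.items():
    print(f"train {train_name} -> test {test_name}: {acc:.2f}")
```

The two pairings with different train and test systems correspond to the cross-system numbers quoted on this page; the other two are the intra-system checks.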

What carries the argument

The encoder of a variational autoencoder, trained on pooled unlabeled data from both systems to extract domain-invariant features for the downstream classifier.
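For readers unfamiliar with the objective, the two ingredients of a standard VAE loss (a reconstruction term and a KL term on a diagonal Gaussian posterior) can be written in a few lines. This is a minimal sketch under assumptions: squared error is used for reconstruction and `beta` weights the KL term, neither of which is confirmed as the paper's choice, and the function names are hypothetical.

```python
import math
import random

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) and sigma = exp(0.5 * logvar)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

def negative_elbo(x, x_hat, mu, logvar, beta=1.0):
    """Squared-error reconstruction plus beta-weighted KL (both are assumptions)."""
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return reconstruction + beta * kl_to_standard_normal(mu, logvar)

rng = random.Random(0)
mu, logvar = [0.5, -1.0], [0.0, -2.0]   # illustrative encoder outputs
z = reparameterize(mu, logvar, rng)     # latent code passed to decoder/classifier
loss = negative_elbo(x=[1.0, 2.0], x_hat=[0.9, 2.1], mu=mu, logvar=logvar)
```

Nothing in these terms refers to which system a sample came from, which is why any domain invariance of the latent code would be an emergent property rather than one the loss enforces.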

If this is right

  • Cross-system accuracy reaches 95.3 percent when adapting from the testbed to the metro ring and 73.5 percent in the opposite direction.
  • These accuracies represent reported gains of 83.4 percent and 51 percent over a fully supervised DNN baseline trained on a single system.
  • Intra-system performance is preserved, avoiding any loss on the original training system.
  • The framework applies to both a controlled testbed and a live operational network.
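The text does not say whether the reported gains are absolute (percentage points) or relative increases. As a quick arithmetic check, the snippet below derives the baseline cross-system accuracy implied by each reading; which reading the authors intend is not stated here, so both are shown as assumptions.

```python
cross_accuracy = {"System 1 -> System 2": 95.3, "System 2 -> System 1": 73.5}
reported_gain = {"System 1 -> System 2": 83.4, "System 2 -> System 1": 51.0}

implied_baseline = {}
for scenario, acc in cross_accuracy.items():
    gain = reported_gain[scenario]
    implied_baseline[scenario] = {
        # Reading A: the gain is a difference in percentage points.
        "points": acc - gain,
        # Reading B: the gain is a relative increase over the baseline.
        "relative": acc / (1.0 + gain / 100.0),
    }

for scenario, values in implied_baseline.items():
    print(f"{scenario}: baseline {values['points']:.1f}% (points) "
          f"or {values['relative']:.1f}% (relative)")
```

Under the percentage-point reading the implied baselines are 11.9% and 22.5%; under the relative reading they are roughly 52.0% and 48.7%.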

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The need for labeled data collection can be limited to one system if unlabeled data is available from others.
  • Similar adaptation techniques could help in other monitoring tasks where sensor data shifts due to hardware or environmental differences.
  • Extending the unlabeled training pool to additional systems might improve robustness to further variations.
  • Testing the method on systems with more extreme differences would reveal the limits of the shared representation.

Load-bearing premise

The VAE can learn a latent representation from mixed unlabeled data that isolates common physical event patterns from system-specific variations enough to support cross-system classification.
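One direct way to probe this premise is to train a small classifier to predict the source system from the frozen latent codes: accuracy near chance (0.5 for two balanced systems) would suggest the probe finds little system-specific information. The sketch below uses a nearest-centroid probe on synthetic latents; the function names and data are hypothetical, and a probe this weak only detects mean shifts, so a stronger probe could still find leakage.

```python
import random
import statistics

def centroid(points):
    return [statistics.fmean(c) for c in zip(*points)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def domain_probe_accuracy(z_sys1, z_sys2, train_fraction=0.5):
    """Fit a nearest-centroid system probe on part of the latents, score the rest."""
    k1 = int(len(z_sys1) * train_fraction)
    k2 = int(len(z_sys2) * train_fraction)
    c1, c2 = centroid(z_sys1[:k1]), centroid(z_sys2[:k2])
    held_out = [(z, 1) for z in z_sys1[k1:]] + [(z, 2) for z in z_sys2[k2:]]
    hits = sum((1 if sq_dist(z, c1) <= sq_dist(z, c2) else 2) == d for z, d in held_out)
    return hits / len(held_out)

rng = random.Random(0)

def sample_latents(shift, n=200):
    """Synthetic 2-D latent codes; `shift` mimics a system-specific offset."""
    return [(rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)) for _ in range(n)]

leaky = domain_probe_accuracy(sample_latents(0.0), sample_latents(8.0))
mixed = domain_probe_accuracy(sample_latents(0.0), sample_latents(0.0))
print(f"separated latents: {leaky:.2f}, overlapping latents: {mixed:.2f}")
```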

What would settle it

A test on a third optical system where the VAE-adapted classifier performs no better than the unadapted baseline would show that the domain adaptation does not reliably capture transferable signatures.

Figures

Figures reproduced from arXiv: 2604.18035 by Carlos Natalino, Eoin Kenny, Fehmida Usmani, Lena Wosinska, Leyla Sadighi, Marco Ruffini, Marija Furdek, Mojtaba Eshghie, Paolo Monti, Stefan Karlsson.

Figure 1. Schematic of experimental testbed comprising System 1 and System 2.
… standard SM G.652 fiber. As a field-deployed fiber, this link introduces realistic impairments representative of production infrastructure. The second, System 2, operates in the C-band over a live production metro ring network in Dublin city, operated by Ireland’s National Education and Research Network, Asiera (formerly HEAnet) [31]. …
Figure 2. t-SNE projection of 500 samples per system, colored by event class. Filled markers denote System 1 and open markers denote System 2.
Figure 3. Architecture of the Fully Connected (FC) DNN baseline with sequential Residual Blocks (RBs).
… intra-system performance ceiling and the cross-system generalization gap, and then present the proposed VAE-based framework designed to bridge this gap. A. Fully Connected (FC) Residual DNN Baseline. As a supervised benchmark, we employ an FC residual DNN model that operates directly on the 512-D SOP feature vect…
Figure 4. General architecture of the proposed VAE framework, comprising three components: an encoder, a decoder, and a classification head.
… the encoder $E(x)$, comprising $N_{vae}$ stacked FC layers with BN and DP regularization, maps $x$ to the parameters of a posterior Gaussian distribution over a low-D latent space: $\mu, \log\sigma^2 = E(x)$, $q_\phi(z \mid x) = \mathcal{N}\big(\mu, \operatorname{diag}(\sigma^2)\big)$ (8), where $\mu, \sigma^2 \in \mathbb{R}^{d_z}$ are the latent mean and variance…
Figure 5. Test-set accuracy comparison across all four evaluation scenarios for the DNN baseline, VAEsgl, and VAEcmb.
… B. Experimental Results and Performance Analysis. We first compare the overall accuracy of the frameworks across the four scenarios. We then analyze its classification behavior in depth through per-scenario confusion matrices and per-class comparisons against the DNN baseline. B.1. Overall Accuracy Co…
Figure 6. Confusion matrices of VAEcmb for all four evaluation scenarios. Values are per-class accuracy (%).
Figure 7. Per-class accuracy of VAEcmb vs. the DNN baseline across all four scenarios.
… remains the most challenging class, with 51.9% of samples misclassified as eav. …
Original abstract

Machine learning (ML) models trained to detect physical-layer threats on one optical fiber system often fail catastrophically when applied to a different system, due to variations in operating wavelength, fiber properties, and network architecture. To overcome this, we propose a Domain Adaptation (DA) framework based on a Variational Autoencoder (VAE) that learns a shared representation capturing event signatures common to both systems while suppressing system-specific differences. The shared encoder is first trained on the combined data from two distinct optical systems: a 21 km O-band dark-fiber testbed (System 1) and a 63.4 km C-band live metro ring (System 2). The encoder is then frozen, and a classifier is trained using labels from an individual system. The proposed approach achieves 95.3% and 73.5% cross-system accuracy when moving from System 1 to System 2 and vice versa, respectively. This corresponds to gains of 83.4% and 51% over a fully supervised Deep Neural Network (DNN) baseline trained on a single system, while preserving intra-system performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a domain adaptation framework using a variational autoencoder (VAE) trained on pooled unlabeled data from two optical fiber systems (21 km O-band dark-fiber and 63.4 km C-band live metro ring) to learn shared representations of event signatures for SOP monitoring. The encoder is frozen after training, a classifier is trained on labels from one system, and cross-system generalization is claimed with accuracies of 95.3% (System 1 to 2) and 73.5% (System 2 to 1), representing gains of 83.4% and 51% over a single-system supervised DNN baseline while preserving intra-system performance.

Significance. If the cross-system gains are shown to arise from genuinely invariant features rather than residual domain cues, the work could enable more practical deployment of ML-based physical-layer monitoring across heterogeneous optical networks without requiring per-system labeled data collection. The empirical numbers suggest a potentially useful engineering approach, though the absence of mechanistic validation limits the strength of the contribution.

major comments (2)
  1. [Proposed VAE Domain Adaptation Framework] The VAE training procedure (described in the abstract and method) uses a standard reconstruction-plus-KL objective on combined unlabeled data from both systems but includes no explicit term (e.g., adversarial domain classifier, MMD, or mutual-information penalty) to suppress system-specific information in the latent codes. Consequently the central claim that the frozen encoder produces representations from which a one-system classifier generalizes to the other rests on an unverified assumption rather than a demonstrated property of the objective.
  2. [Experimental Results] The reported cross-system accuracies (95.3% and 73.5%) and the stated gains over the DNN baseline are presented without dataset cardinalities, train/test split ratios, number of independent runs, or any statistical significance tests or error bars. This information is required to determine whether the numerical improvements are robust or could be explained by favorable data partitioning or overfitting to residual domain statistics.
minor comments (1)
  1. [Abstract] The abstract states concrete accuracy figures and percentage gains but provides no accompanying context on data volume or evaluation protocol; moving these details into the abstract or adding a short experimental-setup paragraph would improve readability.
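Major comment 1 names MMD as one possible explicit invariance term. As a rough illustration of what such a term measures, the sketch below computes a biased estimate of the squared maximum mean discrepancy between two sets of latent codes using an RBF kernel. The kernel choice, the `gamma` value, and the synthetic latents are assumptions for illustration; the paper itself, according to the comment, uses no such term.

```python
import math
import random

def rbf(a, b, gamma):
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def mmd2(xs, ys, gamma=0.5):
    """Biased estimate of squared MMD between samples xs and ys."""
    def mean_kernel(p, q):
        return sum(rbf(a, b, gamma) for a in p for b in q) / (len(p) * len(q))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

rng = random.Random(0)

def sample_latents(shift, n=50):
    """Synthetic 2-D latent codes; `shift` mimics a system-specific offset."""
    return [(rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)) for _ in range(n)]

z_sys1 = sample_latents(0.0)
z_sys2_aligned = sample_latents(0.0)
z_sys2_shifted = sample_latents(3.0)

aligned = mmd2(z_sys1, z_sys2_aligned)
shifted = mmd2(z_sys1, z_sys2_shifted)
print(f"aligned latents: {aligned:.3f}, shifted latents: {shifted:.3f}")
```

Used as a regularizer, a term like this would be added to the VAE loss on each batch so that latent codes from the two systems are pulled toward the same distribution; used as a diagnostic, it gives a single number to report alongside the t-SNE plots.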

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the major comments point by point below and will revise the manuscript accordingly to improve clarity and rigor.

Point-by-point responses
  1. Referee: [Proposed VAE Domain Adaptation Framework] The VAE training procedure (described in the abstract and method) uses a standard reconstruction-plus-KL objective on combined unlabeled data from both systems but includes no explicit term (e.g., adversarial domain classifier, MMD, or mutual-information penalty) to suppress system-specific information in the latent codes. Consequently the central claim that the frozen encoder produces representations from which a one-system classifier generalizes to the other rests on an unverified assumption rather than a demonstrated property of the objective.

    Authors: We agree that the training procedure relies on a standard VAE objective without an explicit domain-invariance penalty. The central hypothesis is that pooling unlabeled data from both systems and optimizing for reconstruction across domains encourages the encoder to prioritize shared event signatures over system-specific cues. However, this property is not explicitly verified in the current manuscript. In the revision we will add (i) t-SNE visualizations of the latent codes colored by system, (ii) the accuracy of a domain classifier trained on the frozen encoder outputs, and (iii) a brief comparison to an MMD-regularized variant. We will also expand the discussion to acknowledge that the invariance is an emergent rather than enforced property. revision: yes

  2. Referee: [Experimental Results] The reported cross-system accuracies (95.3% and 73.5%) and the stated gains over the DNN baseline are presented without dataset cardinalities, train/test split ratios, number of independent runs, or any statistical significance tests or error bars. This information is required to determine whether the numerical improvements are robust or could be explained by favorable data partitioning or overfitting to residual domain statistics.

    Authors: We acknowledge that these experimental details were omitted. The revised manuscript will report: the total number of samples per system and per event class (both labeled and unlabeled), the exact train/test split ratios used for each experiment, results averaged over five independent runs with different random seeds, standard deviations, and p-values from appropriate statistical tests (e.g., paired t-test or Wilcoxon signed-rank) comparing the proposed method against the single-system DNN baseline. These additions will allow readers to assess the robustness of the reported cross-system gains. revision: yes
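The reporting promised in the second response (mean and standard deviation over five seeded runs, plus a paired test) can be sketched with the standard library alone. The accuracies below are made-up placeholders, not results from the paper, and the helper names are hypothetical. The Wilcoxon routine enumerates all sign patterns, so it is exact but assumes no zero or tied absolute differences.

```python
import statistics
from itertools import product

def summarize(accs):
    """Mean and sample standard deviation over independent runs."""
    return statistics.fmean(accs), statistics.stdev(accs)

def paired_t_statistic(a, b):
    """t statistic of the paired differences (n - 1 degrees of freedom)."""
    d = [x - y for x, y in zip(a, b)]
    return statistics.fmean(d) / (statistics.stdev(d) / len(d) ** 0.5)

def wilcoxon_exact_p(a, b):
    """Two-sided exact Wilcoxon signed-rank p-value (no zeros or ties assumed)."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = {i: r for r, i in enumerate(order, start=1)}
    total = n * (n + 1) // 2
    w_plus = sum(ranks[i] for i in range(n) if d[i] > 0)
    observed = min(w_plus, total - w_plus)
    extreme = 0
    for signs in product((0, 1), repeat=n):  # every possible sign assignment
        s = sum(r for r, bit in zip(range(1, n + 1), signs) if bit)
        extreme += min(s, total - s) <= observed
    return extreme / 2 ** n

# Placeholder accuracies for five seeds (illustrative only, NOT from the paper).
method_runs = [0.81, 0.79, 0.84, 0.80, 0.83]
baseline_runs = [0.70, 0.72, 0.69, 0.74, 0.71]

mean_acc, std_acc = summarize(method_runs)
t_stat = paired_t_statistic(method_runs, baseline_runs)
p_wilcoxon = wilcoxon_exact_p(method_runs, baseline_runs)
print(f"{mean_acc:.3f} ± {std_acc:.3f}, t = {t_stat:.2f}, Wilcoxon p = {p_wilcoxon}")
```

One consequence worth noting: with five paired runs, the smallest two-sided exact Wilcoxon p-value is 2/32 = 0.0625, reached when every run favors the same method, so that test cannot fall below 0.05 at this sample size. Turning the t statistic into a p-value needs the Student t distribution with four degrees of freedom, which a statistics package such as SciPy provides.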

Circularity Check

0 steps flagged

No circularity: empirical accuracies from VAE+classifier training on pooled data

full rationale

The paper presents a VAE trained on combined unlabeled data from two optical systems, followed by freezing the encoder and training a classifier on labels from one system, with reported test accuracies (95.3% and 73.5%) and gains over a DNN baseline. No equations, derivations, or self-citations are invoked that reduce these results to fitted parameters by construction or to a self-referential loop. The central claims rest on experimental outcomes rather than any mathematical reduction of the form 'prediction equals input fit'. The approach is self-contained against external benchmarks via direct cross-system testing.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that a VAE can separate shared event features from system-specific variations when trained on combined data from two heterogeneous optical systems.

axioms (1)
  • domain assumption: the VAE latent space can capture common event signatures across systems while suppressing system-specific differences
    Invoked in the description of the shared encoder training on combined data from System 1 and System 2.

pith-pipeline@v0.9.0 · 5534 in / 1200 out tokens · 30607 ms · 2026-05-10T04:37:44.296018+00:00 · methodology

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages

  [1] D. Rafique, T. Szyrkowiec, H. Grießer, A. Autenrieth, and J.-P. Elbers, “Cognitive assurance architecture for optical network fault management,” Journal of Lightwave Technology 36, 1443–1450 (2018).

  [2] G. Allwood, G. Wild, and S. Hinckley, “Optical fiber sensors in physical intrusion detection systems: A review,” IEEE Sensors Journal 16, 5497–5509 (2016).

  [3] S. Pellegrini, L. Minelli, L. Andrenacci, G. Rizzelli, D. Pilori, G. Bosco, L. Della Chiesa, C. Crognale, S. Piciaccia, and R. Gaudino, “Overview on the state of polarization sensing: Application scenarios and anomaly detection algorithms,” Journal of Optical Communications and Networking 17, A196–A209 (2025).

  [4] P. Lu, N. Lalam, M. Badar et al., “Distributed optical fiber sensing: Review and perspective,” Applied Physics Reviews 6 (2019).

  [5] C. J. Carver and X. Zhou, “Polarization sensing of network health and seismic activity over a live terrestrial fiber-optic cable,” Communications Engineering 3, 91 (2024).

  [6] L. Sadighi, S. Karlsson, C. Natalino, and M. Furdek, “Machine learning-based polarization signature analysis for detection and categorization of eavesdropping and harmful events,” in Optical Fiber Communications Conference and Exhibition (OFC), (2024), p. M1H.1.

  [7] L. Sadighi, C. Natalino, S. Karlsson, M. Ruffini, E. Kenny, L. Wosinska, and M. Furdek, “Generalizability of ML-based classification of state of polarization signatures across different bands and links,” in 2025 European Conference on Optical Communications (ECOC), (2025), pp. 1–4.

  [8] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on Knowledge and Data Engineering 22, 1345–1359 (2010).

  [9] D. P. Kingma and M. Welling, “Auto-encoding variational Bayes,” in 2nd International Conference on Learning Representations (ICLR), (Banff, AB, Canada, 2014).

  [10] A. Rode, M. Farsi, V. Lauinger, M. Karlsson, E. Agrell, L. Schmalen, and C. Häger, “Machine learning opportunities for integrated polarization sensing and communication in optical fibers,” Optical Fiber Technology 89, 103924 (2025).

  [11] L. Sadighi, S. Karlsson, L. Wosinska, and M. Furdek, “Machine learning analysis of polarization signatures for distinguishing harmful from non-harmful fiber events,” in International Conference on Transparent Optical Networks (ICTON), (2024).

  [12] L. Sadighi, S. Karlsson, C. Natalino, L. Wosinska, M. Ruffini, and M. Furdek, “Detection and classification of eavesdropping and mechanical vibrations in fiber optical networks by analyzing polarization signatures over a noisy environment,” in European Conference on Optical Communication (ECOC), (2024), pp. 527–530.

  [13] L. Sadighi, S. Karlsson, C. Natalino, L. Wosinska, M. Ruffini, and M. Furdek, “Deep learning for detection of harmful events in real-world, noisy optical fiber deployments,” Journal of Lightwave Technology 43, 6092–6101 (2025).

  [14] A. Tomasov, P. Dejdar, P. Munster, T. Horvath, P. Barcik, and F. Da Ros, “Enhancing fiber security using a simple state of polarization analyzer and machine learning,” Optics & Laser Technology 167, 109668 (2023).

  [15] K. Abdelli, M. Lonardi, J. Gripp, D. Correa, S. Olsson, F. Boitier, and P. Layec, “Vision transformers for anomaly classification and localization in optical networks using SOP spectrograms,” Journal of Lightwave Technology (2025).

  [16] S. Karlsson, M. Andersson, R. Lin, L. Wosinska, and P. Monti, “Detection of abnormal activities on a SM or MM fiber,” in Optical Fiber Communication Conference (OFC), (Optica Publishing Group, San Diego, CA, USA, 2023), p. M3Z.6.

  [17] W. Qin, Q. Zhang, W. Hou, X. Zhang, and X. Gong, “Convolutional neural networks for fiber-bending eavesdropping attacks detection in coherent optical communication systems,” in International Conference on Ubiquitous Communication (Ucom), (Xi’an, China, 2024).

  [18] L. Sadighi, S. Karlsson, C. Natalino, and M. Furdek, “ML-based state of polarization analysis to detect emerging threats to optical fiber security,” IEEE Transactions on Network and Service Management 23, 432–442 (2026).

  [19] H. Song, R. Lin, L. Wosinska, P. Monti, M. Zhang, Y. Liang, Y. Li, and J. Zhang, “Cluster-based unsupervised method for eavesdropping detection and localization in WDM systems,” Journal of Optical Communications and Networking 16, F52–F61 (2024).

  [20] L. Minelli, S. Pellegrini, L. Andrenacci, D. Pilori, G. Bosco, L. Della Chiesa, A. Tanzi, C. Crognale, and R. Gaudino, “SOP-based DSP blind anomaly detection for sensing on deployed metropolitan fibers,” in European Conference on Optical Communications (ECOC), (Glasgow, UK, 2023), pp. 519–522.

  [21] C. Rottondi, R. di Marino, M. Nava, A. Giusti, and A. Bianco, “On the benefits of domain adaptation techniques for quality of transmission estimation in optical networks,” Journal of Optical Communications and Networking 13, A34–A43 (2021).

  [22] F. Musumeci, V. Garbhapu Venkata, Y. Hirota, Y. Awaji, S. Xu, M. Shiraiwa, B. Mukherjee, and M. Tornatore, “Domain adaptation and transfer learning for failure detection and failure-cause identification in optical networks across different lightpaths [invited],” Journal of Optical Communications and Networking 14, A91–A100 (2022).

  [23] W. Mo, Y.-K. Huang, S. Zhang, E. Ip, D. C. Kilper, Y. Aono, and T. Tajima, “ANN-based transfer learning for QoT prediction in real-time mixed line-rate systems,” in 2018 Optical Fiber Communications Conference and Exposition (OFC), (2018), pp. 1–3.

  [24] J. Yu, W. Mo, Y.-K. Huang, E. Ip, and D. C. Kilper, “Model transfer of QoT prediction in optical networks based on artificial neural networks,” Journal of Optical Communications and Networking 11, C48–C57 (2019).

  [25] Z. Cai, Q. Wang, Y. Deng, P. Zhang, G. Zhou, Y. Li, and F. N. Khan, “Domain adversarial adaptation framework for few-shot QoT estimation in optical networks,” Journal of Optical Communications and Networking 16, 1133–1144 (2024).

  [26] K. Abdelli, M. Lonardi, J. Gripp, S. Olsson, F. Boitier, and P. Layec, “Risky event classification leveraging transfer learning for very limited datasets in optical networks,” Journal of Optical Communications and Networking 16, C51–C68 (2024).

  [27] M. Ilse, J. M. Tomczak, C. Louizos, and M. Welling, “DIVA: Domain invariant variational autoencoders,” in Proceedings of the Third Conference on Medical Imaging with Deep Learning, vol. 121 of Proceedings of Machine Learning Research (PMLR, 2020), pp. 322–348.

  [28] C. Louizos, K. Swersky, Y. Li, M. Welling, and R. S. Zemel, “The variational fair autoencoder,” in 4th International Conference on Learning Representations (ICLR), (2016).

  [29] W.-N. Hsu, Y. Zhang, and J. Glass, “Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation,” in IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), (IEEE, 2017), pp. 16–23.

  [30] CONNECT Centre for Future Networks and Communications, “Open Ireland Testbed,” (2020). Available here

  [31] ASIERA (formerly HEAnet), “Ireland’s National Education and Research Network,” (2025). Available here

  [32] S. Karlsson, “A method for detecting external influence on an optical cable,” (1994). Filed Mar. 9, 1989; granted Oct. 19, 1994.

  [33] P. Podder, T. Z. Khan, M. H. Khan, and M. M. Rahman, “Comparative performance analysis of Hamming, Hanning and Blackman window,” International Journal of Computer Applications 96, 1–7 (2014).

  [34] L. Van der Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research 9 (2008).

  [35] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in Proceedings of the 32nd International Conference on Machine Learning (ICML), (PMLR, 2015), pp. 448–456.

  [36] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” Journal of Machine Learning Research (JMLR) 15, 1929–1958 (2014).

  [37] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object detection,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), (IEEE, 2017), pp. 2980–2988.

  [38] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A next-generation hyperparameter optimization framework,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), (ACM, 2019), pp. 2623–2631.

  [39] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga et al., “PyTorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 32 (Curran Associates, Inc., 2019).