Variational Autoencoder Domain Adaptation for Cross-System Generalization in ML-Based SOP Monitoring
Pith reviewed 2026-05-10 04:37 UTC · model grok-4.3
The pith
A variational autoencoder learns shared representations that let threat detectors trained on one optical system work on another.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Training a VAE on combined unlabeled data from a 21 km O-band dark-fiber testbed and a 63.4 km C-band live metro ring produces an encoder that captures event signatures common to both while suppressing system-specific differences; freezing this encoder and training a classifier on one system's labels then enables effective cross-system generalization in SOP monitoring.
What carries the argument
The encoder of a variational autoencoder, trained on pooled unlabeled data from both systems to extract domain-invariant features for classification.
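For orientation, the standard reconstruction-plus-KL objective that the referee report below attributes to the VAE can be written in its textbook form. This is not an equation quoted from the paper: the symbols $x$ (an input feature vector), $z$ (the latent code), $\phi$ and $\theta$ (encoder and decoder parameters), and the standard normal prior are assumptions of this sketch.

```latex
\mathcal{L}(\theta, \phi; x)
  = -\,\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  + D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big),
\qquad p(z) = \mathcal{N}(0, I)
```

With pooled training, this loss is averaged over samples from both systems. No term in it refers to the system a sample came from, so any invariance of the latent code is a side effect of the pooling rather than something the objective enforces.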
If this is right
- Cross-system accuracy reaches 95.3 percent when adapting from the testbed to the metro ring and 73.5 percent in the opposite direction.
- These accuracies correspond to gains of 83.4 percent and 51 percent over a fully supervised DNN baseline trained on a single system.
- Intra-system performance is preserved, so adaptation does not cost accuracy on the system that supplied the labels.
- The framework applies to both a controlled testbed and a live operational network.
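The quoted gains can be read in two ways: as percentage points or as relative improvements. The text here does not say which is meant, so the short sketch below works out the baseline accuracy implied by each reading. The function name `implied_baselines` and the dictionary layout are illustrative choices, not anything taken from the paper.

```python
# Implied baseline accuracy under two readings of "gain".
# The accuracies and gains are the figures quoted in the list above;
# which reading the paper intends is NOT stated, so both are shown.

results = {
    "System 1 -> System 2": {"accuracy": 95.3, "gain": 83.4},
    "System 2 -> System 1": {"accuracy": 73.5, "gain": 51.0},
}


def implied_baselines(accuracy, gain):
    """Return (baseline if gain is in percentage points,
    baseline if gain is a relative improvement in percent)."""
    absolute = accuracy - gain
    relative = accuracy / (1.0 + gain / 100.0)
    return absolute, relative


for direction, r in results.items():
    absolute, relative = implied_baselines(r["accuracy"], r["gain"])
    print(f"{direction}: baseline {absolute:.1f}% (points) "
          f"or {relative:.1f}% (relative)")
```

Under the percentage-point reading the baseline sits far below the adapted accuracy in both directions, while under the relative reading it sits near 50 percent. Stating the baseline accuracies directly would remove the ambiguity.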
Where Pith is reading between the lines
- The need for labeled data collection can be limited to one system if unlabeled data is available from others.
- Similar adaptation techniques could help in other monitoring tasks where sensor data shifts due to hardware or environmental differences.
- Extending the unlabeled training pool to additional systems might improve robustness to further variations.
- Testing the method on systems with more extreme differences would reveal the limits of the shared representation.
Load-bearing premise
The VAE can learn a latent representation from mixed unlabeled data that isolates common physical event patterns from system-specific variations enough to support cross-system classification.
What would settle it
A test on a third optical system where the VAE-adapted classifier performs no better than the unadapted baseline would show that the domain adaptation does not reliably capture transferable signatures.
Original abstract
Machine learning (ML) models trained to detect physical-layer threats on one optical fiber system often fail catastrophically when applied to a different system, due to variations in operating wavelength, fiber properties, and network architecture. To overcome this, we propose a Domain Adaptation (DA) framework based on a Variational Autoencoder (VAE) that learns a shared representation capturing event signatures common to both systems while suppressing system-specific differences. The shared encoder is first trained on the combined data from two distinct optical systems: a 21 km O-band dark-fiber testbed (System 1) and a 63.4 km C-band live metro ring (System 2). The encoder is then frozen, and a classifier is trained using labels from an individual system. The proposed approach achieves 95.3% and 73.5% cross-system accuracy when moving from System 1 to System 2 and vice versa, respectively. This corresponds to gains of 83.4% and 51% over a fully supervised Deep Neural Network (DNN) baseline trained on a single system, while preserving intra-system performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a domain adaptation framework using a variational autoencoder (VAE) trained on pooled unlabeled data from two optical fiber systems (21 km O-band dark-fiber and 63.4 km C-band live metro ring) to learn shared representations of event signatures for SOP monitoring. The encoder is frozen after training, a classifier is trained on labels from one system, and cross-system generalization is claimed with accuracies of 95.3% (System 1 to 2) and 73.5% (System 2 to 1), representing gains of 83.4% and 51% over a single-system supervised DNN baseline while preserving intra-system performance.
Significance. If the cross-system gains are shown to arise from genuinely invariant features rather than residual domain cues, the work could enable more practical deployment of ML-based physical-layer monitoring across heterogeneous optical networks without requiring per-system labeled data collection. The empirical numbers suggest a potentially useful engineering approach, though the absence of mechanistic validation limits the strength of the contribution.
major comments (2)
- [Proposed VAE Domain Adaptation Framework] The VAE training procedure (described in the abstract and method) uses a standard reconstruction-plus-KL objective on combined unlabeled data from both systems but includes no explicit term (e.g., adversarial domain classifier, MMD, or mutual-information penalty) to suppress system-specific information in the latent codes. Consequently the central claim that the frozen encoder produces representations from which a one-system classifier generalizes to the other rests on an unverified assumption rather than a demonstrated property of the objective.
- [Experimental Results] The reported cross-system accuracies (95.3% and 73.5%) and the stated gains over the DNN baseline are presented without dataset cardinalities, train/test split ratios, number of independent runs, or any statistical significance tests or error bars. This information is required to determine whether the numerical improvements are robust or could be explained by favorable data partitioning or overfitting to residual domain statistics.
minor comments (1)
- [Abstract] The abstract states concrete accuracy figures and percentage gains but provides no accompanying context on data volume or evaluation protocol; moving these details into the abstract or adding a short experimental-setup paragraph would improve readability.
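As an illustration of the MMD option named in the first major comment, the sketch below estimates the squared maximum mean discrepancy between two sets of latent codes with a Gaussian kernel. It is a diagnostic under stated assumptions, not the paper's method: the function names, the kernel bandwidth, and the synthetic Gaussian codes standing in for encoder outputs are all hypothetical.

```python
import math
import random


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples of latent codes."""
    def mean_kernel(us, vs):
        total = sum(rbf_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample_codes(n, dim, shift, rng):
    """Hypothetical stand-in for encoder outputs: Gaussian codes with a mean shift."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
system1 = sample_codes(100, 4, 0.0, rng)
system2_aligned = sample_codes(100, 4, 0.0, rng)   # same distribution as system1
system2_shifted = sample_codes(100, 4, 2.0, rng)   # system-specific offset remains

print("aligned:", mmd_squared(system1, system2_aligned))
print("shifted:", mmd_squared(system1, system2_shifted))
```

A value near zero suggests that the two systems' codes are hard to tell apart, while a clearly larger value suggests that system-specific information survives in the latent space. Used as a training penalty rather than a diagnostic, the same quantity would be added to the VAE loss with a weighting factor.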
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the major comments point by point below and will revise the manuscript accordingly to improve clarity and rigor.
Point-by-point responses
- Referee: [Proposed VAE Domain Adaptation Framework] The VAE training procedure (described in the abstract and method) uses a standard reconstruction-plus-KL objective on combined unlabeled data from both systems but includes no explicit term (e.g., adversarial domain classifier, MMD, or mutual-information penalty) to suppress system-specific information in the latent codes. Consequently the central claim that the frozen encoder produces representations from which a one-system classifier generalizes to the other rests on an unverified assumption rather than a demonstrated property of the objective.
Authors: We agree that the training procedure relies on a standard VAE objective without an explicit domain-invariance penalty. The central hypothesis is that pooling unlabeled data from both systems and optimizing for reconstruction across domains encourages the encoder to prioritize shared event signatures over system-specific cues. However, this property is not explicitly verified in the current manuscript. In the revision we will add (i) t-SNE visualizations of the latent codes colored by system, (ii) the accuracy of a domain classifier trained on the frozen encoder outputs, and (iii) a brief comparison to an MMD-regularized variant. We will also expand the discussion to acknowledge that the invariance is an emergent rather than enforced property. revision: yes
- Referee: [Experimental Results] The reported cross-system accuracies (95.3% and 73.5%) and the stated gains over the DNN baseline are presented without dataset cardinalities, train/test split ratios, number of independent runs, or any statistical significance tests or error bars. This information is required to determine whether the numerical improvements are robust or could be explained by favorable data partitioning or overfitting to residual domain statistics.
Authors: We acknowledge that these experimental details were omitted. The revised manuscript will report: the total number of samples per system and per event class (both labeled and unlabeled), the exact train/test split ratios used for each experiment, results averaged over five independent runs with different random seeds, standard deviations, and p-values from appropriate statistical tests (e.g., paired t-test or Wilcoxon signed-rank) comparing the proposed method against the single-system DNN baseline. These additions will allow readers to assess the robustness of the reported cross-system gains. revision: yes
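The reporting promised in the second response (mean, standard deviation, and a paired test over five seeds) could look roughly like the sketch below. The accuracy values are placeholders chosen only to make the code run; they are not results from the paper, and the function names are hypothetical.

```python
import math
import statistics

# PLACEHOLDER accuracies (fractions) for five random seeds.
# These are NOT results from the paper; they only exercise the code.
proposed = [0.90, 0.92, 0.91, 0.93, 0.89]
baseline = [0.50, 0.55, 0.52, 0.51, 0.53]


def summarize(values):
    """Mean and sample standard deviation across runs."""
    return statistics.mean(values), statistics.stdev(values)


def paired_t_statistic(a, b):
    """t statistic and degrees of freedom for a paired t-test
    on per-seed differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


mean_p, std_p = summarize(proposed)
mean_b, std_b = summarize(baseline)
t_stat, dof = paired_t_statistic(proposed, baseline)

print(f"proposed: {mean_p:.3f} +/- {std_p:.3f}")
print(f"baseline: {mean_b:.3f} +/- {std_b:.3f}")
print(f"paired t = {t_stat:.2f} with {dof} degrees of freedom")
```

Turning the t statistic into a p-value needs the Student t distribution, which the Python standard library does not provide; `scipy.stats.ttest_rel` is one common way to obtain it. With only five runs, the Wilcoxon signed-rank alternative mentioned in the response has very limited resolution, so reporting the per-seed values alongside the test would help readers judge the result.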
Circularity Check
No circularity: empirical accuracies from VAE+classifier training on pooled data
The paper presents a VAE trained on combined unlabeled data from two optical systems, followed by freezing the encoder and training a classifier on labels from one system, with reported test accuracies (95.3% and 73.5%) and gains over a DNN baseline. No equations, derivations, or self-citations are invoked that reduce these results to fitted parameters by construction or to a self-referential loop. The central claims rest on experimental outcomes rather than any mathematical reduction of the form 'prediction equals input fit'. The approach is self-contained against external benchmarks via direct cross-system testing.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the VAE latent space can capture event signatures common to both systems while suppressing system-specific differences.