Clustering-Enhanced Domain Adaptation for Cross-Domain Intrusion Detection in Industrial Control Systems
Pith reviewed 2026-05-10 15:50 UTC · model grok-4.3
The pith
A clustering-enhanced domain adaptation method reportedly raises unknown-attack detection accuracy in industrial control systems by up to 49% over five baseline models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The clustering-enhanced domain adaptation framework projects source and target ICS traffic domains into a shared latent subspace through spectral-transform-based feature alignment, iteratively reducing the distribution discrepancy between them. A K-Medoids clustering strategy combined with PCA-based dimensionality reduction then improves cross-domain correlation estimation and reduces the degradation caused by manual parameter tuning. Together, the paper claims, these components lead to significantly better detection of unknown attacks.
What carries the argument
Clustering-enhanced domain adaptation framework, consisting of spectral-transform feature alignment for shared subspace projection and K-Medoids-PCA clustering for correlation enhancement.
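To make the clustering side of this machinery concrete, here is a minimal sketch of PCA followed by K-Medoids and a medoid-level pairing across domains. It is an illustration under stated assumptions, not the paper's implementation: the function names, the joint PCA fit on stacked data, the simple alternating K-Medoids update (rather than full PAM), and the nearest-medoid pairing rule are all hypothetical choices, and the data is random toy data.

```python
import numpy as np

def pca_fit_transform(X, n_components):
    """Center X and project it onto its top principal directions (via SVD)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, mean, components

def k_medoids(Z, k, n_iter=50, seed=0):
    """Simple alternating K-Medoids (not full PAM): assign, then re-pick medoids."""
    rng = np.random.default_rng(seed)
    # Full pairwise Euclidean distance matrix; fine for small toy data only.
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    medoids = rng.choice(len(Z), size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member with the smallest total distance to its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = D[:, medoids].argmin(axis=1)
    return medoids, labels

def medoid_correspondence(Zs, Zt, ms, mt):
    """Pair each source medoid with its nearest target medoid (hypothetical rule)."""
    dist = np.linalg.norm(Zs[ms][:, None, :] - Zt[mt][None, :, :], axis=-1)
    return [(p, int(q)) for p, q in enumerate(dist.argmin(axis=1))]

rng = np.random.default_rng(0)
Xs = rng.normal(size=(60, 8))            # toy "source" features
Xt = rng.normal(loc=0.5, size=(50, 8))   # toy "target" features with a shifted mean
# Assumption: PCA is fitted jointly so both domains share one coordinate system.
Z, _, _ = pca_fit_transform(np.vstack([Xs, Xt]), n_components=3)
Zs, Zt = Z[:60], Z[60:]
ms, labels_s = k_medoids(Zs, k=4)
mt, labels_t = k_medoids(Zt, k=3)
omega = medoid_correspondence(Zs, Zt, ms, mt)
print(omega)
```

The two values passed as `n_components` and `k` are exactly the free parameters the ledger further down flags, which is why their selection matters for the reported gains.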
If this is right
- Unknown attack detection accuracy rises by up to 49% compared with five baseline models.
- F-score gains are larger and stability is stronger across tasks.
- The clustering enhancement strategy alone increases detection accuracy by up to 26% on representative tasks.
- The approach reduces the impact of data scarcity and domain shift in dynamic industrial environments.
Where Pith is reading between the lines
- If the shared-structure premise holds more broadly, the method could lower the cost of labeling data for each new ICS deployment.
- The reduced need for manual tuning via clustering and PCA might simplify integration into real-time ICS monitoring pipelines.
- The stability observed suggests the framework could be tested on other network security domains that experience traffic shifts, such as IoT or enterprise networks.
Load-bearing premise
Source and target ICS traffic domains share enough structural similarity that spectral alignment produces a useful common space and K-Medoids with PCA reliably improves correlation estimates without introducing bias or harming performance on the tested distributions.
What would settle it
Applying the method to new ICS traffic datasets with substantially different distributions or attack profiles and observing no accuracy gains over the baselines or no additional boost from the clustering step would falsify the central claim.
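One crude way to check whether a new dataset really has a "substantially different distribution" is to measure the discrepancy between source and target features before running any adaptation. The sketch below uses the biased estimate of squared maximum mean discrepancy with a linear kernel, which reduces to the squared distance between the two feature means. The function name and the toy numbers are hypothetical, and the paper may use a different discrepancy measure.

```python
def linear_mmd2(source, target):
    """Biased squared MMD with a linear kernel: squared distance between feature means."""
    dim = len(source[0])
    mean_s = [sum(row[j] for row in source) / len(source) for j in range(dim)]
    mean_t = [sum(row[j] for row in target) / len(target) for j in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))

src = [[0.0, 1.0], [2.0, 3.0]]    # feature mean (1.0, 2.0)
near = [[1.0, 2.0], [1.0, 2.0]]   # feature mean (1.0, 2.0)
far = [[4.0, 6.0], [4.0, 6.0]]    # feature mean (4.0, 6.0)
print(linear_mmd2(src, near))  # 0.0
print(linear_mmd2(src, far))   # 25.0
```

A linear kernel only sees mean shift, so a value near zero does not prove the domains match; it is a first diagnostic, not a substitute for the falsification experiment described above.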
Original abstract
Industrial control systems operate in dynamic environments where traffic distributions vary across scenarios, labeled samples are limited, and unknown attacks frequently emerge, posing significant challenges to cross-domain intrusion detection. To address this issue, this paper proposes a clustering-enhanced domain adaptation method for industrial control traffic. The framework contains two key components. First, a feature-based transfer learning module projects source and target domains into a shared latent subspace through spectral-transform-based feature alignment and iteratively reduces distribution discrepancies, enabling accurate cross-domain detection. Second, a clustering enhancement strategy combines K-Medoids clustering with PCA-based dimensionality reduction to improve cross-domain correlation estimation and reduce performance degradation caused by manual parameter tuning. Experimental results show that the proposed method significantly improves unknown attack detection. Compared with five baseline models, it increases detection accuracy by up to 49%, achieves larger gains in F-score, and demonstrates stronger stability. Moreover, the clustering enhancement strategy further boosts detection accuracy by up to 26% on representative tasks. These results suggest that the proposed method effectively alleviates data scarcity and domain shift, providing a practical solution for robust cross-domain intrusion detection in dynamic industrial environments.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a clustering-enhanced domain adaptation method for cross-domain intrusion detection in industrial control systems. The framework includes a feature-based transfer learning module that uses spectral-transform-based feature alignment to project source and target domains into a shared latent subspace while iteratively reducing distribution discrepancies, plus a clustering enhancement strategy that applies K-Medoids clustering combined with PCA-based dimensionality reduction to improve cross-domain correlation estimates. The authors report that the method significantly improves unknown attack detection: up to 49% higher detection accuracy than five baseline models, larger gains in F-score, stronger stability, and an additional accuracy boost of up to 26% from the clustering strategy on representative tasks.
Significance. If the empirical performance gains can be substantiated with proper statistical validation, dataset details, and controls, the work could offer a practical approach to mitigating data scarcity and domain shift in ICS intrusion detection, addressing a relevant challenge in securing dynamic industrial environments where unknown attacks emerge frequently.
major comments (3)
- [Abstract and Experimental Results] Abstract and Experimental Results section: The central claims of up to 49% accuracy improvement over baselines and a 26% boost from the clustering enhancement are presented as single-point estimates without dataset descriptions, baseline implementation details, error bars, number of random seeds, statistical significance tests, or ablation controls, preventing evaluation of whether the deltas exceed run-to-run variation or arise from post-hoc selection on the specific ICS traffic splits.
- [Method] Method section: The spectral-transform-based feature alignment and iterative discrepancy reduction are described at a high level without explicit equations defining the shared latent subspace construction or the alignment objective, making it difficult to assess the method's grounding or whether the reported gains reduce to quantities defined by the fitted parameters (e.g., number of clusters or PCA dimensionality).
- [Experimental Results] Experimental Results section: The stability and correlation improvement claims for K-Medoids+PCA rest on the assumption that source and target ICS domains share sufficient structure for the alignment to produce useful subspaces without introducing selection bias, yet no evidence (such as sensitivity analysis or cross-validation across partitions) is provided to support this for the tested traffic distributions.
minor comments (1)
- [Abstract] The abstract refers to 'five baseline models' without naming them; these should be explicitly listed with references or implementation details in the main text for reproducibility.
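The first major comment asks whether the reported deltas exceed run-to-run variation. A minimal sketch of what such a check could look like follows: per-seed accuracies are summarized as mean and standard deviation, and an exact paired sign-flip permutation test compares the two methods. The accuracy values are invented for illustration and are not taken from the paper; the permutation test is one possible choice, named here in place of the t-test or Wilcoxon test a revision might actually use.

```python
from itertools import product
from statistics import mean, stdev

def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip permutation test on paired differences."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(mean(diffs))
    count = 0
    total = 0
    # Under the null, each paired difference is equally likely to have either sign.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(mean(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float round-off
            count += 1
    return count / total

# Hypothetical per-seed accuracies (five seeds), NOT results from the paper.
proposed = [0.91, 0.89, 0.92, 0.90, 0.93]
baseline = [0.84, 0.86, 0.83, 0.85, 0.82]

print(f"proposed: {mean(proposed):.3f} +/- {stdev(proposed):.3f}")
print(f"baseline: {mean(baseline):.3f} +/- {stdev(baseline):.3f}")
p_value = paired_sign_flip_pvalue(proposed, baseline)
print(p_value)  # 0.0625
```

Even though the proposed method wins on every seed here, the p-value is 2/32 = 0.0625, the smallest value this exact two-sided test can return with five pairs. The same floor applies to an exact Wilcoxon signed-rank test on five pairs, so more than five seeds would be needed to reach a 0.05 threshold with either test.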
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which highlight important areas for strengthening the manuscript. We address each major comment below and commit to revisions that will incorporate the requested details, equations, and analyses.
- Referee: [Abstract and Experimental Results] Abstract and Experimental Results section: The central claims of up to 49% accuracy improvement over baselines and a 26% boost from the clustering enhancement are presented as single-point estimates without dataset descriptions, baseline implementation details, error bars, number of random seeds, statistical significance tests, or ablation controls, preventing evaluation of whether the deltas exceed run-to-run variation or arise from post-hoc selection on the specific ICS traffic splits.
  Authors: We agree that single-point estimates limit the interpretability of the reported gains. In the revised manuscript, we will expand the Experimental Results section with: full dataset descriptions (including ICS traffic characteristics, sizes, and how source/target splits were formed); implementation details and hyperparameters for all five baselines; performance as means and standard deviations over at least five random seeds; statistical significance tests (e.g., paired t-tests or Wilcoxon tests) against baselines; and ablation studies that isolate the clustering component. These additions will demonstrate that the up-to-49% accuracy and up-to-26% clustering boosts exceed run-to-run variation and are not artifacts of particular splits.
  Revision: yes
- Referee: [Method] Method section: The spectral-transform-based feature alignment and iterative discrepancy reduction are described at a high level without explicit equations defining the shared latent subspace construction or the alignment objective, making it difficult to assess the method's grounding or whether the reported gains reduce to quantities defined by the fitted parameters (e.g., number of clusters or PCA dimensionality).
  Authors: The referee correctly identifies that the current description is insufficiently formal. We will revise the Method section to include explicit mathematical formulations: the spectral transform operator, the projection into the shared latent subspace, the alignment objective (including the discrepancy measure and iterative update rule), and the precise integration of K-Medoids clustering with PCA dimensionality reduction. We will also state how the number of clusters and PCA dimensions are selected and their role in the objective, enabling readers to evaluate the method's grounding and reproducibility.
  Revision: yes
- Referee: [Experimental Results] Experimental Results section: The stability and correlation improvement claims for K-Medoids+PCA rest on the assumption that source and target ICS domains share sufficient structure for the alignment to produce useful subspaces without introducing selection bias, yet no evidence (such as sensitivity analysis or cross-validation across partitions) is provided to support this for the tested traffic distributions.
  Authors: We acknowledge that additional empirical validation is required to support the assumptions underlying the clustering enhancement. In the revision, we will add a sensitivity analysis that varies the number of K-Medoids clusters and the retained PCA dimensionality, reporting detection accuracy and stability metrics across these choices. We will also include cross-validation results over multiple random partitions of the source and target traffic data to show that performance gains remain consistent and that no selection bias is introduced by the particular splits used in the original experiments.
  Revision: yes
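The sensitivity analysis promised in the last response could be organized roughly as below: sweep the cluster count and the retained PCA dimensionality, repeat each setting over several seeds or partitions, and report the spread. This is only a harness sketch. `toy_evaluate` is a hypothetical stand-in with made-up behavior and would be replaced by a real train-and-evaluate run; the grid values are arbitrary.

```python
from statistics import mean, pstdev

def sensitivity_table(evaluate, cluster_counts, pca_dims, seeds):
    """Return {(k, d): (mean, std)} of evaluate(k, d, seed) over the given seeds."""
    table = {}
    for k in cluster_counts:
        for d in pca_dims:
            scores = [evaluate(k, d, seed) for seed in seeds]
            table[(k, d)] = (mean(scores), pstdev(scores))
    return table

def toy_evaluate(k, d, seed):
    """Hypothetical stand-in for a real experiment; returns a fake accuracy."""
    return 0.80 + 0.01 * k - 0.005 * d + 0.001 * (seed % 3)

table = sensitivity_table(
    toy_evaluate, cluster_counts=[2, 4, 8], pca_dims=[3, 5], seeds=range(5)
)
means = [m for m, _ in table.values()]
# Spread of mean accuracy across the grid: a large value signals that the
# headline number depends heavily on how k and d were chosen.
spread = max(means) - min(means)
for (k, d), (m, s) in sorted(table.items()):
    print(f"k={k} d={d}: {m:.3f} +/- {s:.3f}")
print(f"spread across grid: {spread:.3f}")
```

Reporting the full grid rather than the best cell is the design choice that addresses the referee's concern about post-hoc selection.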
Circularity Check
No circularity: empirical method with external validation
full rationale
The paper presents an algorithmic framework for cross-domain intrusion detection consisting of spectral-transform feature alignment followed by K-Medoids+PCA clustering. These steps are described procedurally and evaluated via direct comparison against five baselines on held-out ICS traffic tasks, with reported accuracy and F-score deltas. No equations, closed-form derivations, or parameter-fitting procedures are shown that would make the claimed performance gains (49% accuracy, 26% clustering boost) equivalent to the method's own inputs by construction. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The result is therefore a standard empirical proposal whose validity rests on external test-set measurements rather than internal definitional equivalence.
Axiom & Free-Parameter Ledger
free parameters (2)
- Number of clusters for K-Medoids
- Dimensionality after PCA
axioms (2)
- domain assumption: Source and target ICS traffic distributions can be aligned in a shared latent subspace via spectral-transform feature mapping
- domain assumption: K-Medoids clustering after PCA yields more reliable cross-domain correlation estimates than manual parameter tuning alone
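The page does not reproduce the paper's alignment objective, which is exactly what the referee asks for. As a point of reference only, the objective of transfer component analysis (Pan et al., which appears in the reference list) shows what a spectral, eigenvector-based alignment of this kind can look like. Everything in the block below is an assumption for illustration: the symbols $K$, $L$, $H$, $W$, $\mu$, $n_s$, $n_t$ are not taken from the paper, and the paper's own formulation may differ.

```latex
% Assumed notation (not from the paper): K is a kernel matrix over all
% n_s + n_t samples, W is the projection to be learned, \mu > 0 is a regularizer.
L_{ij} =
\begin{cases}
  1/n_s^2       & \text{if } x_i, x_j \text{ are both source samples} \\
  1/n_t^2       & \text{if } x_i, x_j \text{ are both target samples} \\
  -1/(n_s n_t)  & \text{otherwise}
\end{cases}
\qquad
H = I - \tfrac{1}{n_s + n_t}\,\mathbf{1}\mathbf{1}^\top

\min_{W}\; \operatorname{tr}\!\left(W^\top K L K W\right)
  + \mu\,\operatorname{tr}\!\left(W^\top W\right)
\quad \text{s.t.} \quad W^\top K H K W = I
```

The first trace term is the empirical squared mean discrepancy between the projected domains, and the constraint preserves variance. The solution is given by the leading eigenvectors of $(K L K + \mu I)^{-1} K H K$, which is the sense in which such an alignment is "spectral". Under this reading, the first domain assumption amounts to requiring that a small number of such eigenvectors capture structure shared by both traffic domains.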
Reference graph
Works this paper leans on
- [1] Preprocess source and target data
- [2] Apply PCA to obtain $Z_s$ and $Z_t$
- [3] Perform K-Medoids clustering on both domains
- [4] Compute cluster-level similarity and build correspondence set $\Omega$
- [5] Learn latent projection $A$ by minimizing the adaptation objective
- [6] Train classifier $g$ on adapted source features
- [7] Predict target labels $\hat{Y}_t = g(A^\top Z_t)$
  ... integrated into a unified cross-domain intrusion detection pipeline. After PCA transformation, the source and target samples are partitioned into $K_s$ and $K_t$ clusters, respectively: $\mathcal{C}_s = \{C^s_1, C^s_2, \ldots, C^s_{K_s}\}$, $\mathcal{C}_t = \{C^t_1, C^t_2, \ldots, C^t_{K_t}\}$. Each cluster is represented by its medoid [9]: $m^s_p \in C^s_p$, $m^t_q \in C$ ...
- [8] Shai Ben-David, John Blitzer, Koby Crammer, and Fernando Pereira. A theory of learning from different domains. Machine Learning, 79(1–2):151–175, 2010
- [9] Rihao Chang, Hongbo Jiao, Weizhi Nie, Huijie Guo, Kai Xie, Zihan Wu, Lin Zhao, Yutong Bai, Yongtao Ma, Lijuan Wang, et al. Organ-agents: Virtual human physiology simulator via llms. arXiv preprint arXiv:2508.14357, 2025
- [10] Rihao Chang, Yongtao Ma, Tong Hao, Weijie Wang, and Weizhi Nie. 3d shape knowledge graph for cross-domain 3d shape retrieval. CAAI Transactions on Intelligence Technology, 9(5):1199–1216, 2024
- [11] Y. Chen, S. Su, D. Yu, H. He, X. Wang, Y. Ma, and H. Guo. Cross-domain industrial intrusion detection deep model trained with imbalanced data. IEEE Internet of Things Journal, 10:584–596, 2023
- [12] Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(59):1–35, 2016
- [13] Xuesong Gao, Chuanqi Jiao, Ruidong Chen, Weijie Wang, and Weizhi Nie. Point-pc: Point cloud completion guided by prior knowledge via causal inference. CAAI Transactions on Intelligence Technology, 2025
- [14] M. R. Gauthama Raman, Chuadhry Mujeeb Ahmed, and Aditya Mathur. Machine learning for intrusion detection in industrial control systems: Challenges and lessons from experimental evaluation. Cybersecurity, 4(1):27, 2021
- [15] Ian T. Jolliffe. Principal Component Analysis. Springer, New York, 2nd edition, 2002
- [16] Leonard Kaufman and Peter J. Rousseeuw. Partitioning around medoids (program pam). In Finding Groups in Data: An Introduction to Cluster Analysis, pages 68–125. John Wiley & Sons, New York, 1990
- [17] Hamza Kheddar, Yassine Himeur, and Ali Ismail Awad. Deep transfer learning for intrusion detection in industrial control networks: A comprehensive review. Journal of Network and Computer Applications, 220:103760, 2023
- [18] Moshe Kravchik and Asaf Shabtai. Detecting cyberattacks in industrial control systems using convolutional neural networks. arXiv preprint arXiv:1806.08110, 2018
- [19] Chenxi Li, Weijie Wang, Qiang Li, Bruno Lepri, Nicu Sebe, and Weizhi Nie. Freeinsert: Disentangled text-guided object insertion in 3d gaussian scene without spatial priors, 2025
- [20] Qi Liang, Ning Xu, Weijie Wang, and Xingjian Long. Multimodal information fusion based on lstm for 3d model retrieval. Multimedia Tools and Applications, 79(45–46):33943–33956, 2020
- [21] Mingrui Ma, Weijie Wang, Jie Ning, Jianfeng He, Nicu Sebe, and Bruno Lepri. Large language models for multimodal deformable image registration. arXiv preprint arXiv:2408.10703, 2024
- [22] Guofeng Mei, Hao Tang, Xiaoshui Huang, Weijie Wang, Juan Liu, Jian Zhang, Luc Van Gool, and Qiang Wu. Unsupervised deep probabilistic approach for partial point cloud registration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
- [23] Haroon Mushtaq, S. U. Khan, M. A. Jan, A. Ullah, and H. A. Khattak. A parallel architecture for the partitioning around medoids (pam) algorithm. Sensors, 18(12):4129, 2018
- [24] Weizhi Nie, Ruidong Chen, Weijie Wang, Bruno Lepri, and Nicu Sebe. T2td: Text-3d generation model based on prior knowledge guidance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(1):172–189, 2025
- [25] Weizhi Nie, Weijie Wang, Anan Liu, and Chuang Chen. Characteristic views extraction modal based-on deep reinforcement learning for 3d model retrieval. In 2019 IEEE International Conference on Image Processing (ICIP), pages 2389–2393, 2019
- [26] Weizhi Nie, Weijie Wang, Anan Liu, Jie Nie, and Yuxuan Su. Hgan: Holistic generative adversarial networks for two-dimensional image-based three-dimensional object retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications, 15(4):1–24, 2019
- [27] Sinno Jialin Pan, Ivor W. Tsang, James T. Kwok, and Qiang Yang. Domain adaptation via transfer component analysis. In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI), pages 1187–1192, 2009
- [28] Karl Pearson. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 2(11):559–572, 1901
- [29] Bin Ren, Guofeng Mei, Danda Pani Paudel, Weijie Wang, Yawei Li, Mengyuan Liu, Rita Cucchiara, Luc Van Gool, and Nicu Sebe. Bringing masked autoencoders explicit contrastive properties for point cloud self-supervised learning. In Proceedings of the Asian Conference on Computer Vision (ACCV), 2024
- [30]
- [31] Muhammad Azmi Umer, Khurum Nazir Junejo, Muhammad Taha Jilani, and Aditya P. Mathur. Machine learning for intrusion detection in industrial control systems: Applications, challenges, and recommendations. International Journal of Critical Infrastructure Protection, 38:100516, 2022
- [32] Tao Wang, Weijie Wang, Fausto Giunchiglia, Fengzhi Zhao, Ye Zhang, Duo Yu, and Guixia Liu. Mbt-polyp: A new multi-branch memory-augmented transformer for polyp segmentation. Image and Vision Computing, 163:105747, 2025
- [33] Tao Wang, Kai Zhang, Weijie Wang, Mingrui Ma, Ye Zhang, He Zhao, and Guixia Liu. U-hrmlp: Refining segmentation boundaries in histopathology images. In 2024 IEEE International Symposium on Biomedical Imaging (ISBI), pages 1–5, 2024
- [34] Wei Wang, Zhun Zhong, Weijie Wang, Xi Chen, Charles Ling, Boyu Wang, and Nicu Sebe. Dynamically instance-guided adaptation: A backward-free approach for test-time domain adaptive semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 24090–24099, 2023
- [35] Weijie Wang, Guofeng Mei, Jian Zhang, Nicu Sebe, Bruno Lepri, and Fabio Poiesi. Fully-geometric cross-attention for point cloud registration. In 3DV. IEEE, 2025
- [36] Weijie Wang, Wenqi Ren, Guofeng Mei, Bin Ren, Xiaoshui Huang, Fabio Poiesi, Nicu Sebe, and Bruno Lepri. Zeroreg: Zero-shot point cloud registration with foundation models. arXiv preprint arXiv:2312.03032, 2023
- [37] Weijie Wang, Jichao Zhang, Chang Liu, Xia Li, Xingqian Xu, Humphrey Shi, Nicu Sebe, and Bruno Lepri. Uvmap-id: A controllable and personalized uv map generative model. In ACM MM, pages 10725–10734, 2024
- [38] Weijie Wang, Zhengyu Zhao, Nicu Sebe, and Bruno Lepri. Turn fake into real: Adversarial head turn attacks against deepfake detection. arXiv preprint arXiv:2309.01104, 2023
- [39] Xuquan Wang, Feng Zhang, Kai Zhang, Weijie Wang, Xiong Dun, and Jiande Sun. Learning spatial-spectral dual adaptive graph embedding for multispectral and hyperspectral image fusion. Pattern Recognition, 151:110365, 2024
- [40] Zibo Xu, Qiang Li, Weizhi Nie, Weijie Wang, and Anan Liu. Structure causal models and llms integration in medical visual question answering. IEEE Transactions on Medical Imaging, 44(8):3476–3489, 2025
- [41] Juan Zhao, Sachin Shetty, Jan Wei Pan, Charles Kamhoua, and Kevin Kwiat. Transfer learning for detecting unknown network attacks. EURASIP Journal on Information Security, 2019(1):1, 2019