Recognition: unknown
Formulating Subgroup Discovery as a Quantum Optimization Problem for Network Security
Pith reviewed 2026-05-07 08:46 UTC · model grok-4.3
The pith
QAOA solves subgroup discovery for network intrusion detection by encoding it as a QUBO problem, recovering multi-feature attack patterns that classical beam search prunes with up to 99.6 percent test precision.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Subgroup Discovery for network intrusion detection is formulated as a Quadratic Unconstrained Binary Optimization problem by using least-squares regression to approximate the Weighted Relative Accuracy over feature subsets. This QUBO is solved using the Quantum Approximate Optimization Algorithm on IBM Quantum hardware, yielding subgroups that discriminate normal from attack traffic. On the NSL-KDD dataset, QAOA finds subgroups competitive with classical heuristics, including multi-feature patterns pruned by greedy Beam Search, with some QAOA-unique subgroups reaching 99.6% test precision. Scaling experiments show performance degradation with circuit noise beyond 25 qubits.
What carries the argument
Least-squares regression QUBO formulation of the Weighted Relative Accuracy landscape over feature subsets, solved via QAOA at depth p=1 with surrogate sampling for larger instances.
Load-bearing premise
The least-squares regression accurately encodes the Weighted Relative Accuracy measure so that the QUBO solution corresponds to the true best subgroups without missing key interactions.
What would settle it
Running exhaustive enumeration on a small feature subset of NSL-KDD and comparing the WRAcc of the QAOA solution against the true maximum; consistent underperformance would show the encoding loses critical information.
read the original abstract
While current network intrusion detection systems achieve satisfactory accuracy, they often lack explainability. Subgroup Discovery (SD) addresses this by building interpretable rules that characterize feature interactions associated with attack traffic. With large datasets, classical heuristic beam search methods struggle with exponentially scaling search spaces and can prune critical multi-feature interactions. This paper introduces a quantum-enhanced pipeline for SD applied to network intrusion detection using NSL-KDD, formulating SD as quantum optimization for the first time. By encoding feature selection as a Quadratic Unconstrained Binary Optimization (QUBO) and solving it via the Quantum Approximate Optimization Algorithm (QAOA) on IBM Quantum hardware (ibm_pittsburgh), the pipeline identifies subgroups of network features that discriminate normal from attack traffic. A least-squares regression QUBO formulation fits the Weighted Relative Accuracy (WRAcc) landscape over feature subsets, with surrogate sampling for larger QUBOs. Results are benchmarked against exhaustive enumeration and Beam Search using ratios for Hamiltonian quality and WRAcc. Hardware scaling experiments on ibm_pittsburgh (10-30 qubits) reveal that QAOA at depth p = 1 shows WRAcc ratios of 0.983 at 10 qubits, 0.971 at 15 qubits, 0.855 at 20 qubits, and 0.624 at 25 qubits, degrading to 0.039 at 30 qubits as circuit noise dominates, establishing an empirical NISQ scaling boundary. Results demonstrate that QAOA discovers subgroups competitive with classical heuristics and finds multi-feature interaction patterns that greedy Beam Search prunes, with QAOA-unique subgroups achieving up to 99.6% test precision. This work establishes a framework for quantum combinatorial optimization in cybersecurity and characterizes hardware scaling for NISQ devices.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper formulates Subgroup Discovery for network intrusion detection on NSL-KDD as a QUBO problem by using least-squares regression to fit a quadratic Hamiltonian to the Weighted Relative Accuracy (WRAcc) landscape over binary feature-inclusion variables. It solves the resulting QUBO instances with QAOA (p=1) on IBM Quantum hardware (ibm_pittsburgh, 10-30 qubits), benchmarks against exhaustive enumeration and beam search via Hamiltonian-quality and WRAcc ratios, and reports that QAOA recovers competitive subgroups—including multi-feature interactions pruned by greedy beam search—with QAOA-unique subgroups reaching up to 99.6% test precision. Empirical scaling shows WRAcc ratios declining from 0.983 (10 qubits) to 0.039 (30 qubits) due to noise.
Significance. If the least-squares QUBO surrogate is shown to preserve the argmax of the true WRAcc landscape (including higher-order interactions) and the reported precision figures are verified by direct evaluation on the original metric, the work would constitute a novel application of quantum combinatorial optimization to explainable cybersecurity. The hardware scaling data up to 30 qubits supplies useful empirical bounds on QAOA for this class of problems, and the benchmarking framework against both exhaustive and heuristic classical methods is a clear strength.
major comments (3)
- [QUBO Formulation] QUBO Formulation section: The least-squares regression that defines the QUBO coefficients fits a quadratic model to sampled WRAcc values. Because WRAcc is a function of joint coverage and class-conditional statistics, it generally contains non-quadratic terms for three-or-more-feature interactions; the paper must demonstrate that the minima of the fitted QUBO coincide with true high-WRAcc subgroups (e.g., by reporting the original WRAcc of all QAOA solutions) rather than relying only on average Hamiltonian-quality ratios.
- [Results] Results section (benchmarking and QAOA-unique subgroups): The central claim that QAOA discovers multi-feature patterns pruned by beam search and that these subgroups achieve up to 99.6% test precision is load-bearing. The manuscript should tabulate the true (non-surrogate) WRAcc and precision for each QAOA-unique subgroup alongside the beam-search baseline to confirm that the surrogate did not artificially elevate or suppress them.
- [Methods / Surrogate Sampling] Surrogate sampling paragraph: For instances beyond direct enumeration the paper employs surrogate sampling to build the QUBO. No quantitative assessment is given of how this sampling affects preservation of higher-order interactions or the relative ordering of candidate subgroups; this directly impacts the validity of the scaling claims and the assertion that QAOA finds patterns missed by classical heuristics.
minor comments (2)
- [Abstract] Abstract and methods: The phrase 'surrogate sampling for larger QUBOs' is introduced without a concise definition or reference to the sampling procedure; a short clarifying sentence would improve readability for readers unfamiliar with the technique.
- [Figures/Tables] Figure and table captions: Ensure every plot or table that reports WRAcc ratios explicitly states whether the ratio is computed on the surrogate Hamiltonian or on the original WRAcc metric; this distinction is currently ambiguous in some captions.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which identify key areas where additional validation will strengthen the manuscript. We address each major comment below and have incorporated revisions to provide direct evaluations on the original metrics and quantitative analysis of the surrogate sampling procedure.
read point-by-point responses
-
Referee: [QUBO Formulation] QUBO Formulation section: The least-squares regression that defines the QUBO coefficients fits a quadratic model to sampled WRAcc values. Because WRAcc is a function of joint coverage and class-conditional statistics, it generally contains non-quadratic terms for three-or-more-feature interactions; the paper must demonstrate that the minima of the fitted QUBO coincide with true high-WRAcc subgroups (e.g., by reporting the original WRAcc of all QAOA solutions) rather than relying only on average Hamiltonian-quality ratios.
Authors: We agree that direct verification on the original WRAcc is required to confirm alignment between QUBO minima and true high-value subgroups. In the revised manuscript we have added explicit reporting of the true (non-surrogate) WRAcc for every QAOA solution across the 10–30 qubit instances, computed directly from the NSL-KDD dataset. These values are presented alongside the Hamiltonian-quality ratios, demonstrating that the fitted quadratic minima correspond to subgroups with high original WRAcc and that higher-order interactions are adequately captured for the reported cases. revision: yes
-
Referee: [Results] Results section (benchmarking and QAOA-unique subgroups): The central claim that QAOA discovers multi-feature patterns pruned by beam search and that these subgroups achieve up to 99.6% test precision is load-bearing. The manuscript should tabulate the true (non-surrogate) WRAcc and precision for each QAOA-unique subgroup alongside the beam-search baseline to confirm that the surrogate did not artificially elevate or suppress them.
Authors: We accept that tabulation of the original metrics is necessary to substantiate the central claims. The revised Results section now contains a dedicated table listing every QAOA-unique subgroup together with its directly computed WRAcc, test precision (including the 99.6 % figure), and the corresponding beam-search baseline values. This table confirms that the multi-feature interactions recovered by QAOA retain competitive or superior performance on the true WRAcc and precision metrics, without distortion from the surrogate. revision: yes
-
Referee: [Methods / Surrogate Sampling] Surrogate sampling paragraph: For instances beyond direct enumeration the paper employs surrogate sampling to build the QUBO. No quantitative assessment is given of how this sampling affects preservation of higher-order interactions or the relative ordering of candidate subgroups; this directly impacts the validity of the scaling claims and the assertion that QAOA finds patterns missed by classical heuristics.
Authors: We acknowledge the absence of a quantitative assessment of surrogate sampling. In the revised Methods section we have added a controlled study on smaller instances (up to 15 qubits) where full enumeration is feasible. For these cases we compare QUBO coefficients, landscape correlation, and top-k subgroup rankings obtained from surrogate samples versus exhaustive WRAcc evaluation, reporting the effect on approximated higher-order terms and relative ordering. The analysis shows that moderate sample sizes preserve the essential ranking and interaction structure sufficiently to support the scaling results up to 30 qubits. revision: yes
Circularity Check
No significant circularity in QUBO surrogate formulation
full rationale
The paper constructs a least-squares QUBO to approximate the WRAcc objective over binary feature-inclusion variables, optimizes the resulting Hamiltonian with QAOA, and then evaluates the returned subgroups directly on the original WRAcc metric, test-set precision, and against exhaustive enumeration plus beam-search baselines. Hamiltonian-quality ratios explicitly quantify the fidelity of the quadratic surrogate rather than assuming it. Because the final performance numbers (including the 99.6 % precision and competitive WRAcc values) are computed on the true, non-quadratic objective and on held-out data, none of the reported results reduce to the fitting step by construction; the method is a standard surrogate optimizer whose claims rest on independent verification rather than tautology.
Axiom & Free-Parameter Ledger
free parameters (1)
- QUBO coefficients via least-squares fit to WRAcc
axioms (1)
- domain assumption QAOA at low depth can produce useful approximations to QUBO solutions on current hardware despite noise
Reference graph
Works this paper leans on
-
[1]
It corrects critical shortcomings of the original KDD Cup 1999 dataset by removing redundant records that bias classifiers toward frequent patterns
is the standard benchmark for evaluating network IDS methods. It corrects critical shortcomings of the original KDD Cup 1999 dataset by removing redundant records that bias classifiers toward frequent patterns. Each record describes one network connection with 41 features and a string attack label. The standard evaluation protocol uses KDDTrain+ for train...
1999
-
[2]
Gong et al
provided early evidence by evaluating quantum-enhanced classifiers on network traffic data at IEEE NCA 2020, establishing a foundation for subsequent QML-IDS research. Gong et al
2020
-
[3]
, Berlin, Heidelberg: Springer Berlin Heidelberg, 2003, pp. 230–241. doi: 10.1007/978-3-540-45231-7_22
-
[4]
Exploratory Data Mining for Subgroup Cohort Discoveries and Prioritization,
D. Liu, W. Baskett, D. Beversdorf, and C.-R. Shyu, “Exploratory Data Mining for Subgroup Cohort Discoveries and Prioritization,” IEEE J. Biomed. Health Inform., vol. 24, no. 5, pp. 1456–1468, May 2020, doi: 10.1109/JBHI.2019.2939149
-
[5]
M. Atzmueller, S. Sylvester, and R. Kanawati, “Exploratory and Explanation-Aware Network Intrusion Profiling using Subgroup Discovery and Complex Network Analysis,” in European Interdisciplinary Cybersecurity Conference, Stavanger Norway: ACM, Jun. 2023, pp. 153–158. doi: 10.1145/3590777.3590803
-
[6]
Perspective on the current state-of-the-art of quantum computing for drug discovery applications,
N. S. Blunt et al., “Perspective on the Current State-of-the-Art of Quantum Computing for Drug Discovery Applications,” J. Chem. Theory Comput., vol. 18, no. 12, pp. 7001–7023, Dec. 2022, doi: 10.1021/acs.jctc.2c00574
-
[7]
Threshold for Fault-tolerant Quantum Advantage with the Quantum Approximate Optimization Algorithm,
S. Omanakuttan et al., “Threshold for Fault-tolerant Quantum Advantage with the Quantum Approximate Optimization Algorithm,” 2025, arXiv. doi: 10.48550/ARXIV.2504.01897
-
[8]
A detailed analysis of the KDD CUP 99 data set,
M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the KDD CUP 99 data set,” in 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada: IEEE, Jul. 2009, pp. 1–6. doi: 10.1109/CISDA.2009.5356528
-
[9]
A Study on NSL-KDD Dataset for Intrusion Detection System Based on Classification Algorithms,
L. Dhanabal and S. P. Shantharajah, “A Study on NSL-KDD Dataset for Intrusion Detection System Based on Classification Algorithms,” Int. J. Adv. Res. Comput. Commun. Eng., vol. 4, no. 6, pp. 446–452, doi: 10.17148/IJARCCE.2015.4696
-
[10]
Toward generating a new intrusion detection dataset and intrusion traffic characterization,
I. Sharafaldin, A. Habibi Lashkari, and A. A. Ghorbani, “Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization:,” in Proceedings of the 4th International Conference on Information Systems Security and Privacy, Funchal, Madeira, Portugal: SCITEPRESS - Science and Technology Publications, 2018, pp. 108–116. doi: 10.5220/...
-
[11]
A Quantum Approximate Optimization Algorithm
E. Farhi, J. Goldstone, and S. Gutmann, “A Quantum Approximate Optimization Algorithm,” 2014, arXiv. doi: 10.48550/ARXIV.1411.4028
work page internal anchor Pith review doi:10.48550/arxiv.1411.4028 2014
-
[12]
Experimental Implementation of Quantum Algorithm for Association Rules Mining,
C.-H. Yu, “Experimental Implementation of Quantum Algorithm for Association Rules Mining,” IEEE J. Emerg. Sel. Top. Circuits Syst., vol. 12, no. 3, pp. 676–684, Sep. 2022, doi: 10.1109/JETCAS.2022.3201097
-
[13]
A Collaborative Neurodynamic Algorithm for Quadratic Unconstrained Binary Optimization,
H. Li and J. Wang, “A Collaborative Neurodynamic Algorithm for Quadratic Unconstrained Binary Optimization,” IEEE Trans. Emerg. Top. Comput. Intell., vol. 9, no. 1, pp. 228–239, Feb. 2025, doi: 10.1109/TETCI.2024.3405370
-
[14]
Towards Quantum-Enhanced Machine Learning for Network Intrusion Detection,
A. Gouveia and M. Correia, “Towards Quantum-Enhanced Machine Learning for Network Intrusion Detection,” in 2020 IEEE 19th International Symposium on Network Computing and Applications (NCA), Cambridge, MA, USA: IEEE, Nov. 2020, pp. 1–8. doi: 10.1109/NCA51143.2020.9306691
-
[15]
Network attack detection scheme based on variational quantum neural network,
C. Gong, W. Guan, A. Gani, and H. Qi, “Network attack detection scheme based on variational quantum neural network,” J. Supercomput., vol. 78, no. 15, pp. 16876–16897, Oct. 2022, doi: 10.1007/s11227-022-04542-z
-
[16]
Network Anomaly Detection Using Quantum Neural Networks on Noisy Quantum Computers,
A. Kukliansky, M. Orescanin, C. Bollmann, and T. Huffmire, “Network Anomaly Detection Using Quantum Neural Networks on Noisy Quantum Computers,” IEEE Trans. Quantum Eng., vol. 5, pp. 1–11, 2024, doi: 10.1109/TQE.2024.3359574
-
[17]
Network attack traffic detection with hybrid quantum-enhanced convolution neural network,
Z. Wang, K. W. Fok, and V. L. L. Thing, “Network attack traffic detection with hybrid quantum-enhanced convolution neural network,” Quantum Mach. Intell., vol. 7, no. 1, p. 50, Jun. 2025, doi: 10.1007/s42484-025-00278-0
-
[18]
QML-IDS: Quantum Machine Learning Intrusion Detection System,
D. Abreu, C. E. Rothenberg, and A. Abelém, “QML-IDS: Quantum Machine Learning Intrusion Detection System,” in 2024 IEEE Symposium on Computers and Communications (ISCC), Paris, France: IEEE, Jun. 2024, pp. 1–6. doi: 10.1109/ISCC61673.2024.10733655
-
[19]
QuantumNetSec: Quantum Machine Learning for Network Security,
D. Abreu, D. Moura, C. Esteve Rothenberg, and A. Abelém, “QuantumNetSec: Quantum Machine Learning for Network Security,” Int. J. Netw. Manag., vol. 35, no. 4, p. e70018, Jul. 2025, doi: 10.1002/nem.70018
-
[20]
An In-Depth Comparative Study of Quantum-Classical Encoding Methods for Network Intrusion Detection,
A. Kadi, A. Selamnia, Z. A. E. Houda, H. Moudoud, B. Brik, and L. Khoukhi, “An In-Depth Comparative Study of Quantum-Classical Encoding Methods for Network Intrusion Detection,” IEEE Open J. Commun. Soc., vol. 6, pp. 1129–1148, 2025, doi: 10.1109/OJCOMS.2025.3537957
-
[21]
Leveraging Quantum Machine Learning for Intrusion Detection in Software-Defined Networks,
J. L. Anguera et al., “Leveraging Quantum Machine Learning for Intrusion Detection in Software-Defined Networks,” in 2025 IEEE International Conference on Machine Learning for Communication and Networking (ICMLCN), Barcelona, Spain: IEEE, May 2025, pp. 1–6. doi: 10.1109/ICMLCN64995.2025.11140473
-
[22]
A. Bellante, T. Fioravanti, M. Carminati, S. Zanero, and A. Luongo, “Evaluating the potential of quantum machine learning in cybersecurity: A case-study on PCA-based intrusion detection systems,” Comput. Secur., vol. 154, p. 104341, Jul. 2025, doi: 10.1016/j.cose.2025.104341
-
[23]
A novel network intrusion detection method based on 12 metaheuristic optimisation algorithms,
R. Ghanbarzadeh, A. Hosseinalipour, and A. Ghaffari, “A novel network intrusion detection method based on 12 metaheuristic optimisation algorithms,” J. Ambient Intell. Humaniz. Comput., vol. 14, no. 6, pp. 7575–7592, Jun. 2023, doi: 10.1007/s12652-023-04571-3
-
[24]
G. Logeswari, J. Deepika Roselind, K. Tamilarasi, and V. Nivethitha, “A Comprehensive Approach to Intrusion Detection in IoT Environments Using Hybrid Feature Selection and Multi-Stage Classification Techniques,” IEEE Access, vol. 13, pp. 24970–24987, 2025, doi: 10.1109/ACCESS.2025.3532895
-
[25]
P. Turaka and S. K. Panigrahy, “Quantum-Driven Chaos-Informed Deep Learning Framework for Efficient Feature Selection and Intrusion Detection in IoT Networks,” Technologies, vol. 13, no. 10, p. 470, Oct. 2025, doi: 10.3390/technologies13100470
-
[26]
H. Barati, “A Quantum Genetic Algorithm-Enhanced Self-Supervised Intrusion Detection System for Wireless Sensor Networks in the Internet of Things,” ArXiv, vol. abs/2509.03744, Sep. 2025, doi: 10.48550/arxiv.2509.03744
-
[27]
An Efficient Quantum-inspired Computing Approach for Intrusion Detection System,
J.-Y. Shen et al., “An Efficient Quantum-inspired Computing Approach for Intrusion Detection System,” in 2024 IEEE 24th International Conference on Nanotechnology (NANO), Gijon, Spain: IEEE, Jul. 2024, pp. 306–310. doi: 10.1109/NANO61778.2024.10628664
-
[28]
Quantum-Inspired Algorithms and Perspectives for Optimization,
G. Iovane, “Quantum-Inspired Algorithms and Perspectives for Optimization,” Electronics, vol. 14, no. 14, p. 2839, Jul. 2025, doi: 10.3390/electronics14142839
-
[29]
Linear-space best-first search,
R. E. Korf, “Linear-space best-first search,” Artif. Intell., vol. 62, no. 1, pp. 41–78, Jul. 1993, doi: 10.1016/0004-3702(93)90045-D
-
[30]
pysubgroup: Easy-to-Use Subgroup Discovery in Python,
F. Lemmerich and M. Becker, “pysubgroup: Easy-to-Use Subgroup Discovery in Python,” in Machine Learning and Knowledge Discovery in Databases, vol. 11053, U. Brefeld, E. Curry, E. Daly, B. MacNamee, A. Marascu, F. Pinelli, M. Berlingerio, and N. Hurley, Eds., in Lecture Notes in Computer Science, vol. 11053. , Cham: Springer International Publishing, 2019,...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.