pith. machine review for the scientific record. sign in

arxiv: 2604.27153 · v1 · submitted 2026-04-29 · 🪐 quant-ph · cs.CR

Recognition: unknown

Formulating Subgroup Discovery as a Quantum Optimization Problem for Network Security

Authors on Pith no claims yet

Pith reviewed 2026-05-07 08:46 UTC · model grok-4.3

classification 🪐 quant-ph cs.CR
keywords subgroup discoveryQAOAQUBOnetwork intrusion detectionNSL-KDDquantum optimizationNISQ scalingWeighted Relative Accuracy
0
0 comments X

The pith

QAOA solves subgroup discovery for network intrusion detection by encoding it as a QUBO problem, recovering multi-feature attack patterns that classical beam search prunes with up to 99.6 percent test precision.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper turns the task of finding easy-to-understand rules that describe network attacks into a form that a quantum computer can solve. Current classical search methods often cut short their search and miss combinations of several features that together mark attack traffic. The authors encode the quality of each possible rule set as a quadratic optimization problem that the Quantum Approximate Optimization Algorithm can tackle on existing quantum hardware. Experiments on standard intrusion data show the quantum method matches classical results and sometimes finds better rules that the classical greedy search discards, with some rules correctly identifying attacks 99.6 percent of the time on new data. The work also shows how performance drops as the quantum circuit grows larger due to hardware noise.

Core claim

Subgroup Discovery for network intrusion detection is formulated as a Quadratic Unconstrained Binary Optimization problem by using least-squares regression to approximate the Weighted Relative Accuracy over feature subsets. This QUBO is solved using the Quantum Approximate Optimization Algorithm on IBM Quantum hardware, yielding subgroups that discriminate normal from attack traffic. On the NSL-KDD dataset, QAOA finds subgroups competitive with classical heuristics, including multi-feature patterns pruned by greedy Beam Search, with some QAOA-unique subgroups reaching 99.6% test precision. Scaling experiments show performance degradation with circuit noise beyond 25 qubits.

What carries the argument

Least-squares regression QUBO formulation of the Weighted Relative Accuracy landscape over feature subsets, solved via QAOA at depth p=1 with surrogate sampling for larger instances.

Load-bearing premise

The least-squares regression accurately encodes the Weighted Relative Accuracy measure so that the QUBO solution corresponds to the true best subgroups without missing key interactions.

What would settle it

Running exhaustive enumeration on a small feature subset of NSL-KDD and comparing the WRAcc of the QAOA solution against the true maximum; consistent underperformance would show the encoding loses critical information.

read the original abstract

While current network intrusion detection systems achieve satisfactory accuracy, they often lack explainability. Subgroup Discovery (SD) addresses this by building interpretable rules that characterize feature interactions associated with attack traffic. With large datasets, classical heuristic beam search methods struggle with exponentially scaling search spaces and can prune critical multi-feature interactions. This paper introduces a quantum-enhanced pipeline for SD applied to network intrusion detection using NSL-KDD, formulating SD as quantum optimization for the first time. By encoding feature selection as a Quadratic Unconstrained Binary Optimization (QUBO) and solving it via the Quantum Approximate Optimization Algorithm (QAOA) on IBM Quantum hardware (ibm_pittsburgh), the pipeline identifies subgroups of network features that discriminate normal from attack traffic. A least-squares regression QUBO formulation fits the Weighted Relative Accuracy (WRAcc) landscape over feature subsets, with surrogate sampling for larger QUBOs. Results are benchmarked against exhaustive enumeration and Beam Search using ratios for Hamiltonian quality and WRAcc. Hardware scaling experiments on ibm_pittsburgh (10-30 qubits) reveal that QAOA at depth p = 1 shows WRAcc ratios of 0.983 at 10 qubits, 0.971 at 15 qubits, 0.855 at 20 qubits, and 0.624 at 25 qubits, degrading to 0.039 at 30 qubits as circuit noise dominates, establishing an empirical NISQ scaling boundary. Results demonstrate that QAOA discovers subgroups competitive with classical heuristics and finds multi-feature interaction patterns that greedy Beam Search prunes, with QAOA-unique subgroups achieving up to 99.6% test precision. This work establishes a framework for quantum combinatorial optimization in cybersecurity and characterizes hardware scaling for NISQ devices.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper formulates Subgroup Discovery for network intrusion detection on NSL-KDD as a QUBO problem by using least-squares regression to fit a quadratic Hamiltonian to the Weighted Relative Accuracy (WRAcc) landscape over binary feature-inclusion variables. It solves the resulting QUBO instances with QAOA (p=1) on IBM Quantum hardware (ibm_pittsburgh, 10-30 qubits), benchmarks against exhaustive enumeration and beam search via Hamiltonian-quality and WRAcc ratios, and reports that QAOA recovers competitive subgroups—including multi-feature interactions pruned by greedy beam search—with QAOA-unique subgroups reaching up to 99.6% test precision. Empirical scaling shows WRAcc ratios declining from 0.983 (10 qubits) to 0.039 (30 qubits) due to noise.

Significance. If the least-squares QUBO surrogate is shown to preserve the argmax of the true WRAcc landscape (including higher-order interactions) and the reported precision figures are verified by direct evaluation on the original metric, the work would constitute a novel application of quantum combinatorial optimization to explainable cybersecurity. The hardware scaling data up to 30 qubits supplies useful empirical bounds on QAOA for this class of problems, and the benchmarking framework against both exhaustive and heuristic classical methods is a clear strength.

major comments (3)
  1. [QUBO Formulation] QUBO Formulation section: The least-squares regression that defines the QUBO coefficients fits a quadratic model to sampled WRAcc values. Because WRAcc is a function of joint coverage and class-conditional statistics, it generally contains non-quadratic terms for three-or-more-feature interactions; the paper must demonstrate that the minima of the fitted QUBO coincide with true high-WRAcc subgroups (e.g., by reporting the original WRAcc of all QAOA solutions) rather than relying only on average Hamiltonian-quality ratios.
  2. [Results] Results section (benchmarking and QAOA-unique subgroups): The central claim that QAOA discovers multi-feature patterns pruned by beam search and that these subgroups achieve up to 99.6% test precision is load-bearing. The manuscript should tabulate the true (non-surrogate) WRAcc and precision for each QAOA-unique subgroup alongside the beam-search baseline to confirm that the surrogate did not artificially elevate or suppress them.
  3. [Methods / Surrogate Sampling] Surrogate sampling paragraph: For instances beyond direct enumeration the paper employs surrogate sampling to build the QUBO. No quantitative assessment is given of how this sampling affects preservation of higher-order interactions or the relative ordering of candidate subgroups; this directly impacts the validity of the scaling claims and the assertion that QAOA finds patterns missed by classical heuristics.
minor comments (2)
  1. [Abstract] Abstract and methods: The phrase 'surrogate sampling for larger QUBOs' is introduced without a concise definition or reference to the sampling procedure; a short clarifying sentence would improve readability for readers unfamiliar with the technique.
  2. [Figures/Tables] Figure and table captions: Ensure every plot or table that reports WRAcc ratios explicitly states whether the ratio is computed on the surrogate Hamiltonian or on the original WRAcc metric; this distinction is currently ambiguous in some captions.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which identify key areas where additional validation will strengthen the manuscript. We address each major comment below and have incorporated revisions to provide direct evaluations on the original metrics and quantitative analysis of the surrogate sampling procedure.

read point-by-point responses
  1. Referee: [QUBO Formulation] QUBO Formulation section: The least-squares regression that defines the QUBO coefficients fits a quadratic model to sampled WRAcc values. Because WRAcc is a function of joint coverage and class-conditional statistics, it generally contains non-quadratic terms for three-or-more-feature interactions; the paper must demonstrate that the minima of the fitted QUBO coincide with true high-WRAcc subgroups (e.g., by reporting the original WRAcc of all QAOA solutions) rather than relying only on average Hamiltonian-quality ratios.

    Authors: We agree that direct verification on the original WRAcc is required to confirm alignment between QUBO minima and true high-value subgroups. In the revised manuscript we have added explicit reporting of the true (non-surrogate) WRAcc for every QAOA solution across the 10–30 qubit instances, computed directly from the NSL-KDD dataset. These values are presented alongside the Hamiltonian-quality ratios, demonstrating that the fitted quadratic minima correspond to subgroups with high original WRAcc and that higher-order interactions are adequately captured for the reported cases. revision: yes

  2. Referee: [Results] Results section (benchmarking and QAOA-unique subgroups): The central claim that QAOA discovers multi-feature patterns pruned by beam search and that these subgroups achieve up to 99.6% test precision is load-bearing. The manuscript should tabulate the true (non-surrogate) WRAcc and precision for each QAOA-unique subgroup alongside the beam-search baseline to confirm that the surrogate did not artificially elevate or suppress them.

    Authors: We accept that tabulation of the original metrics is necessary to substantiate the central claims. The revised Results section now contains a dedicated table listing every QAOA-unique subgroup together with its directly computed WRAcc, test precision (including the 99.6 % figure), and the corresponding beam-search baseline values. This table confirms that the multi-feature interactions recovered by QAOA retain competitive or superior performance on the true WRAcc and precision metrics, without distortion from the surrogate. revision: yes

  3. Referee: [Methods / Surrogate Sampling] Surrogate sampling paragraph: For instances beyond direct enumeration the paper employs surrogate sampling to build the QUBO. No quantitative assessment is given of how this sampling affects preservation of higher-order interactions or the relative ordering of candidate subgroups; this directly impacts the validity of the scaling claims and the assertion that QAOA finds patterns missed by classical heuristics.

    Authors: We acknowledge the absence of a quantitative assessment of surrogate sampling. In the revised Methods section we have added a controlled study on smaller instances (up to 15 qubits) where full enumeration is feasible. For these cases we compare QUBO coefficients, landscape correlation, and top-k subgroup rankings obtained from surrogate samples versus exhaustive WRAcc evaluation, reporting the effect on approximated higher-order terms and relative ordering. The analysis shows that moderate sample sizes preserve the essential ranking and interaction structure sufficiently to support the scaling results up to 30 qubits. revision: yes

Circularity Check

0 steps flagged

No significant circularity in QUBO surrogate formulation

full rationale

The paper constructs a least-squares QUBO to approximate the WRAcc objective over binary feature-inclusion variables, optimizes the resulting Hamiltonian with QAOA, and then evaluates the returned subgroups directly on the original WRAcc metric, test-set precision, and against exhaustive enumeration plus beam-search baselines. Hamiltonian-quality ratios explicitly quantify the fidelity of the quadratic surrogate rather than assuming it. Because the final performance numbers (including the 99.6 % precision and competitive WRAcc values) are computed on the true, non-quadratic objective and on held-out data, none of the reported results reduce to the fitting step by construction; the method is a standard surrogate optimizer whose claims rest on independent verification rather than tautology.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard NISQ assumptions plus a data-driven QUBO fit; no new physical entities are introduced.

free parameters (1)
  • QUBO coefficients via least-squares fit to WRAcc
    The formulation uses regression to map the subgroup quality landscape into the QUBO objective.
axioms (1)
  • domain assumption QAOA at low depth can produce useful approximations to QUBO solutions on current hardware despite noise
    Invoked when reporting hardware results at p=1 across qubit counts.

pith-pipeline@v0.9.0 · 5614 in / 1287 out tokens · 51832 ms · 2026-05-07T08:46:11.847449+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

30 extracted references · 28 canonical work pages · 1 internal anchor

  1. [1]

    It corrects critical shortcomings of the original KDD Cup 1999 dataset by removing redundant records that bias classifiers toward frequent patterns

    is the standard benchmark for evaluating network IDS methods. It corrects critical shortcomings of the original KDD Cup 1999 dataset by removing redundant records that bias classifiers toward frequent patterns. Each record describes one network connection with 41 features and a string attack label. The standard evaluation protocol uses KDDTrain+ for train...

  2. [2]

    Gong et al

    provided early evidence by evaluating quantum-enhanced classifiers on network traffic data at IEEE NCA 2020, establishing a foundation for subsequent QML-IDS research. Gong et al

  3. [3]

    , Berlin, Heidelberg: Springer Berlin Heidelberg, 2003, pp. 230–241. doi: 10.1007/978-3-540-45231-7_22

  4. [4]

    Exploratory Data Mining for Subgroup Cohort Discoveries and Prioritization,

    D. Liu, W. Baskett, D. Beversdorf, and C.-R. Shyu, “Exploratory Data Mining for Subgroup Cohort Discoveries and Prioritization,” IEEE J. Biomed. Health Inform., vol. 24, no. 5, pp. 1456–1468, May 2020, doi: 10.1109/JBHI.2019.2939149

  5. [5]

    Exploratory and Explanation-Aware Network Intrusion Profiling using Subgroup Discovery and Complex Network Analysis,

    M. Atzmueller, S. Sylvester, and R. Kanawati, “Exploratory and Explanation-Aware Network Intrusion Profiling using Subgroup Discovery and Complex Network Analysis,” in European Interdisciplinary Cybersecurity Conference, Stavanger Norway: ACM, Jun. 2023, pp. 153–158. doi: 10.1145/3590777.3590803

  6. [6]

    Perspective on the current state-of-the-art of quantum computing for drug discovery applications,

    N. S. Blunt et al., “Perspective on the Current State-of-the-Art of Quantum Computing for Drug Discovery Applications,” J. Chem. Theory Comput., vol. 18, no. 12, pp. 7001–7023, Dec. 2022, doi: 10.1021/acs.jctc.2c00574

  7. [7]

    Threshold for Fault-tolerant Quantum Advantage with the Quantum Approximate Optimization Algorithm,

    S. Omanakuttan et al., “Threshold for Fault-tolerant Quantum Advantage with the Quantum Approximate Optimization Algorithm,” 2025, arXiv. doi: 10.48550/ARXIV.2504.01897

  8. [8]

    A detailed analysis of the KDD CUP 99 data set,

    M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the KDD CUP 99 data set,” in 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada: IEEE, Jul. 2009, pp. 1–6. doi: 10.1109/CISDA.2009.5356528

  9. [9]

    A Study on NSL-KDD Dataset for Intrusion Detection System Based on Classification Algorithms,

    L. Dhanabal and S. P. Shantharajah, “A Study on NSL-KDD Dataset for Intrusion Detection System Based on Classification Algorithms,” Int. J. Adv. Res. Comput. Commun. Eng., vol. 4, no. 6, pp. 446–452, doi: 10.17148/IJARCCE.2015.4696

  10. [10]

    Toward generating a new intrusion detection dataset and intrusion traffic characterization,

    I. Sharafaldin, A. Habibi Lashkari, and A. A. Ghorbani, “Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization:,” in Proceedings of the 4th International Conference on Information Systems Security and Privacy, Funchal, Madeira, Portugal: SCITEPRESS - Science and Technology Publications, 2018, pp. 108–116. doi: 10.5220/...

  11. [11]

    A Quantum Approximate Optimization Algorithm

    E. Farhi, J. Goldstone, and S. Gutmann, “A Quantum Approximate Optimization Algorithm,” 2014, arXiv. doi: 10.48550/ARXIV.1411.4028

  12. [12]

    Experimental Implementation of Quantum Algorithm for Association Rules Mining,

    C.-H. Yu, “Experimental Implementation of Quantum Algorithm for Association Rules Mining,” IEEE J. Emerg. Sel. Top. Circuits Syst., vol. 12, no. 3, pp. 676–684, Sep. 2022, doi: 10.1109/JETCAS.2022.3201097

  13. [13]

    A Collaborative Neurodynamic Algorithm for Quadratic Unconstrained Binary Optimization,

    H. Li and J. Wang, “A Collaborative Neurodynamic Algorithm for Quadratic Unconstrained Binary Optimization,” IEEE Trans. Emerg. Top. Comput. Intell., vol. 9, no. 1, pp. 228–239, Feb. 2025, doi: 10.1109/TETCI.2024.3405370

  14. [14]

    Towards Quantum-Enhanced Machine Learning for Network Intrusion Detection,

    A. Gouveia and M. Correia, “Towards Quantum-Enhanced Machine Learning for Network Intrusion Detection,” in 2020 IEEE 19th International Symposium on Network Computing and Applications (NCA), Cambridge, MA, USA: IEEE, Nov. 2020, pp. 1–8. doi: 10.1109/NCA51143.2020.9306691

  15. [15]

    Network attack detection scheme based on variational quantum neural network,

    C. Gong, W. Guan, A. Gani, and H. Qi, “Network attack detection scheme based on variational quantum neural network,” J. Supercomput., vol. 78, no. 15, pp. 16876–16897, Oct. 2022, doi: 10.1007/s11227-022-04542-z

  16. [16]

    Network Anomaly Detection Using Quantum Neural Networks on Noisy Quantum Computers,

    A. Kukliansky, M. Orescanin, C. Bollmann, and T. Huffmire, “Network Anomaly Detection Using Quantum Neural Networks on Noisy Quantum Computers,” IEEE Trans. Quantum Eng., vol. 5, pp. 1–11, 2024, doi: 10.1109/TQE.2024.3359574

  17. [17]

    Network attack traffic detection with hybrid quantum-enhanced convolution neural network,

    Z. Wang, K. W. Fok, and V. L. L. Thing, “Network attack traffic detection with hybrid quantum-enhanced convolution neural network,” Quantum Mach. Intell., vol. 7, no. 1, p. 50, Jun. 2025, doi: 10.1007/s42484-025-00278-0

  18. [18]

    QML-IDS: Quantum Machine Learning Intrusion Detection System,

    D. Abreu, C. E. Rothenberg, and A. Abelém, “QML-IDS: Quantum Machine Learning Intrusion Detection System,” in 2024 IEEE Symposium on Computers and Communications (ISCC), Paris, France: IEEE, Jun. 2024, pp. 1–6. doi: 10.1109/ISCC61673.2024.10733655

  19. [19]

    QuantumNetSec: Quantum Machine Learning for Network Security,

    D. Abreu, D. Moura, C. Esteve Rothenberg, and A. Abelém, “QuantumNetSec: Quantum Machine Learning for Network Security,” Int. J. Netw. Manag., vol. 35, no. 4, p. e70018, Jul. 2025, doi: 10.1002/nem.70018

  20. [20]

    An In-Depth Comparative Study of Quantum-Classical Encoding Methods for Network Intrusion Detection,

    A. Kadi, A. Selamnia, Z. A. E. Houda, H. Moudoud, B. Brik, and L. Khoukhi, “An In-Depth Comparative Study of Quantum-Classical Encoding Methods for Network Intrusion Detection,” IEEE Open J. Commun. Soc., vol. 6, pp. 1129–1148, 2025, doi: 10.1109/OJCOMS.2025.3537957

  21. [21]

    Leveraging Quantum Machine Learning for Intrusion Detection in Software-Defined Networks,

    J. L. Anguera et al., “Leveraging Quantum Machine Learning for Intrusion Detection in Software-Defined Networks,” in 2025 IEEE International Conference on Machine Learning for Communication and Networking (ICMLCN), Barcelona, Spain: IEEE, May 2025, pp. 1–6. doi: 10.1109/ICMLCN64995.2025.11140473

  22. [22]

    Evaluating the potential of quantum machine learning in cybersecurity: A case-study on PCA-based intrusion detection systems,

    A. Bellante, T. Fioravanti, M. Carminati, S. Zanero, and A. Luongo, “Evaluating the potential of quantum machine learning in cybersecurity: A case-study on PCA-based intrusion detection systems,” Comput. Secur., vol. 154, p. 104341, Jul. 2025, doi: 10.1016/j.cose.2025.104341

  23. [23]

    A novel network intrusion detection method based on 12 metaheuristic optimisation algorithms,

    R. Ghanbarzadeh, A. Hosseinalipour, and A. Ghaffari, “A novel network intrusion detection method based on 12 metaheuristic optimisation algorithms,” J. Ambient Intell. Humaniz. Comput., vol. 14, no. 6, pp. 7575–7592, Jun. 2023, doi: 10.1007/s12652-023-04571-3

  24. [24]

    A Comprehensive Approach to Intrusion Detection in IoT Environments Using Hybrid Feature Selection and Multi-Stage Classification Techniques,

    G. Logeswari, J. Deepika Roselind, K. Tamilarasi, and V. Nivethitha, “A Comprehensive Approach to Intrusion Detection in IoT Environments Using Hybrid Feature Selection and Multi-Stage Classification Techniques,” IEEE Access, vol. 13, pp. 24970–24987, 2025, doi: 10.1109/ACCESS.2025.3532895

  25. [25]

    Quantum-Driven Chaos-Informed Deep Learning Framework for Efficient Feature Selection and Intrusion Detection in IoT Networks,

    P. Turaka and S. K. Panigrahy, “Quantum-Driven Chaos-Informed Deep Learning Framework for Efficient Feature Selection and Intrusion Detection in IoT Networks,” Technologies, vol. 13, no. 10, p. 470, Oct. 2025, doi: 10.3390/technologies13100470

  26. [26]

    A Quantum Genetic Algorithm-Enhanced Self-Supervised Intrusion Detection System for Wireless Sensor Networks in the Internet of Things,

    H. Barati, “A Quantum Genetic Algorithm-Enhanced Self-Supervised Intrusion Detection System for Wireless Sensor Networks in the Internet of Things,” ArXiv, vol. abs/2509.03744, Sep. 2025, doi: 10.48550/arxiv.2509.03744

  27. [27]

    An Efficient Quantum-inspired Computing Approach for Intrusion Detection System,

    J.-Y. Shen et al., “An Efficient Quantum-inspired Computing Approach for Intrusion Detection System,” in 2024 IEEE 24th International Conference on Nanotechnology (NANO), Gijon, Spain: IEEE, Jul. 2024, pp. 306–310. doi: 10.1109/NANO61778.2024.10628664

  28. [28]

    Quantum-Inspired Algorithms and Perspectives for Optimization,

    G. Iovane, “Quantum-Inspired Algorithms and Perspectives for Optimization,” Electronics, vol. 14, no. 14, p. 2839, Jul. 2025, doi: 10.3390/electronics14142839

  29. [29]

    Linear-space best-first search,

    R. E. Korf, “Linear-space best-first search,” Artif. Intell., vol. 62, no. 1, pp. 41–78, Jul. 1993, doi: 10.1016/0004-3702(93)90045-D

  30. [30]

    pysubgroup: Easy-to-Use Subgroup Discovery in Python,

    F. Lemmerich and M. Becker, “pysubgroup: Easy-to-Use Subgroup Discovery in Python,” in Machine Learning and Knowledge Discovery in Databases, vol. 11053, U. Brefeld, E. Curry, E. Daly, B. MacNamee, A. Marascu, F. Pinelli, M. Berlingerio, and N. Hurley, Eds., in Lecture Notes in Computer Science, vol. 11053. , Cham: Springer International Publishing, 2019,...