
arxiv: 2606.03430 · v1 · pith:MO4GJAYPnew · submitted 2026-06-02 · 💻 cs.CR · cs.AI

FlowGuard: Flow Matching for Identity-Independent Detection of Data-Free Model Stealing Attacks on Energy System Intrusion Detection Systems

Pith reviewed 2026-06-28 09:50 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords model stealing attacks · intrusion detection systems · flow matching · out-of-distribution detection · continuous normalizing flow · Sybil attacks · data-free attacks · energy systems

The pith

FlowGuard detects data-free model stealing attacks on IDS using flow-based OOD classification independent of client identity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that a Continuous Normalizing Flow trained on legitimate data can flag the synthetic queries of data-free model stealing attacks as out-of-distribution: because those queries occupy a lower-dimensional manifold than real traffic, they receive lower log-likelihoods. This yields an identity-independent defence for AI-based IDS in energy systems, where a stolen model lets an adversary craft evasive traffic offline. The detection rate stays stable in both single-client and distributed Sybil settings, whereas PRADA's detection rate drops to 0% in the distributed setting. A sympathetic reader would care because it offers a practical way to protect critical-infrastructure IDS without tracking individual clients or perturbing soft labels.

Core claim

FlowGuard classifies incoming queries as out-of-distribution prior to IDS processing by exploiting the fact that queries generated synthetically for data-free model stealing attacks occupy a lower-dimensional manifold than real network traffic, resulting in measurably lower log-likelihoods under a Continuous Normalizing Flow trained on legitimate data. Evaluation against PRADA and FDINet using MAZE and DisGUIDE attacks shows stable detection rates in single-client and 100-client Sybil settings without relying on identity information.
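The decision rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: a diagonal Gaussian stands in for the Continuous Normalizing Flow, the attack queries are simulated as a shifted cloud rather than produced by MAZE or DisGUIDE, and the names `fit_density`, `log_likelihood`, `calibrate_threshold`, and the 5% false-positive budget are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_density(x):
    """Fit a diagonal Gaussian (stand-in for a CNF trained on legitimate data)."""
    return x.mean(axis=0), x.std(axis=0) + 1e-8

def log_likelihood(model, x):
    """Per-query log-density under the fitted stand-in model."""
    mean, std = model
    z = (x - mean) / std
    return -0.5 * np.sum(z**2 + np.log(2 * np.pi * std**2), axis=1)

def calibrate_threshold(legit_scores, fpr=0.05):
    """Pick the threshold that flags roughly `fpr` of legitimate queries."""
    return np.quantile(legit_scores, fpr)

# Hypothetical data: 8 features, legitimate traffic vs. simulated attack queries.
legit_train = rng.normal(0.0, 1.0, size=(5000, 8))
legit_test = rng.normal(0.0, 1.0, size=(2000, 8))
attack = rng.normal(3.0, 0.2, size=(2000, 8))  # assumed shift, not real attack output

model = fit_density(legit_train)
threshold = calibrate_threshold(log_likelihood(model, legit_train))

# A query is flagged as OOD when its log-likelihood falls below the threshold.
false_positive_rate = np.mean(log_likelihood(model, legit_test) < threshold)
detection_rate = np.mean(log_likelihood(model, attack) < threshold)
print(f"FPR={false_positive_rate:.3f}  detection={detection_rate:.3f}")
```

The threshold is calibrated only on legitimate scores, so the rule needs no attack samples and no client identifier at decision time.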

What carries the argument

Continuous Normalizing Flow trained on legitimate data to compute log-likelihoods for identifying lower-dimensional synthetic attack queries as OOD.
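For reference, a Continuous Normalizing Flow usually yields a log-likelihood through the instantaneous change-of-variables formula. The notation below (velocity field $v_\theta$, base density $p_0$, threshold $\tau$) is chosen for this note and is not taken from the paper.

```latex
\begin{align}
\frac{\mathrm{d}z(t)}{\mathrm{d}t} &= v_\theta\bigl(z(t), t\bigr), \qquad z(1) = x, \\
\log p_1(x) &= \log p_0\bigl(z(0)\bigr) - \int_0^1 \nabla \cdot v_\theta\bigl(z(t), t\bigr)\,\mathrm{d}t, \\
\text{flag } x \text{ as OOD} &\iff \log p_1(x) < \tau .
\end{align}
```

Here $z(0)$ is obtained by integrating the ODE backward from the query $x$, and the divergence term accounts for how the flow expands or contracts volume along that path.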

If this is right

  • The approach applies to hard-label IDS deployments where soft-label perturbation is inapplicable.
  • Detection remains effective against distributed Sybil attackers that evade identity-bound monitoring.
  • Performance stays stable when the attack moves from a single client to 100 Sybil clients.
  • The defence can be applied before any IDS processing occurs.
  • Potential applications to data-dependent attacks are outlined in the paper.
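To make the Sybil point concrete, here is a toy comparison. It assumes a simplified per-client monitor that flags a client once it has sent a fixed number of suspicious queries. That monitor is a hypothetical stand-in for identity-bound defences in general, not a reimplementation of PRADA, and the scores, `THRESHOLD`, and `CLIENT_BUDGET` are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated log-likelihood scores for 1,000 attack queries (assumed values).
attack_scores = rng.normal(-40.0, 3.0, size=1000)
THRESHOLD = -20.0     # hypothetical per-query OOD threshold
CLIENT_BUDGET = 50    # hypothetical: flag a client after 50 suspicious queries

def per_query_detection(scores):
    """Identity-independent rule: each query is judged on its own score."""
    return float(np.mean(scores < THRESHOLD))

def per_client_detection(scores, client_ids):
    """Identity-bound rule: fraction of clients that exceed the budget."""
    suspicious = scores < THRESHOLD
    clients = np.unique(client_ids)
    flagged = [np.sum(suspicious[client_ids == c]) >= CLIENT_BUDGET for c in clients]
    return float(np.mean(flagged))

single = np.zeros(1000, dtype=int)   # one client sends every query
sybil = np.arange(1000) % 100        # 100 clients, 10 queries each

query_rate = per_query_detection(attack_scores)
single_rate = per_client_detection(attack_scores, single)
sybil_rate = per_client_detection(attack_scores, sybil)
print(query_rate, single_rate, sybil_rate)
```

The per-query rule never sees `client_ids`, so splitting the same queries across 100 identities cannot change its output, while the per-client rule stops firing once each identity stays under the budget.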

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar flow-based OOD detection could apply to synthetic data generation in other model extraction scenarios if the manifold separation holds.
  • Hybrid systems combining flow matching with limited identity checks might handle mixed attack types more robustly.
  • Validation on operational energy system traffic would test whether the log-likelihood gap persists in practice.
  • The technique might extend to IDS in other critical infrastructure domains facing model theft.

Load-bearing premise

Synthetic queries generated for data-free model stealing attacks occupy a lower-dimensional manifold than real network traffic.

What would settle it

If the log-likelihood distributions of real network traffic and synthetic attack queries under the trained Continuous Normalizing Flow overlap significantly, the OOD detection would not separate them reliably.
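One way to quantify that overlap is the AUROC between the two sets of log-likelihood scores: 0.5 means the distributions are indistinguishable, 1.0 means perfect separation. The sketch below uses simulated scores, and `auroc` is a helper written for this note rather than anything from the paper.

```python
import numpy as np

def auroc(legit_scores, attack_scores):
    """Probability that a random legitimate query scores higher than a random
    attack query (ties count as half). This equals the Mann-Whitney U statistic
    divided by the number of pairs."""
    legit = np.asarray(legit_scores, dtype=float)
    attack = np.asarray(attack_scores, dtype=float)
    diff = legit[:, None] - attack[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

rng = np.random.default_rng(2)
legit = rng.normal(0.0, 1.0, size=500)          # simulated log-likelihoods
separated = rng.normal(-6.0, 1.0, size=500)     # attack scores with a clear gap
overlapping = rng.normal(-0.2, 1.0, size=500)   # attack scores with heavy overlap

print(f"separated:   {auroc(legit, separated):.3f}")
print(f"overlapping: {auroc(legit, overlapping):.3f}")
```

An AUROC near 0.5 on held-out real traffic versus attack queries would be the overlap that undermines the detector.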

Figures

Figures reproduced from arXiv: 2606.03430 by Johannes Loevenich, Laurin Holz, Maxime Schwarzer, Roberto Rigolin F. Lopes, Thies Moehlenhof, Tobias Huerten, Veit Hagenmeyer.

Figure 1: Flow Matching OOD detection. (A) t-SNE of legitimate and synthetic attack queries in input space. (B) Latent …
Figure 2: Representative MAZE deep dive for ten queries: five benign CIFAR-10 queries and five MAZE attack queries. The left …

Original abstract

Artificial Intelligence (AI)-based Intrusion Detection Systems (IDS) deployed in energy infrastructure are vulnerable to model theft attacks, which allow adversaries to create evasive traffic offline. Current defences against model extraction rely either on identity-bound query monitoring, which is ineffective against distributed attackers (Sybil), or on prediction poisoning through soft-label perturbation, which is inapplicable to hard-label IDS deployments. Therefore, we propose FlowGuard, an identity-independent defence based on flow matching that classifies incoming queries as out-of-distribution (OOD) prior to IDS processing. This approach exploits the fact that queries generated synthetically for data-free model stealing attacks occupy a lower-dimensional manifold than real network traffic. This results in measurably lower log-likelihoods when using a Continuous Normalizing Flow that has been trained on legitimate data. We evaluate our method against PRADA and FDINet using MAZE and DisGUIDE attacks in single-client and distributed (100-client Sybil) settings. While PRADA's detection rate dropped to 0% when the distribution changed, our defence maintained a stable detection rate across both settings without relying on identity information. We discuss the scope and limitations of the approach, and outline potential applications to data-dependent attacks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes FlowGuard, an identity-independent defense for AI-based IDS in energy systems against data-free model stealing attacks (MAZE, DisGUIDE). It trains a Continuous Normalizing Flow on legitimate traffic and detects attacks by classifying queries as OOD via lower log-likelihood scores, based on the claim that synthetic queries occupy a lower-dimensional manifold. Evaluation is reported against PRADA and FDINet in single-client and 100-client Sybil settings, with the defense maintaining stable detection rates while PRADA drops to 0% in distributed scenarios. The approach is positioned as applicable to hard-label IDS and discusses scope/limitations.

Significance. If the likelihood gap is empirically validated, the method provides a practical defense against Sybil-enabled model extraction without identity tracking or soft-label access, filling a gap for distributed energy IDS deployments. It applies flow matching to OOD detection in a security context and could extend to data-dependent attacks if the manifold assumption generalizes.

major comments (2)
  1. [Abstract] The central claim that synthetic queries from MAZE/DisGUIDE produce measurably lower log-likelihoods under the CNF (due to lower-dimensional manifold) is load-bearing for the OOD classification, yet the abstract (and by extension the evaluation) supplies no quantitative evidence such as likelihood histograms, mean/variance statistics, or statistical tests confirming a reliable gap exists in either the single-client or 100-client regime.
  2. [Abstract] The defense reduces to thresholding CNF log-likelihoods on queries; if the attack generators can produce samples whose support overlaps the legitimate distribution (possible in high-dimensional IDS feature spaces or with imperfect flow models), the detection rate collapses. No analysis of this failure mode or robustness to manifold overlap is provided.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback. We address each major comment below and indicate planned revisions to improve the manuscript.

Point-by-point responses
  1. Referee: [Abstract] The central claim that synthetic queries from MAZE/DisGUIDE produce measurably lower log-likelihoods under the CNF (due to lower-dimensional manifold) is load-bearing for the OOD classification, yet the abstract (and by extension the evaluation) supplies no quantitative evidence such as likelihood histograms, mean/variance statistics, or statistical tests confirming a reliable gap exists in either the single-client or 100-client regime.

    Authors: We agree that the abstract would benefit from explicit quantitative support for the likelihood gap. In the revision we will add mean and variance statistics for log-likelihoods under legitimate and synthetic queries (for both single-client and 100-client regimes) and will include a brief reference to the observed separation. We will also add a figure or table in the evaluation section (or supplementary material) showing likelihood histograms or density plots to make the empirical gap visible. revision: yes

  2. Referee: [Abstract] The defense reduces to thresholding CNF log-likelihoods on queries; if the attack generators can produce samples whose support overlaps the legitimate distribution (possible in high-dimensional IDS feature spaces or with imperfect flow models), the detection rate collapses. No analysis of this failure mode or robustness to manifold overlap is provided.

    Authors: The referee correctly identifies an important unaddressed failure mode. While our experiments with MAZE and DisGUIDE demonstrate reliable separation, we did not analyze robustness when support overlap occurs. We will expand the limitations and scope section to discuss this scenario, including conditions (high-dimensional features, imperfect flows, or stronger generators) under which detection may degrade, and note any mitigating factors or future work directions. revision: yes
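The failure mode in comment 2 can be made concrete with a small worked model. Assume, purely for illustration, that log-likelihood scores are Gaussian with unit variance for both legitimate and attack queries, separated by a gap `gap`, and that the threshold is set for a 5% false-positive rate. None of these numbers come from the paper.

```python
from statistics import NormalDist

def detection_rate(gap, fpr=0.05):
    """Detection rate at a fixed false-positive rate when legitimate scores are
    N(0, 1) and attack scores are N(-gap, 1) (an assumed toy model)."""
    threshold = NormalDist().inv_cdf(fpr)   # flags `fpr` of legitimate queries
    return NormalDist(mu=-gap, sigma=1.0).cdf(threshold)

for gap in (0.0, 1.0, 2.0, 4.0):
    print(f"gap={gap:.1f}  detection={detection_rate(gap):.3f}")
```

With no gap the detector flags attack queries at exactly the false-positive rate, so under this toy model detection degrades smoothly toward chance as support overlap grows.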

Circularity Check

0 steps flagged

No circularity: core detection uses externally trained CNF on legitimate data with standard likelihood OOD scoring

Full rationale

The paper's central mechanism trains a Continuous Normalizing Flow exclusively on legitimate network traffic and scores incoming queries by log-likelihood to flag OOD samples. This is a standard density-based OOD detector and does not reduce by construction to any fitted parameter, self-citation chain, or renamed input. The manifold assumption is explicitly stated as the motivating hypothesis rather than derived from the method itself. No equations, self-citations, or 'predictions' that collapse to the training data appear in the provided text. The evaluation against external attacks (MAZE, DisGUIDE) and baselines (PRADA, FDINet) is independent of the training procedure.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Central claim depends on the domain assumption that attack queries lie on a lower-dimensional manifold yielding lower likelihoods; no free parameters or invented entities are identifiable from the abstract.

axioms (1)
  • domain assumption: Synthetic queries from data-free model stealing attacks occupy a lower-dimensional manifold than real network traffic, producing lower log-likelihoods under a CNF trained on legitimate data.
    Explicitly stated in abstract as the exploitation basis for OOD classification.


