FlowGuard: Flow Matching for Identity-Independent Detection of Data-Free Model Stealing Attacks on Energy System Intrusion Detection Systems
Pith reviewed 2026-06-28 09:50 UTC · model grok-4.3
The pith
FlowGuard detects data-free model stealing attacks on IDS using flow-based OOD classification independent of client identity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FlowGuard classifies incoming queries as out-of-distribution prior to IDS processing by exploiting the fact that queries generated synthetically for data-free model stealing attacks occupy a lower-dimensional manifold than real network traffic, resulting in measurably lower log-likelihoods under a Continuous Normalizing Flow trained on legitimate data. Evaluation against PRADA and FDINet using MAZE and DisGUIDE attacks shows stable detection rates in single-client and 100-client Sybil settings without relying on identity information.
What carries the argument
Continuous Normalizing Flow trained on legitimate data to compute log-likelihoods for identifying lower-dimensional synthetic attack queries as OOD.
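For readers less familiar with how a Continuous Normalizing Flow yields a log-likelihood, the standard instantaneous change-of-variables identity is sketched below. The symbols are assumptions introduced here for illustration and are not taken from the paper: $v_\theta$ is the learned velocity field, $p_0$ the base density, $p_1$ the model density over queries, and $\tau$ a detection threshold. The paper's exact formulation may differ.

```latex
\begin{aligned}
\frac{\mathrm{d}z(t)}{\mathrm{d}t} &= v_\theta\bigl(z(t), t\bigr), \qquad z(1) = x, \\
\log p_1(x) &= \log p_0\bigl(z(0)\bigr) - \int_0^1 \nabla \cdot v_\theta\bigl(z(t), t\bigr)\,\mathrm{d}t, \\
\text{flag } x \text{ as OOD} &\iff \log p_1(x) < \tau .
\end{aligned}
```

Under this reading, a query is scored by integrating the flow backwards from $x$ to the base distribution and accumulating the divergence term along the path.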
If this is right
- The approach applies to hard-label IDS deployments where soft-label perturbation is inapplicable.
- Detection remains effective against distributed Sybil attackers that evade identity-bound monitoring.
- The detection rate stays stable when the attack moves from a single client to 100 Sybil clients.
- The defence can be applied before any IDS processing occurs.
- Potential applications to data-dependent attacks are outlined in the paper.
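The identity-independence point in the list above can be illustrated with a toy comparison. This is a minimal sketch, not the paper's evaluation: the scores and threshold are made up, and `per_client_budget_alerts` is a hypothetical identity-bound monitor invented for contrast (it is not PRADA or FDINet).

```python
def per_query_detection_rate(scores, threshold):
    """Fraction of queries flagged; depends only on each query's own score."""
    return sum(1 for s in scores if s < threshold) / len(scores)


def per_client_budget_alerts(client_ids, budget):
    """Hypothetical identity-bound monitor: alert on clients exceeding a query budget."""
    counts = {}
    for cid in client_ids:
        counts[cid] = counts.get(cid, 0) + 1
    return {cid for cid, n in counts.items() if n > budget}


# Made-up log-likelihood scores for 1000 attack queries and a made-up threshold.
attack_scores = [-40.0 - (i % 7) for i in range(1000)]
threshold = -20.0

single_client = [0] * 1000                      # all queries from one account
sybil_clients = [i % 100 for i in range(1000)]  # same queries spread over 100 accounts

rate = per_query_detection_rate(attack_scores, threshold)
alerts_single = per_client_budget_alerts(single_client, budget=50)
alerts_sybil = per_client_budget_alerts(sybil_clients, budget=50)
```

The per-query function never receives a client identifier, so its result cannot change when the same queries are spread over more accounts. The budget monitor sees only 10 queries per account in the Sybil split and raises no alert.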
Where Pith is reading between the lines
- Similar flow-based OOD detection could apply to synthetic data generation in other model extraction scenarios if the manifold separation holds.
- Hybrid systems combining flow matching with limited identity checks might handle mixed attack types more robustly.
- Validation on operational energy system traffic would test whether the log-likelihood gap persists in practice.
- The technique might extend to IDS in other critical infrastructure domains facing model theft.
Load-bearing premise
Synthetic queries generated for data-free model stealing attacks occupy a lower-dimensional manifold than real network traffic.
What would settle it
If the log-likelihood distributions of real network traffic and synthetic attack queries under the trained Continuous Normalizing Flow overlap significantly, the OOD detection would not separate them reliably.
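One way to quantify that overlap is sketched below, assuming per-query log-likelihood scores have already been collected for both populations. The function names are hypothetical and the choice of AUROC plus detection rate at a fixed false-positive rate is an assumption made here, not a metric confirmed by the paper.

```python
def auroc(legit_scores, attack_scores):
    """Probability that a random legitimate query outscores a random attack query.

    Ties count as half. A value of 1.0 means no overlap; 0.5 means the two
    score distributions are indistinguishable.
    """
    wins = 0.0
    for legit in legit_scores:
        for attack in attack_scores:
            if legit > attack:
                wins += 1.0
            elif legit == attack:
                wins += 0.5
    return wins / (len(legit_scores) * len(attack_scores))


def detection_rate_at_fpr(legit_scores, attack_scores, fpr=0.05):
    """Flag queries scoring below the fpr-quantile of the legitimate scores."""
    ordered = sorted(legit_scores)
    threshold = ordered[int(fpr * len(ordered))]
    flagged = sum(1 for a in attack_scores if a < threshold)
    return flagged / len(attack_scores)


# Made-up scores, for illustration only.
legit = [-10.0, -9.0, -8.0, -7.0]
attack = [-50.0, -40.0, -30.0]
separation = auroc(legit, attack)
rate = detection_rate_at_fpr(legit, attack, fpr=0.25)
```

An AUROC near 0.5, or a detection rate close to the chosen false-positive rate, would indicate the overlap described above.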
Original abstract
Artificial Intelligence (AI)-based Intrusion Detection Systems (IDS) deployed in energy infrastructure are vulnerable to model theft attacks, which allow adversaries to create evasive traffic offline. Current defences against model extraction rely either on identity-bound query monitoring, which is ineffective against distributed attackers (Sybil), or on prediction poisoning through soft-label perturbation, which is inapplicable to hard-label IDS deployments. Therefore, we propose FlowGuard, an identity-independent defence based on flow matching that classifies incoming queries as out-of-distribution (OOD) prior to IDS processing. This approach exploits the fact that queries generated synthetically for data-free model stealing attacks occupy a lower-dimensional manifold than real network traffic. This results in measurably lower log-likelihoods when using a Continuous Normalizing Flow that has been trained on legitimate data. We evaluate our method against PRADA and FDINet using MAZE and DisGUIDE attacks in single-client and distributed (100-client Sybil) settings. While PRADA's detection rate dropped to 0% when the distribution changed, our defence maintained a stable detection rate across both settings without relying on identity information. We discuss the scope and limitations of the approach, and outline potential applications to data-dependent attacks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes FlowGuard, an identity-independent defense for AI-based IDS in energy systems against data-free model stealing attacks (MAZE, DisGUIDE). It trains a Continuous Normalizing Flow on legitimate traffic and detects attacks by classifying queries as OOD via lower log-likelihood scores, based on the claim that synthetic queries occupy a lower-dimensional manifold. Evaluation is reported against PRADA and FDINet in single-client and 100-client Sybil settings, with the defense maintaining stable detection rates while PRADA drops to 0% in distributed scenarios. The approach is positioned as applicable to hard-label IDS and discusses scope/limitations.
Significance. If the likelihood gap is empirically validated, the method provides a practical defense against Sybil-enabled model extraction without identity tracking or soft-label access, filling a gap for distributed energy IDS deployments. It leverages flow matching for OOD detection in a security context and could extend to data-dependent attacks if the manifold assumption generalizes.
major comments (2)
- [Abstract] The central claim that synthetic queries from MAZE/DisGUIDE produce measurably lower log-likelihoods under the CNF (due to lower-dimensional manifold) is load-bearing for the OOD classification, yet the abstract (and by extension the evaluation) supplies no quantitative evidence such as likelihood histograms, mean/variance statistics, or statistical tests confirming a reliable gap exists in either the single-client or 100-client regime.
- [Abstract] The defense reduces to thresholding CNF log-likelihoods on queries; if the attack generators can produce samples whose support overlaps the legitimate distribution (possible in high-dimensional IDS feature spaces or with imperfect flow models), the detection rate collapses. No analysis of this failure mode or robustness to manifold overlap is provided.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback. We address each major comment below and indicate planned revisions to improve the manuscript.
- Referee: [Abstract] The central claim that synthetic queries from MAZE/DisGUIDE produce measurably lower log-likelihoods under the CNF (due to lower-dimensional manifold) is load-bearing for the OOD classification, yet the abstract (and by extension the evaluation) supplies no quantitative evidence such as likelihood histograms, mean/variance statistics, or statistical tests confirming a reliable gap exists in either the single-client or 100-client regime.
  Authors: We agree that the abstract would benefit from explicit quantitative support for the likelihood gap. In the revision we will add mean and variance statistics for log-likelihoods under legitimate and synthetic queries (for both single-client and 100-client regimes) and will include a brief reference to the observed separation. We will also add a figure or table in the evaluation section (or supplementary material) showing likelihood histograms or density plots to make the empirical gap visible.
  Revision: yes
- Referee: [Abstract] The defense reduces to thresholding CNF log-likelihoods on queries; if the attack generators can produce samples whose support overlaps the legitimate distribution (possible in high-dimensional IDS feature spaces or with imperfect flow models), the detection rate collapses. No analysis of this failure mode or robustness to manifold overlap is provided.
  Authors: The referee correctly identifies an important unaddressed failure mode. While our experiments with MAZE and DisGUIDE demonstrate reliable separation, we did not analyze robustness when support overlap occurs. We will expand the limitations and scope section to discuss this scenario, including conditions (high-dimensional features, imperfect flows, or stronger generators) under which detection may degrade, and note any mitigating factors or future work directions.
  Revision: yes
Circularity Check
No circularity: core detection uses externally trained CNF on legitimate data with standard likelihood OOD scoring
The paper's central mechanism trains a Continuous Normalizing Flow exclusively on legitimate network traffic and scores incoming queries by log-likelihood to flag OOD samples. This is a standard density-based OOD detector and does not reduce by construction to any fitted parameter, self-citation chain, or renamed input. The manifold assumption is explicitly stated as the motivating hypothesis rather than derived from the method itself. No equations, self-citations, or 'predictions' that collapse to the training data appear in the provided text. The evaluation against external attacks (MAZE, DisGUIDE) and baselines (PRADA, FDINet) is independent of the training procedure.
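The shape of such a density-based detector can be sketched as follows. This is an illustrative stand-in only: a diagonal Gaussian replaces the Continuous Normalizing Flow so the example stays self-contained, the data are synthetic, and all function names are hypothetical. What it shares with the described mechanism is the pipeline: the density is fitted on legitimate samples only, the threshold is calibrated on held-out legitimate samples only, and each query is scored on its own.

```python
import math
import random


def fit_diagonal_gaussian(rows):
    """Estimate per-feature mean and variance from legitimate samples only."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    variances = [
        max(sum((r[j] - means[j]) ** 2 for r in rows) / n, 1e-9)
        for j in range(d)
    ]
    return means, variances


def log_likelihood(x, means, variances):
    """Log-density of x under the fitted diagonal Gaussian (stand-in for a CNF)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def calibrate_threshold(held_out, means, variances, fpr=0.05):
    """Set the threshold at the fpr-quantile of held-out legitimate scores."""
    scores = sorted(log_likelihood(x, means, variances) for x in held_out)
    return scores[int(fpr * len(scores))]


def is_ood(x, means, variances, threshold):
    """Per-query decision; no client identifier is consulted."""
    return log_likelihood(x, means, variances) < threshold


# Synthetic "legitimate" data, for illustration only.
rng = random.Random(0)
train = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(500)]
held_out = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]

means, variances = fit_diagonal_gaussian(train)
threshold = calibrate_threshold(held_out, means, variances)
```

No attack data enters the fitting or calibration steps, which is the property the circularity check relies on.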
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Synthetic queries from data-free model stealing attacks occupy a lower-dimensional manifold than real network traffic, producing lower log-likelihoods under a CNF trained on legitimate data.