
arxiv: 2605.22749 · v1 · pith:PBVHMBZNnew · submitted 2026-05-21 · 💻 cs.LG · cs.AI

Cyber-Physical Anomaly Detection in IoT-Enabled Smart Grids Using Machine Learning and Metaheuristic Feature Optimization

Pith reviewed 2026-05-22 06:52 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords cyber-physical anomaly detection · smart grids · feature selection · genetic algorithm · Extra Trees · PMU · machine learning · IoT security

The pith

Genetic algorithm reduces smart grid PMU features from 112 to 27 while improving attack detection performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Modern smart grids collect vast amounts of data from phasor measurement units to monitor operations, but this creates challenges in spotting whether disruptions are natural or caused by hackers. The paper tests standard machine learning classifiers on a public dataset of power system attacks and normal events. It then uses a genetic algorithm to choose the most informative measurements. With the Extra Trees model, the reduced feature set performs slightly better than the full set, suggesting that much of the data is unnecessary for accurate classification. This approach could help make monitoring systems lighter and easier to understand.

Core claim

Applying genetic algorithm feature selection to the clean PMU features in the MSU/ORNL Power System Attack Dataset allows the Extra Trees classifier to achieve a macro-F1 score of 0.9212 and a ROC-AUC of 0.9837 using an average of 27.4 attributes across five runs, compared to 0.9118 and 0.9791 using all 112 attributes.
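The size of the reported effect can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the core claim; the variable names are illustrative and do not come from the paper.

```python
# Figures quoted in the core claim (Extra Trees, full vs. GA-selected features).
full_features = 112
selected_features = 27.4  # mean over five GA runs, as reported

retained = selected_features / full_features
reduction = 1 - retained

f1_full, f1_ga = 0.9118, 0.9212
auc_full, auc_ga = 0.9791, 0.9837

f1_gain = f1_ga - f1_full
auc_gain = auc_ga - auc_full

print(f"retained: {retained:.1%}, reduction: {reduction:.1%}")
# retained: 24.5%, reduction: 75.5%
print(f"macro-F1 gain: {f1_gain:.4f}, ROC-AUC gain: {auc_gain:.4f}")
# macro-F1 gain: 0.0094, ROC-AUC gain: 0.0046
```

The feature count falls by roughly three quarters, while both metric gains are below one percentage point.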

What carries the argument

The genetic algorithm that searches for an optimal subset of features from the 112 PMU attributes to input into the Extra Trees classifier for classifying events as natural or attack-related.
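A wrapper-style genetic algorithm of this kind can be sketched as follows. This is a minimal illustration and not the authors' implementation: the population size, mutation rate, tournament selection, uniform crossover, and the cross-validated macro-F1 fitness are all assumptions, and the data are synthetic (`make_classification`) rather than the MSU/ORNL dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score


def fitness(mask, X, y, seed=0):
    """Mean cross-validated macro-F1 of Extra Trees on the masked columns."""
    if not mask.any():  # an empty subset cannot be scored
        return 0.0
    clf = ExtraTreesClassifier(n_estimators=25, random_state=seed)
    return cross_val_score(clf, X[:, mask], y, cv=3, scoring="f1_macro").mean()


def ga_select(X, y, pop_size=8, generations=5, p_mut=0.05, seed=0):
    """Return (best_mask, best_score) found by a simple generational GA."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5  # random boolean feature masks
    best_mask, best_score = None, -1.0
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y, seed) for ind in pop])
        i = int(scores.argmax())
        if scores[i] > best_score:
            best_mask, best_score = pop[i].copy(), float(scores[i])
        children = [best_mask.copy()]  # elitism: carry over the best mask so far
        while len(children) < pop_size:
            # Binary tournament selection, once per parent.
            a, b = rng.integers(pop_size, size=2)
            p1 = pop[a] if scores[a] >= scores[b] else pop[b]
            a, b = rng.integers(pop_size, size=2)
            p2 = pop[a] if scores[a] >= scores[b] else pop[b]
            cross = rng.random(n) < 0.5       # uniform crossover
            child = np.where(cross, p1, p2)
            child ^= rng.random(n) < p_mut    # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    return best_mask, best_score


# Synthetic stand-in data; the real study uses 112 PMU attributes.
X, y = make_classification(n_samples=300, n_features=20,
                           n_informative=5, random_state=0)
mask, score = ga_select(X, y)
print(int(mask.sum()), "features kept, macro-F1 =", round(score, 4))
```

Because the fitness is itself a noisy cross-validated estimate, different seeds can return different masks, which is one reason averaging over several runs (five in the paper) is informative.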

If this is right

  • Tree-based ensemble methods like Extra Trees and Random Forest outperform other models such as logistic regression or SVM on this dataset.
  • Reducing the feature space by more than 75 percent maintains or improves detection metrics for distinguishing attacks from physical events.
  • A compact set of phasor measurements supports reliable anomaly detection with potentially greater interpretability.
  • Many of the synchronized electrical measurements appear redundant for the classification task.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Lowering the number of features could reduce bandwidth and processing demands on IoT devices in the grid.
  • The same feature selection strategy may help anomaly detection tasks in other high-dimensional sensor environments.
  • Validating the chosen features on live streaming grid data or under added noise would test real-world robustness.
  • Fewer features could let operators focus inspections on the most relevant physical quantities.

Load-bearing premise

The labels in the MSU/ORNL Power System Attack Dataset accurately separate physical incidents from malicious cyber actions, and the modest gains in performance metrics are not the result of random variation in the five runs.

What would settle it

The central claim would be falsified if the identical pipeline, applied to a dataset collected from a different power grid, showed no gain from feature selection, or if a statistical test showed that the reported metric improvements are not significant.

Figures

Figures reproduced from arXiv: 2605.22749 by Adis Alihodžić, Eva Tuba, Milan Tuba.

Figure 1. Simplified smart grid architecture and cyber-phys…
Figure 2. Simplified AI-based anomaly detection pipeline an…
Original abstract

Modern smart grids rely on dense measurement infrastructures, communication links, and intelligent field devices. Although this improves supervision and control, it also increases vulnerability to cyber-physical disruptions. Operators must distinguish physical incidents, such as faults or line disturbances, from malicious actions, such as false data injection or unauthorized command execution. This chapter investigates this problem using the well-known MSU/ORNL Power System Attack Dataset. The proposed method combines machine learning with genetic-algorithm-based feature selection. The objective is twofold: to classify attack and natural events accurately, and to determine whether a reduced set of physically informative PMU/IED measurements can support reliable detection. Several baseline models are evaluated, including logistic regression, RBF-SVM, XGBoost, Random Forest, and Extra Trees. The results show that tree-based ensemble models are the most effective for the considered dataset, with Extra Trees providing the strongest full-feature baseline. After feature selection, the GA + Extra Trees model reduces the clean PMU feature space from 112 attributes to an average of 27.4 attributes over five runs, while increasing macro-F1 from 0.9118 to 0.9212 and ROC-AUC from 0.9791 to 0.9837. These results indicate that many synchronized electrical measurements are redundant. A compact subset of phasor-based features can still provide accurate and interpretable anomaly detection in smart grids.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript evaluates machine learning classifiers on the MSU/ORNL Power System Attack Dataset for distinguishing cyber-physical anomalies in smart grids. It proposes genetic-algorithm feature selection to reduce the 112-dimensional clean PMU feature space and reports that the GA + Extra Trees combination yields an average of 27.4 selected features across five runs while raising macro-F1 from 0.9118 to 0.9212 and ROC-AUC from 0.9791 to 0.9837 relative to the full-feature Extra Trees baseline. Tree-based ensembles are reported as the most effective models on this dataset, ahead of logistic regression and RBF-SVM, with Extra Trees the strongest full-feature baseline.

Significance. If the reported gains prove robust, the work supplies concrete evidence that a compact, physically interpretable subset of synchronized measurements can support reliable anomaly detection, which is useful for resource-constrained IoT deployments in smart grids. The use of a public dataset, multiple GA runs, and direct comparison against several baselines are strengths that make the empirical claims falsifiable and reproducible in principle.

major comments (1)
  1. [Results] Results (performance tables and GA runs): the headline claim that GA feature selection improves macro-F1 and ROC-AUC rests on small deltas (≈0.0094 and ≈0.0046). No standard deviations, per-run scores, confidence intervals, or hypothesis tests (e.g., Wilcoxon or paired t-test) are reported for the five GA runs. Without these, it is impossible to determine whether the observed gains exceed stochastic variation arising from GA initialization, random train/test splits, or label noise in the MSU/ORNL dataset.
minor comments (2)
  1. [Experimental setup] The manuscript does not specify the train/test split ratios, cross-validation procedure, or hyperparameter search method used for the baseline models and the GA runs.
  2. [Methods] Notation for the GA fitness function and the precise definition of the 'clean PMU feature space' (112 attributes) should be stated explicitly in the methods section.
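The statistics requested in the major comment could be computed along the following lines. This is a hedged sketch using only the Python standard library: the function names are hypothetical, and an exact paired sign-flip permutation test stands in for the Wilcoxon or paired t-test that the referee names. No per-run scores are reported in the material above, so none are supplied here.

```python
from itertools import product
from statistics import mean, stdev


def summarize(scores):
    """Mean and sample standard deviation of per-run scores."""
    return mean(scores), stdev(scores)


def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip (paired permutation) test on a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = total = 0
    # Under the null, each paired difference is equally likely to be + or -.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    return hits / total


def min_two_sided_pvalue(n_pairs):
    """Smallest p-value an exact two-sided sign-based paired test can return."""
    return 2 / 2 ** n_pairs


print(min_two_sided_pvalue(5))  # 0.0625
```

One consequence is worth noting: with only five paired runs, an exact two-sided sign-flip or Wilcoxon signed-rank test cannot return a p-value below 2/32 = 0.0625. Even if every run favoured the GA-selected features, the result would not reach the conventional 0.05 threshold, so more runs or repeated splits would be needed for such a test to be decisive.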

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The concern about statistical robustness of the reported performance gains is valid and we address it directly below, committing to revisions that will strengthen the presentation of results.

Point-by-point responses
  1. Referee: [Results] Results (performance tables and GA runs): the headline claim that GA feature selection improves macro-F1 and ROC-AUC rests on small deltas (≈0.0094 and ≈0.0046). No standard deviations, per-run scores, confidence intervals, or hypothesis tests (e.g., Wilcoxon or paired t-test) are reported for the five GA runs. Without these, it is impossible to determine whether the observed gains exceed stochastic variation arising from GA initialization, random train/test splits, or label noise in the MSU/ORNL dataset.

    Authors: We agree that the modest deltas require supporting statistical evidence to confirm they exceed stochastic variation. In the revised manuscript we will add a table listing the macro-F1 and ROC-AUC values obtained in each of the five independent GA runs, together with the corresponding mean, standard deviation, and 95% confidence intervals. We will also include the results of a paired statistical test (Wilcoxon signed-rank or paired t-test) comparing the full-feature Extra Trees baseline against the GA-selected feature models across the same data splits. These additions will make the robustness of the gains transparent and address the referee's concern about initialization, split, and label-noise variability. revision: yes

Circularity Check

0 steps flagged

No significant circularity in empirical ML evaluation

full rationale

The paper is a purely empirical study that applies standard machine learning classifiers (logistic regression, SVM, XGBoost, Random Forest, Extra Trees) together with a genetic algorithm for feature selection on the MSU/ORNL Power System Attack Dataset. It reports concrete performance numbers (macro-F1, ROC-AUC) and feature counts before and after selection. No mathematical derivations, equations, fitted parameters presented as predictions, or self-referential steps appear in the described chain. All results are direct experimental outcomes on held-out data, making the analysis self-contained with no reduction of claims to their own inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The work rests on the assumption that the chosen public dataset faithfully represents real cyber-physical events and that standard ML cross-validation practices suffice to validate the feature selection.

free parameters (1)
  • Genetic algorithm hyperparameters
    Population size, crossover, and mutation rates are chosen to run the feature selector but not detailed in the abstract.
axioms (1)
  • Domain assumption: dataset labels correctly separate natural faults from cyber attacks.
    The entire classification task depends on the ground-truth annotations in the MSU/ORNL dataset.

