Lightweight CNN-Based Anomaly Detection for High Voltage Converter Modulators in the Spallation Neutron Source
Pith reviewed 2026-06-28 23:28 UTC · model grok-4.3
The pith
Ordering temporal filtering before cross-channel mixing in CNNs raises anomaly detection performance on HVCM pulse data to AUC-PR 0.816.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Varying the order of temporal convolution and cross-channel mixing, together with adaptive channel reweighting, produces CNNs whose detection performance on the public HVCM dataset reaches a pooled AUC-PR of 0.816 and AUC-ROC of 0.934, exceeding the previous state of the art on most subsystems and five of the six fault families. Per-fault-family results further indicate that performance tracks whether a fault's precursors appear as amplitude shifts in single channels or as subtler statistical dependencies across channels.
What carries the argument
The ordering of 1-D temporal filtering and cross-channel mixing layers inside the CNN, optionally combined with per-pulse adaptive channel reweighting.
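To make the two orderings concrete, here is a minimal pure-Python sketch, not the paper's implementation. It assumes the temporal step is a depthwise (per-channel) 1-D filter and the mixing step is a pointwise weighted sum across channels, with a ReLU in between. The function names, the square mixing matrix, and the random toy pulse are all hypothetical choices made for illustration.

```python
import random

def temporal_filter(x, kernels):
    """Depthwise 1-D filtering: channel c is cross-correlated with its own kernel ('valid' padding)."""
    out = []
    for signal, kernel in zip(x, kernels):
        k = len(kernel)
        out.append([
            sum(signal[t + i] * kernel[i] for i in range(k))
            for t in range(len(signal) - k + 1)
        ])
    return out

def channel_mix(x, weights):
    """Pointwise mixing: each output channel is a weighted sum of all input channels per time step."""
    n_steps = len(x[0])
    return [
        [sum(w * x[c][t] for c, w in enumerate(row)) for t in range(n_steps)]
        for row in weights
    ]

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def temporal_then_mix(x, kernels, weights):
    # Each channel is filtered independently before any cross-channel interaction.
    return channel_mix(relu(temporal_filter(x, kernels)), weights)

def mix_then_temporal(x, kernels, weights):
    # Channels are combined first, so the temporal filters see mixed signals.
    return temporal_filter(relu(channel_mix(x, weights)), kernels)

# Toy pulse: 3 channels x 16 time steps (sizes are assumptions, not the HVCM layout).
rng = random.Random(0)
n_channels, n_steps, k = 3, 16, 3
pulse = [[rng.gauss(0, 1) for _ in range(n_steps)] for _ in range(n_channels)]
kernels = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(n_channels)]
weights = [[rng.gauss(0, 1) for _ in range(n_channels)] for _ in range(n_channels)]

a = temporal_then_mix(pulse, kernels, weights)
b = mix_then_temporal(pulse, kernels, weights)
print(len(a), len(a[0]), a != b)
```

Both paths use the same parameters and produce outputs of the same shape, yet the values differ. In this sketch, ordering is therefore an inductive-bias choice that costs no extra parameters.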
If this is right
- Detection accuracy improves when temporal filtering precedes cross-channel mixing for faults whose signatures are primarily amplitude shifts in individual sensors.
- Adaptive per-pulse channel reweighting further lifts sensitivity on faults that require joint representations across channels.
- Three sensor channels dominate performance across the tested subsystems and fault families.
- The same ordering principle can be applied to other multi-channel pulse or waveform datasets collected at accelerator facilities.
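The per-pulse channel reweighting mentioned in the list above can be sketched in the spirit of squeeze-and-excitation gating. Whether the paper uses exactly this form is an assumption; the bottleneck weights `w1` and `w2` and the toy pulse are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def channel_gates(x, w1, w2):
    """Compute one gate in (0, 1) per channel from a single pulse x[channel][time]."""
    # Squeeze: summarise each channel by its mean over time.
    summary = [sum(ch) / len(ch) for ch in x]
    # Excite: small bottleneck MLP with a ReLU hidden layer and sigmoid outputs.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, summary))) for row in w1]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]

def reweight(x, w1, w2):
    """Scale every channel of the pulse by its own gate; gates are recomputed for each pulse."""
    gates = channel_gates(x, w1, w2)
    return [[g * v for v in ch] for g, ch in zip(gates, x)]

# Toy pulse with 3 channels; a bottleneck of width 2 (both sizes are assumptions).
pulse = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [-1.0, 0.0, 1.0]]
w1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.4]]     # hidden x channels
w2 = [[1.0, -1.0], [0.2, 0.4], [-0.5, 0.9]]   # channels x hidden
gates = channel_gates(pulse, w1, w2)
print([round(g, 3) for g in gates])
```

Because the gates depend on the pulse itself, two pulses can emphasise different channels. A fixed, learned channel weighting cannot do that.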
Where Pith is reading between the lines
- The same architectural ordering test could be run on other time-series anomaly problems that mix independent sensor streams with occasional coordinated events.
- If the ordering effect generalizes, it supplies a cheap way to improve existing CNN pipelines without increasing model size.
- Future work could measure whether the identified dominant channels remain stable when the same models are deployed on newer HVCM hardware.
Load-bearing premise
The public HVCM dataset and its fault labels are representative of real operating conditions, and measured performance differences arise from the tested architectural ordering rather than from training details or data splits.
What would settle it
Retraining the same architectures on a fresh random split of the HVCM recordings or with altered hyperparameters and checking whether the reported performance gap over the prior state of the art remains.
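A fresh-split check of this kind could start from a seeded stratified splitter such as the sketch below. It is a hypothetical helper, not code from the paper; the 70/15/15 fractions follow the split quoted in the simulated rebuttal, and the toy labels are made up. Training and scoring each architecture on every split is left out.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve label proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    train, val, test = [], [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)  # a different seed gives a different partition
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Toy labels: 80 normal pulses and 20 faulty ones (illustrative only).
labels = [0] * 80 + [1] * 20
splits = {seed: stratified_split(labels, seed=seed) for seed in range(5)}
for seed, (tr, va, te) in splits.items():
    print(seed, len(tr), len(va), len(te), sum(labels[i] for i in te))
```

If the reported gap over the prior state of the art holds on every seeded split, the split itself is unlikely to explain the result.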
Original abstract
Unscheduled trips of high-power pulsed converters are a leading source of downtime at large accelerator facilities. At the Spallation Neutron Source (SNS), the High Voltage Converter Modulators (HVCMs) are consistently the second-largest contributor to lost beam time. Each HVCM pulse is recorded across sensor channels spanning currents, voltages, and magnetic fluxes, whose mutual interactions encode the operating state of the system. Fault precursors do not manifest uniformly across these channels: depending on fault type, they may alter the temporal structure of individual signals, change the statistical dependencies among channels, or both. Existing deep-learning approaches typically process multi-channel signals with standard convolutional pipelines that entangle temporal and cross-channel operations from the first layer, giving the model no explicit mechanism to represent channel independence or structured inter-channel interaction. We hypothesise that architectural inductive bias, specifically the ordering of temporal filtering and cross-channel mixing, plays a central role in detection performance on this class of data. To test this, we vary the order in which these two operations are applied, and examine whether per-pulse adaptive channel reweighting further improves sensitivity. Evaluated on the public HVCM dataset across all four SNS subsystems (RFQ, DTL, CCL, SCL), our best variant achieves a pooled AUC-PR of 0.816 and AUC-ROC of 0.934, outperforming the state of the art on most subsystems and five of the six fault families. Ablations identify three dominant input channels and link per-fault-family performance to whether precursors manifest as amplitude shifts in individual channels or as subtler patterns requiring joint channel representations to surface.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that varying the order of temporal filtering versus cross-channel mixing in lightweight CNNs, combined with per-pulse adaptive channel reweighting, yields improved anomaly detection on multi-channel HVCM pulse data from the Spallation Neutron Source. On the public HVCM dataset across RFQ, DTL, CCL, and SCL subsystems, the best variant reports pooled AUC-PR of 0.816 and AUC-ROC of 0.934, outperforming prior art on most subsystems and five of six fault families. Ablations identify three dominant channels and tie per-family performance to whether faults appear as single-channel amplitude shifts or require joint representations.
Significance. If the reported gains are shown to arise from the tested architectural ordering under controlled conditions, the work supplies concrete evidence that inductive bias in the sequencing of temporal and cross-channel operations matters for this class of sensor data. The public dataset and explicit ablation on channel importance are strengths that support reproducibility and domain insight; the result could guide CNN design choices for other multi-channel pulsed systems where faults manifest unevenly across sensors.
major comments (1)
- [Experimental setup] Experimental setup (likely §4 or §5): the central claim that performance differences are driven by the ordering of temporal filtering and channel mixing (plus reweighting) requires that all architectural variants and reimplemented baselines were trained under identical protocols. The manuscript must explicitly confirm or tabulate that optimizer, learning-rate schedule, initialization, regularization, batch size, and train/validation/test splits were held fixed across comparisons; without this, the AUC-PR/ROC gains (0.816/0.934) cannot be confidently attributed to the hypothesized inductive bias rather than unstated implementation details.
minor comments (1)
- [Abstract] Abstract and §3: the phrase 'outperforming the state of the art on most subsystems' would be clearer if accompanied by a per-subsystem table (even if summarized) rather than a pooled statement alone.
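The sketch below illustrates why a per-subsystem table adds information beyond a pooled figure. It assumes that "pooled" means concatenating scores from all subsystems before computing the metric, and it estimates AUC-PR with the step-wise average-precision estimator. The scores and labels are toy values, not results from the paper.

```python
def average_precision(scores, labels):
    """Step-wise average-precision estimate of AUC-PR (assumes no tied scores)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos

def per_subsystem_and_pooled(results):
    """results maps subsystem name -> (scores, labels); adds a 'pooled' row."""
    table = {name: average_precision(s, y) for name, (s, y) in results.items()}
    pooled_scores = [v for s, _ in results.values() for v in s]
    pooled_labels = [v for _, y in results.values() for v in y]
    table["pooled"] = average_precision(pooled_scores, pooled_labels)
    return table

# Toy anomaly scores for two of the four subsystems (illustrative only).
toy = {
    "RFQ": ([0.9, 0.2, 0.1], [1, 0, 0]),
    "DTL": ([0.4, 0.8, 0.3], [1, 0, 0]),
}
table = per_subsystem_and_pooled(toy)
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the pooled value sits above the mean of the two per-subsystem values, so a pooled number alone can hide a weak subsystem.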
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on experimental controls. We address the single major comment below and will revise the manuscript to strengthen the documentation of training protocols.
- Referee: [Experimental setup] Experimental setup (likely §4 or §5): the central claim that performance differences are driven by the ordering of temporal filtering and channel mixing (plus reweighting) requires that all architectural variants and reimplemented baselines were trained under identical protocols. The manuscript must explicitly confirm or tabulate that optimizer, learning-rate schedule, initialization, regularization, batch size, and train/validation/test splits were held fixed across comparisons; without this, the AUC-PR/ROC gains (0.816/0.934) cannot be confidently attributed to the hypothesized inductive bias rather than unstated implementation details.
  Authors: All architectural variants and reimplemented baselines were trained under identical protocols: Adam optimizer with the same learning-rate schedule (initial 1e-3 with cosine decay), Xavier initialization, identical regularization (dropout rate 0.2 and weight decay 1e-4), fixed batch size of 32, and the same stratified train/validation/test splits (70/15/15) on the public HVCM dataset. We will add an explicit table in §4 listing these shared hyperparameters across all models to remove any ambiguity and allow direct attribution of performance differences to the architectural ordering and reweighting.
  Revision: yes
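One way to make the shared protocol checkable is to hold it in a single immutable configuration that every model variant receives, as in the sketch below. The values for optimizer, learning rate, dropout, weight decay, batch size, initialization, and split come from the rebuttal text above. The step count, the final learning rate, the variant names, and the exact cosine formula are assumptions made for illustration.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class SharedTrainingConfig:
    optimizer: str = "adam"
    initial_lr: float = 1e-3
    weight_decay: float = 1e-4
    dropout: float = 0.2
    batch_size: int = 32
    init: str = "xavier"
    split: tuple = (0.70, 0.15, 0.15)
    total_steps: int = 1000   # assumption: not stated in the rebuttal
    final_lr: float = 0.0     # assumption: not stated in the rebuttal

def cosine_lr(step, cfg):
    """Cosine decay from cfg.initial_lr at step 0 to cfg.final_lr at cfg.total_steps."""
    progress = min(max(step, 0), cfg.total_steps) / cfg.total_steps
    return cfg.final_lr + 0.5 * (cfg.initial_lr - cfg.final_lr) * (1.0 + math.cos(math.pi * progress))

cfg = SharedTrainingConfig()

# Hypothetical variant names; every variant must point at the same config.
variants = {
    "temporal_then_mix": cfg,
    "mix_then_temporal": cfg,
    "temporal_then_mix_reweighted": cfg,
}
assert len(set(variants.values())) == 1, "variants were trained under different protocols"

for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, cfg):.6f}")
```

Freezing the dataclass means an accidental per-variant override raises an error instead of silently changing the comparison.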
Circularity Check
No circularity; empirical results on public dataset
The paper reports experimental AUC-PR/ROC scores from CNN variants evaluated on the public HVCM dataset across subsystems and fault families. No equations, parameter fits, or derivations are present that could reduce the performance metrics to inputs by construction. No self-citation load-bearing steps, uniqueness theorems, or ansatzes are invoked. The central claim is an empirical comparison of architectural orderings, which is self-contained against the external benchmark of the public dataset and does not rely on any definitional or fitted-input loop.