pith. sign in

arxiv: 2605.31259 · v1 · pith:5TM4DHK5new · submitted 2026-05-29 · 💻 cs.LG

Lightweight CNN-Based Anomaly Detection for High Voltage Converter Modulators in the Spallation Neutron Source

Pith reviewed 2026-06-28 23:28 UTC · model grok-4.3

classification 💻 cs.LG
keywords anomaly detectionconvolutional neural networkshigh voltage converter modulatorsfault detectiontime seriesinductive biasSpallation Neutron Source
0
0 comments X

The pith

Ordering temporal filtering before cross-channel mixing in CNNs raises anomaly detection performance on HVCM pulse data to AUC-PR 0.816.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether the sequence in which a CNN applies time-wise filtering and channel-wise mixing affects its ability to spot fault precursors in multi-channel high-voltage converter modulator recordings. It compares several lightweight architectures that differ only in that ordering and adds an optional per-pulse channel reweighting step. On the public SNS HVCM dataset covering four subsystems and six fault families the strongest variant reaches a pooled AUC-PR of 0.816 and AUC-ROC of 0.934, beating prior published results on most subsystems and five of the six fault types. Ablation results tie the gains to three dominant sensor channels and show that faults whose precursors appear as isolated amplitude changes are easier to catch than those requiring coordinated multi-channel patterns.

Core claim

Varying the order of temporal convolution and cross-channel mixing, together with adaptive channel reweighting, produces CNNs whose detection performance on the public HVCM dataset reaches a pooled AUC-PR of 0.816 and AUC-ROC of 0.934, exceeding the previous state of the art on most subsystems and five of the six fault families. Per-fault-family results further indicate that performance tracks whether a fault's precursors appear as amplitude shifts in single channels or as subtler statistical dependencies across channels.

What carries the argument

The ordering of 1-D temporal filtering and cross-channel mixing layers inside the CNN, optionally combined with per-pulse adaptive channel reweighting.

If this is right

  • Detection accuracy improves when temporal filtering precedes cross-channel mixing for faults whose signatures are primarily amplitude shifts in individual sensors.
  • Adaptive per-pulse channel reweighting further lifts sensitivity on faults that require joint representations across channels.
  • Three sensor channels dominate performance across the tested subsystems and fault families.
  • The same ordering principle can be applied to other multi-channel pulse or waveform datasets collected at accelerator facilities.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same architectural ordering test could be run on other time-series anomaly problems that mix independent sensor streams with occasional coordinated events.
  • If the ordering effect generalizes, it supplies a cheap way to improve existing CNN pipelines without increasing model size.
  • Future work could measure whether the identified dominant channels remain stable when the same models are deployed on newer HVCM hardware.

Load-bearing premise

The public HVCM dataset and its fault labels are representative of real operating conditions, and measured performance differences arise from the tested architectural ordering rather than from training details or data splits.

What would settle it

Retraining the same architectures on a fresh random split of the HVCM recordings or with altered hyperparameters and checking whether the reported performance gap over the prior state of the art remains.

Figures

Figures reproduced from arXiv: 2605.31259 by Alberto D. Cencillo, Isaac Triguero, Juli\'an Luengo, Leonardo Concepci\'on.

Figure 1
Figure 1. Figure 1: Overview of the SNS accelerator. Figure 1: Overview of SNS. Original image by Hoover et al. [13]. [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Waveforms of channels C.Flux, Mod-V and CB-I in every of the four subsystems: RFQ, DTV, CCL and [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Cross-channel correlation analysis across RFQ and CCL subsystems. Each cell shows the absolute difference [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: DS architecture. Time 1D-CNN Features 1D-CNN 1D-CNN Features Features Time Features AvgPool Classifier Head Normal Fault PW-First block Linear Projection Time ... [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: PW-First architecture. construct such combinations before the temporal filters fire. The same total parameter budget is, by construction, very close to that of DS, but the inductive bias is genuinely different. PW-First architecture is illustrated in [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Distribution of test AUC-PR across the four subsystems and 15 seeds ( [PITH_FULL_IMAGE:figures/full_fig_p011_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Channel values per fault family. 14 [PITH_FULL_IMAGE:figures/full_fig_p014_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: C-Flux shape per anomaly type. reliance on Power (0.222) than any of the joint-mixing variants. The Power-dominant ordering is therefore a property of architectures that mix all 14 input channels in their first layer. Once mixing is forbidden until after a per-channel temporal stage (DS), the network’s reliance shifts toward the group whose channels carry the most discriminative temporal structure (Flux), … view at source ↗
read the original abstract

Unscheduled trips of high-power pulsed converters are a leading source of downtime at large accelerator facilities. At the Spallation Neutron Source (SNS), the High Voltage Converter Modulators (HVCMs) are consistently the second-largest contributor to lost beam time. Each HVCM pulse is recorded across sensor channels spanning currents, voltages, and magnetic fluxes, whose mutual interactions encode the operating state of the system. Fault precursors do not manifest uniformly across these channels: depending on fault type, they may alter the temporal structure of individual signals, change the statistical dependencies among channels, or both. Existing deep-learning approaches typically process multi-channel signals with standard convolutional pipelines that entangle temporal and cross-channel operations from the first layer, giving the model no explicit mechanism to represent channel independence or structured inter-channel interaction. We hypothesise that architectural inductive bias, specifically the ordering of temporal filtering and cross-channel mixing, plays a central role in detection performance on this class of data. To test this, we vary the order in which these two operations are applied, and examine whether per-pulse adaptive channel reweighting further improves sensitivity. Evaluated on the public HVCM dataset across all four SNS subsystems (RFQ, DTL, CCL, SCL), our best variant achieves a pooled AUC-PR of 0.816 and AUC-ROC of 0.934, outperforming the state of the art on most subsystems and five of the six fault families. Ablations identify three dominant input channels and link per-fault-family performance to whether precursors manifest as amplitude shifts in individual channels or as subtler patterns requiring joint channel representations to surface.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper claims that varying the order of temporal filtering versus cross-channel mixing in lightweight CNNs, combined with per-pulse adaptive channel reweighting, yields improved anomaly detection on multi-channel HVCM pulse data from the Spallation Neutron Source. On the public HVCM dataset across RFQ, DTL, CCL, and SCL subsystems, the best variant reports pooled AUC-PR of 0.816 and AUC-ROC of 0.934, outperforming prior art on most subsystems and five of six fault families. Ablations identify three dominant channels and tie per-family performance to whether faults appear as single-channel amplitude shifts or require joint representations.

Significance. If the reported gains are shown to arise from the tested architectural ordering under controlled conditions, the work supplies concrete evidence that inductive bias in the sequencing of temporal and cross-channel operations matters for this class of sensor data. The public dataset and explicit ablation on channel importance are strengths that support reproducibility and domain insight; the result could guide CNN design choices for other multi-channel pulsed systems where faults manifest unevenly across sensors.

major comments (1)
  1. [Experimental setup] Experimental setup (likely §4 or §5): the central claim that performance differences are driven by the ordering of temporal filtering and channel mixing (plus reweighting) requires that all architectural variants and reimplemented baselines were trained under identical protocols. The manuscript must explicitly confirm or tabulate that optimizer, learning-rate schedule, initialization, regularization, batch size, and train/validation/test splits were held fixed across comparisons; without this, the AUC-PR/ROC gains (0.816/0.934) cannot be confidently attributed to the hypothesized inductive bias rather than unstated implementation details.
minor comments (1)
  1. [Abstract] Abstract and §3: the phrase 'outperforming the state of the art on most subsystems' would be clearer if accompanied by a per-subsystem table (even if summarized) rather than a pooled statement alone.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive feedback on experimental controls. We address the single major comment below and will revise the manuscript to strengthen the documentation of training protocols.

read point-by-point responses
  1. Referee: [Experimental setup] Experimental setup (likely §4 or §5): the central claim that performance differences are driven by the ordering of temporal filtering and channel mixing (plus reweighting) requires that all architectural variants and reimplemented baselines were trained under identical protocols. The manuscript must explicitly confirm or tabulate that optimizer, learning-rate schedule, initialization, regularization, batch size, and train/validation/test splits were held fixed across comparisons; without this, the AUC-PR/ROC gains (0.816/0.934) cannot be confidently attributed to the hypothesized inductive bias rather than unstated implementation details.

    Authors: All architectural variants and reimplemented baselines were trained under identical protocols: Adam optimizer with the same learning-rate schedule (initial 1e-3 with cosine decay), Xavier initialization, identical regularization (dropout rate 0.2 and weight decay 1e-4), fixed batch size of 32, and the same stratified train/validation/test splits (70/15/15) on the public HVCM dataset. We will add an explicit table in §4 listing these shared hyperparameters across all models to remove any ambiguity and allow direct attribution of performance differences to the architectural ordering and reweighting. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical results on public dataset

full rationale

The paper reports experimental AUC-PR/ROC scores from CNN variants evaluated on the public HVCM dataset across subsystems and fault families. No equations, parameter fits, or derivations are present that could reduce the performance metrics to inputs by construction. No self-citation load-bearing steps, uniqueness theorems, or ansatzes are invoked. The central claim is an empirical comparison of architectural orderings, which is self-contained against the external benchmark of the public dataset and does not rely on any definitional or fitted-input loop.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no free parameters, axioms, or invented entities are described.

pith-pipeline@v0.9.1-grok · 5842 in / 1074 out tokens · 17216 ms · 2026-06-28T23:28:18.270700+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

33 extracted references · 2 canonical work pages

  1. [1]

    D. E. Anderson, V . Peplov, D. J. Solley, and M. Wezensky. Recent developments in the improvement campaign for the high voltage converter modulator at the spallation neutron source. InIEEE International Power Modulator and High Voltage Conference (IPMHVC), pages 672–675, 2014

  2. [2]

    T. E. Mason, D. Abernathy, J. Ankner, A. Ekkebus, G. Granroth, M. Hagen, K. Herwig, C. Hoffmann, C. Horak, F. Klose, S. Miller, J. Neuefeind, C. Tulk, and X.-L. Wang. The spallation neutron source: A powerful tool for materials research.AIP Conference Proceedings, 773(1):21–25, 2005

  3. [3]

    Radaideh, Dan Lu, Pradeep Ramuhalli, and Sarah Cousineau

    Yasir Alanazi, Malachi Schram, Kishansingh Rajput, Steven Goldenberg, Lasitha Vidyaratne, Chris Pappas, Majdi I. Radaideh, Dan Lu, Pradeep Ramuhalli, and Sarah Cousineau. Multi-module based CV AE to predict HVCM faults in the SNS accelerator.Machine Learning with Applications, 2023

  4. [4]

    Radaideh, Chris Pappas, Jared Walden, Dan Lu, Lasitha Vidyaratne, Thomas Britton, Kishansingh Rajput, Malachi Schram, and Sarah Cousineau

    Majdi I. Radaideh, Chris Pappas, Jared Walden, Dan Lu, Lasitha Vidyaratne, Thomas Britton, Kishansingh Rajput, Malachi Schram, and Sarah Cousineau. Time series anomaly detection in power electronics signals with recurrent and ConvLSTM autoencoders.Digital Signal Processing, 130:103704, 2022

  5. [5]

    Particle accelerator power system early fault diagnosis based on deep learning and multi-sensor feature fusion.Engineering Research Express, 6:025225, 2024

    Jiqing Zhou, Deming Li, and Haijun Su. Particle accelerator power system early fault diagnosis based on deep learning and multi-sensor feature fusion.Engineering Research Express, 6:025225, 2024. 19 Lightweight CNN-Based Anomaly Detection for HVCM in SNS

  6. [6]

    Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam

    Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam. A time series is worth 64 words: Long-term forecasting with transformers. InInternational Conference on Learning Representations (ICLR), 2023

  7. [7]

    Xception: deep learning with depthwise separable convolutions

    Franc ¸ois Chollet. Xception: deep learning with depthwise separable convolutions. InIEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1251–1258, 2017

  8. [8]

    Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco An- dreetto, and Hartwig Adam

    Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco An- dreetto, and Hartwig Adam. MobileNets: efficient convolutional neural networks for mobile vision applications, 2017

  9. [9]

    Squeeze-and-excitation networks

    Jie Hu, Li Shen, and Gang Sun. Squeeze-and-excitation networks. InIEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 7132–7141, 2018

  10. [10]

    Radaideh, Chris Pappas, and Sarah Cousineau

    Majdi I. Radaideh, Chris Pappas, and Sarah Cousineau. Real electronic signal data from particle accelerator power systems for machine learning anomaly detection.Data in Brief, 43:108473, 2022

  11. [11]

    W. A. Reass, S. Apgar, D. Baca, D. Borovina, J. Bradle, J. Doss, J. Gonzales, R. Gribble, T. Hardek, M. Lynch, D. Rees, P. Tallerico, P. Trujillo, D. Anderson, D. Heidenreich, J. Hicks, and V . Leontiev. Design, status, and first operations of the spallation neutron source polyphase resonant converter modulator system. InProceedings of the 2003 Particle A...

  12. [12]

    D. J. Solley, D. E. Anderson, G. P. Patel, V . V . Peplov, R. Saethre, and M. W. Wezensky. HVCM topology enhancements to support a power upgrade required by a second target station (STS) at SNS. InIEEE International Power Modulator and High Voltage Conference (IPMHVC), pages 362–365, 2012

  13. [13]

    Feasibility of proton bunch compression in an operational high-power accelerator

    Austin Hoover, Vasiliy Morozov, Amith Narayan, and Maurice Piller. Feasibility of proton bunch compression in an operational high-power accelerator. In6th North American Particle Accelerator Conference (NAPAC’25), 2025

  14. [14]

    Machine learning for improved availability of the SNS klystron high voltage converter modulators

    Chris Pappas, Dan Lu, Malachi Schram, and Draguna Vrabie. Machine learning for improved availability of the SNS klystron high voltage converter modulators. In12th International Particle Accelerator Conference (IPAC’21), pages 4303–4306, 2021

  15. [15]

    IEEE Transactions on Nuclear Science 63, 878–897

    Auralee L. Edelen, Sandra Biedron, Brian Chase, Daniel Edstrom, Stephen Milton, and Pierluigi Stabile. Neural networks for modeling and control of particle accelerators.IEEE Transactions on Nuclear Science (TNS), 63(2): 878–897, 2016. doi: 10.1109/TNS.2016.2543203

  16. [16]

    Nguyen, M

    D. Nguyen, M. Lee, R. Sass, and H. Shoaee. Accelerator and feedback control simulation using neural networks. Technical Report SLAC-PUB-5503, Stanford Linear Accelerator Center, 1991

  17. [17]

    Superconducting radio-frequency cavity fault classification using machine learning at Jefferson Laboratory.Physical Review Accelerators and Beams, 23:114601, 2020

    Chris Tennant, Adam Carpenter, Tom Powers, Anna Shabalina Solopova, Lasitha Vidyaratne, and Khan Iftekharuddin. Superconducting radio-frequency cavity fault classification using machine learning at Jefferson Laboratory.Physical Review Accelerators and Beams, 23:114601, 2020

  18. [18]

    Deep learning based superconducting radio-frequency cavity fault classification at Jefferson Laboratory.Frontiers in Artificial Intelligence, 4:718950, 2022

    Lasitha Vidyaratne, Adam Carpenter, Tom Powers, Chris Tennant, Khan Iftekharuddin, Md Monibor Rahman, and Anna Shabalina. Deep learning based superconducting radio-frequency cavity fault classification at Jefferson Laboratory.Frontiers in Artificial Intelligence, 4:718950, 2022

  19. [19]

    Detection of faulty beam position monitors using unsupervised learning.Physical Review Accelerators and Beams, 23:102805, 2020

    Elena Fol, Rogelio Tom ´as, Jaime Coello de Portugal, and Giuliano Franchetti. Detection of faulty beam position monitors using unsupervised learning.Physical Review Accelerators and Beams, 23:102805, 2020

  20. [20]

    Edelen and Nathan M

    Jonathan P. Edelen and Nathan M. Cook. Anomaly detection in particle accelerators using autoencoders, 2021

  21. [21]

    Machine learning-based anomaly detection for particle accelerators

    Davide Marcato, Giovanni Arena, Damiano Bortolato, Fabio Gelain, Vincenzo Martinelli, Enrico Munaron, Marco Roetta, Giorgia Savarese, and Gian Antonio Susto. Machine learning-based anomaly detection for particle accelerators. InIEEE Conference on Control Technology and Applications (CCTA), pages 240–246, 2021

  22. [22]

    Using LSTM recurrent neural networks for monitoring the LHC superconducting magnets.Nuclear Instruments and Methods in Physics Research Section A, 867:40–50, 2017

    Maciej Wielgosz, Andrzej Skoczen, and Matej Mertik. Using LSTM recurrent neural networks for monitoring the LHC superconducting magnets.Nuclear Instruments and Methods in Physics Research Section A, 867:40–50, 2017

  23. [23]

    Maciej Wielgosz, Matej Mertik, and Andrzej Skoczen. The model of an anomaly detector for HiLumi LHC magnets based on recurrent neural networks and adaptive quantization.Engineering Applications of Artificial Intelligence, 74:166–185, 2018. 20 Lightweight CNN-Based Anomaly Detection for HVCM in SNS

  24. [24]

    Predicting disruptive instabilities in controlled fusion plasmas through deep learning.Nature, 568:526–531, 2019

    Julian Kates-Harbeck, Alexey Svyatkovskiy, and William Tang. Predicting disruptive instabilities in controlled fusion plasmas through deep learning.Nature, 568:526–531, 2019

  25. [25]

    DisruptionPy: an open framework for tokamak disruption-prediction data pipelines.Journal of Open Source Software, 2024

    Henry Lucas et al. DisruptionPy: an open framework for tokamak disruption-prediction data pipelines.Journal of Open Source Software, 2024

  26. [26]

    Anomaly detection of particle accelerators using spatial-temporal contrastive fusion of multi-sensor time series.Reliability Engineering & System Safety, 267:111940, 2026

    Zhe Yang, Ruichang Zhou, Yongcheng He, Jianyu Long, Lin Fang, and Chuan Li. Anomaly detection of particle accelerators using spatial-temporal contrastive fusion of multi-sensor time series.Reliability Engineering & System Safety, 267:111940, 2026

  27. [27]

    Predicting particle accelerator failures using binary clas- sifiers.Nuclear Instruments and Methods in Physics Research Section A, 955:163240, 2020

    Miha Re ˇsˇciˇc, Rebecca Seviour, and Willem Blokland. Predicting particle accelerator failures using binary clas- sifiers.Nuclear Instruments and Methods in Physics Research Section A, 955:163240, 2020

  28. [28]

    Miha Re ˇsˇciˇc, Rebecca Seviour, and Willem Blokland. Improvements of pre-emptive identification of particle accelerator failures using binary classifiers and dimensionality reduction.Nuclear Instruments and Methods in Physics Research Section A, 1025:166064, 2022

  29. [29]

    Uncertainty aware anomaly detection to predict errant beam pulses in the Oak Ridge spallation neutron source accelerator.Physical Review Accelerators and Beams, 25:122802, 2022

    Willem Blokland, Kishansingh Rajput, Malachi Schram, Torri Jeske, Pradeep Ramuhalli, Charles Peters, Yigit Yucesan, and Alexander Zhukov. Uncertainty aware anomaly detection to predict errant beam pulses in the Oak Ridge spallation neutron source accelerator.Physical Review Accelerators and Beams, 25:122802, 2022

  30. [30]

    Robust errant beam prognostics with conditional modeling for particle accelerators.Machine Learning: Science and Technology, 2023

    Kishansingh Rajput, Malachi Schram, Willem Blokland, et al. Robust errant beam prognostics with conditional modeling for particle accelerators.Machine Learning: Science and Technology, 2023

  31. [31]

    Radaideh, Chris Pappas, Dan Lu, Jared Walden, Sarah Cousineau, Thomas Britton, Kishansingh Rajput, Lasitha Vidyaratne, and Malachi Schram

    Majdi I. Radaideh, Chris Pappas, Dan Lu, Jared Walden, Sarah Cousineau, Thomas Britton, Kishansingh Rajput, Lasitha Vidyaratne, and Malachi Schram. Progress on machine learning for the SNS high voltage converter modulators. In5th North American Particle Accelerator Conference (NAPAC’22), pages 709–712, 2022

  32. [32]

    ModernTCN: a modern pure convolution structure for general time series analysis

    Donghao Luo and Xue Wang. ModernTCN: a modern pure convolution structure for general time series analysis. InInternational Conference on Learning Representations (ICLR), 2024

  33. [33]

    The elephant in the room: towards a reliable time-series anomaly detection benchmark (TSB-AD)

    Qinghua Liu and John Paparrizos. The elephant in the room: towards a reliable time-series anomaly detection benchmark (TSB-AD). InAdvancements in Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks Track, 2024. 21