Characterizing the Fault Response of the Intel Neural Compute Stick 2 Under Single-Pulse Electromagnetic Fault Injection
Pith reviewed 2026-05-22 05:19 UTC · model grok-4.3
The pith
A single electromagnetic pulse can collapse the top-1 accuracy of CNN inference on the Intel NCS2 to below 5 percent; the collapse persists until the model is reloaded and goes undetected at the inference-API level.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Single pulses produce four reproducible outcome classes, interpreted as no-effect, minor SDC, SEU-like persistent corruption, and SEFI-like loss of functionality. The major-degradation class drives post-collapse top-1 accuracy below five percent, persists across all subsequent inferences until explicit model reload, and occurs in 18-31 percent of trials at hotspots. No inference-API-level mechanism detects this regime. It is also inducible on an idle device with the model already loaded, which shows that load-time integrity checks alone are insufficient.
What carries the argument
Four reproducible outcome classes from single-pulse EMFI interpreted as no-effect, SDC, SEU-like persistent corruption, and SEFI-like hangs.
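The four-way classification can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the `Trial` fields and the `NOISE_TOLERANCE` value are hypothetical, and only the 5 percent collapse level comes from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    NO_EFFECT = "no-effect"
    MINOR_SDC = "minor SDC"
    MAJOR_PERSISTENT = "SEU-like persistent corruption"
    HANG = "SEFI-like loss of functionality"


@dataclass
class Trial:
    """Hypothetical per-trial record; field names are assumptions."""
    responded: bool        # False if the device hung and needed a power cycle
    baseline_top1: float   # top-1 accuracy before the pulse, in [0, 1]
    post_top1: float       # top-1 accuracy on the first post-pulse batch
    followup_top1: float   # top-1 accuracy on later batches, without reload


COLLAPSE_TOP1 = 0.05      # collapse level reported in the paper
NOISE_TOLERANCE = 0.005   # assumed measurement tolerance, not from the paper


def classify(trial: Trial) -> Outcome:
    """Map one trial to an outcome class, checking the most severe class first."""
    if not trial.responded:
        return Outcome.HANG
    if trial.post_top1 < COLLAPSE_TOP1 and trial.followup_top1 < COLLAPSE_TOP1:
        return Outcome.MAJOR_PERSISTENT
    if abs(trial.baseline_top1 - trial.post_top1) > NOISE_TOLERANCE:
        return Outcome.MINOR_SDC
    return Outcome.NO_EFFECT
```

Checking the hang first and the persistent collapse second keeps the classes mutually exclusive: a trial is only labeled minor SDC when it neither hung nor stayed collapsed.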
If this is right
- Major degradation produces top-1 accuracy below five percent that remains low on every subsequent inference until explicit reload.
- The regime is undetectable by any inference-API-level mechanism.
- The same persistent degradation can be induced by pulses delivered to an idle device that already holds the loaded model.
- Load-time integrity checks alone are therefore insufficient to prevent the effect.
- Mitigation strategies can be graded by outcome class and implemented at the application level without changes to firmware or the OpenVINO runtime.
Where Pith is reading between the lines
- Application-level output monitoring or consistency checks across consecutive inferences could flag the persistent degradation in deployed systems.
- Similar single-pulse EMFI behavior may appear in other commercial vision-processing units used in edge safety applications.
- Periodic model re-verification or checksums during runtime might reduce the window in which undetected degradation can affect decisions.
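One way to picture the application-level monitoring suggested above is a canary check: a few inputs with known labels are run periodically, and a reload is triggered when their accuracy drops. The sketch below is an assumption-laden illustration. `infer` and `reload_model` are hypothetical callables supplied by the application, not OpenVINO functions, and the `min_accuracy` threshold is arbitrary.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interface: `infer` maps one input to a predicted class index.
Infer = Callable[[object], int]


def canary_accuracy(infer: Infer, canaries: Sequence[Tuple[object, int]]) -> float:
    """Fraction of canary inputs whose prediction matches the stored label."""
    if not canaries:
        raise ValueError("at least one canary input is required")
    hits = sum(1 for x, expected in canaries if infer(x) == expected)
    return hits / len(canaries)


def check_and_recover(
    infer: Infer,
    reload_model: Callable[[], None],
    canaries: Sequence[Tuple[object, int]],
    min_accuracy: float = 0.5,  # assumed threshold, tune per deployment
) -> bool:
    """Reload the model if canary accuracy is too low; return True if reloaded."""
    if canary_accuracy(infer, canaries) >= min_accuracy:
        return False
    reload_model()
    return True
```

Because the reported collapse leaves top-1 accuracy below five percent, even a small canary set would separate the healthy and collapsed states clearly. The check only bounds the time the degradation goes unnoticed; it does not prevent it.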
Load-bearing premise
The observed major persistent degradation is produced by the single electromagnetic pulses rather than by unrelated factors such as power supply noise or software timing.
What would settle it
Repeating the spot-test trials at the same characterized hotspots while applying no electromagnetic pulse and checking whether the major persistent degradation class still appears at rates near 18-31 percent.
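How many no-pulse control trials such a test needs can be estimated with a short calculation. The sketch below assumes independent trials and covers only the case where the control arm shows zero major-degradation events; the function names are illustrative.

```python
import math


def zero_event_upper_bound(n_trials: int, alpha: float = 0.05) -> float:
    """One-sided exact upper bound on the event rate after 0 events in n_trials.

    With true rate p, the chance of seeing no events is (1 - p) ** n_trials;
    the bound is the p at which that chance falls to alpha.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    return 1.0 - alpha ** (1.0 / n_trials)


def control_trials_needed(rate_to_exclude: float, alpha: float = 0.05) -> int:
    """Smallest n for which 0 events in n trials excludes rate_to_exclude."""
    if not 0.0 < rate_to_exclude < 1.0:
        raise ValueError("rate_to_exclude must be in (0, 1)")
    return math.ceil(math.log(alpha) / math.log(1.0 - rate_to_exclude))
```

Under these assumptions, `control_trials_needed(0.18)` gives 16: sixteen clean control trials at a hotspot would already be inconsistent, at the 5 percent level, with a background rate as high as the lower end of the reported 18-31 percent range. Ruling out smaller background rates requires proportionally more trials.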
Original abstract
Vision processing units and other commercial neural-network inference accelerators are increasingly deployed in safety-relevant edge applications, but their fault response under transient hardware disturbances remains poorly characterized in the open literature. For the Intel Movidius Myriad X, packaged as the Intel Neural Compute Stick 2 (NCS2), only a single feasibility study has been published. We report a systematic single-pulse electromagnetic fault injection (EMFI) campaign on the NCS2 running three ImageNet-trained convolutional neural networks (ResNet-18, ResNet-50, VGG-11) on the OpenVINO runtime. Across 1,536 spot-test trials at characterized hotspots and approximately 16,000 parameter-search trials, single pulses produce four reproducible outcome classes: no measured accuracy change, minor silent data corruption, major persistent degradation that survives across subsequent inferences until model reload, and device hangs requiring USB power-cycling; these outcomes are respectively interpreted as no-effect, SDC with possible SET-like or small persistent-state mechanisms, SEU-like persistent corruption, and SEFI-like loss of functionality. Two findings are central. First, the major-degradation class can be induced at 18-31% of trials at characterized hotspots, with post-collapse top-1 accuracy below five percent and persistence across all subsequent inferences until explicit model reload - a regime that no inference-API-level mechanism detects. Second, this regime is also inducible by pulses delivered to an idle device with the model already loaded, demonstrating that load-time integrity checks alone are insufficient. We discuss mitigation strategies graded by class, focusing on mechanisms implementable at the application level without modification to the device firmware or the OpenVINO runtime.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper reports results from a systematic single-pulse electromagnetic fault injection (EMFI) campaign on the Intel Neural Compute Stick 2 (NCS2) running ResNet-18, ResNet-50, and VGG-11 on the OpenVINO runtime. Across 1,536 spot-test trials at characterized hotspots and approximately 16,000 parameter-search trials, the authors identify four reproducible outcome classes: no measured accuracy change, minor silent data corruption, major persistent degradation (post-collapse top-1 accuracy below 5% that persists across subsequent inferences until explicit model reload), and device hangs requiring USB power-cycling. These are interpreted as no-effect, SDC/SET-like, SEU-like persistent corruption, and SEFI-like loss of functionality. Central claims are that the major-degradation class occurs at 18-31% of trials at hotspots, is undetectable by inference-API mechanisms, and remains inducible on idle devices with the model already loaded (showing load-time checks are insufficient). The authors discuss graded mitigation strategies implementable at the application level.
Significance. If the central experimental findings hold after addressing controls, the work provides a useful open characterization of transient fault responses in a commercial neural inference accelerator, which is relevant for safety-critical edge deployments. The scale of the campaign (over 17,000 total trials) and the identification of a low-accuracy regime that persists across inferences until the model is explicitly reloaded are strengths that could inform hardware-security practices. The paper correctly notes the absence of prior systematic studies beyond one feasibility paper and supplies reproducible outcome classes that future work can build upon.
major comments (2)
- [Abstract and §4] Abstract and §4 (Results): The attribution of the major persistent degradation class to an SEU-like mechanism induced by single-pulse EMFI is load-bearing for the headline claims (18-31% induction rate, undetectability by APIs, and insufficiency of load-time checks), yet the manuscript reports no controls that would exclude coincident power-rail transients, USB timing glitches, or OpenVINO runtime state corruption as alternative explanations for the observed persistent accuracy collapse to <5% top-1.
- [§3 and §4] §3 (Experimental Methodology) and §4: The outcome classification and the reported 18-31% induction rates at hotspots rest on post-injection accuracy measurements without visible error bars, per-class trial counts, or explicit exclusion criteria for non-EMFI artifacts; this weakens the statistical grounding of the central claim that the major-degradation regime is reliably produced by the fault injection.
minor comments (2)
- [Abstract] Abstract: The phrase 'approximately 16,000 parameter-search trials' should be replaced by the exact total and a breakdown by outcome class to allow readers to assess coverage.
- [§5] §5 (Discussion): The mitigation strategies are described at a high level; adding a short table or pseudocode examples for the proposed application-level checks would improve clarity without altering the technical contribution.
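As one illustration of the kind of pseudocode the last comment asks for, the SEFI-like hang class could be handled with a deadline around each inference call. This is a sketch under stated assumptions, not the authors' mitigation: `infer` is a hypothetical zero-argument callable wrapping the application's own inference call, and recovery (USB power-cycling, per the paper) is left to the caller.

```python
import concurrent.futures
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def infer_with_deadline(infer: Callable[[], T], timeout_s: float) -> Optional[T]:
    """Run `infer` in a worker thread; return None if it misses the deadline."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(infer)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Treat a missed deadline as a suspected hang.
        return None
    finally:
        # Do not block on a possibly hung call; the caller decides how to recover.
        pool.shutdown(wait=False)
```

A `None` result only signals that the deadline passed; it cannot distinguish a hung device from a slow one, so the timeout should sit well above normal inference latency. If `infer` can legitimately return `None`, a dedicated sentinel would be needed instead.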
Simulated Author's Rebuttal
We thank the referee for their careful and constructive review of our manuscript. We address each major comment in turn below, providing the strongest honest responses we can offer based on the experiments performed. Where the comments identify opportunities to strengthen statistical presentation or experimental controls, we have revised the manuscript accordingly.
- Referee: [Abstract and §4] Abstract and §4 (Results): The attribution of the major persistent degradation class to an SEU-like mechanism induced by single-pulse EMFI is load-bearing for the headline claims (18-31% induction rate, undetectability by APIs, and insufficiency of load-time checks), yet the manuscript reports no controls that would exclude coincident power-rail transients, USB timing glitches, or OpenVINO runtime state corruption as alternative explanations for the observed persistent accuracy collapse to <5% top-1.
  Authors: We agree that explicit discussion of alternative explanations strengthens the attribution. The persistence of the low-accuracy state across repeated inferences until an explicit model reload is the primary basis for interpreting the outcome as persistent state corruption rather than a transient power or timing artifact; transient glitches would be expected either to resolve on the next inference or to produce immediate hangs, neither of which matches the observed behavior. Experiments performed on idle devices with the model already resident further reduce the likelihood of load-time runtime corruption. Nevertheless, to address the referee’s concern directly we will add a new subsection in §3 that documents the power-rail monitoring, USB timing verification, and hotspot-characterization procedures used to minimize and detect non-EMFI confounds.
  Revision: yes
- Referee: [§3 and §4] §3 (Experimental Methodology) and §4: The outcome classification and the reported 18-31% induction rates at hotspots rest on post-injection accuracy measurements without visible error bars, per-class trial counts, or explicit exclusion criteria for non-EMFI artifacts; this weakens the statistical grounding of the central claim that the major-degradation regime is reliably produced by the fault injection.
  Authors: We accept that the current presentation would benefit from greater statistical transparency. In the revised manuscript we will report the exact number of trials falling into each outcome class, include binomial confidence intervals or standard-error bars on the 18-31% hotspot rates, and state the explicit exclusion criteria applied to trials affected by device instability or USB enumeration failures. These additions will appear in §4 with a brief reference in §3.
  Revision: yes
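The binomial confidence intervals mentioned in this response could be computed with a Wilson score interval, which needs only the standard library. The sketch below is an illustration rather than the authors' analysis; the counts passed to it in any example are placeholders, since the per-hotspot trial counts are not given here.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.959964 for 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    return center - half, center + half


# Placeholder counts for illustration only, not data from the paper.
low, high = wilson_interval(successes=25, trials=100)
```

The Wilson interval is used here instead of the simpler normal approximation because it stays within [0, 1] and behaves sensibly for small counts, which matters if some hotspots have few trials.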
Circularity Check
No circularity: purely experimental characterization
The paper conducts direct empirical fault-injection trials on the NCS2 hardware, measures post-injection top-1 accuracy and device state across thousands of trials, and classifies observed outcomes into four reproducible classes. No equations, fitted parameters, predictions, or derivations appear; outcome classes are defined by measured accuracy thresholds and persistence behavior rather than by any self-referential construction. All central claims rest on external benchmarks (ImageNet accuracy, USB power-cycle recovery) and are falsifiable by replication. Self-citations, if present, are not load-bearing for any derivation. This is the normal case of an experimental characterization paper whose results do not reduce to their inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Single electromagnetic pulses produce distinguishable and reproducible outcome classes (no-effect, minor SDC, major persistent degradation, device hang) that can be mapped to hardware fault models such as SET/SEU/SEFI.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "single pulses produce four reproducible outcome classes: no measured accuracy change, minor silent data corruption, major persistent degradation... interpreted as no-effect, SDC..., SEU-like..., SEFI-like"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.