Model-free Anomaly Detection for Dynamical Systems with Gaussian Processes

Alejandro Penacho Riveiros; Matthieu Barreau; Nicola Bastianello

arxiv: 2604.11629 · v1 · submitted 2026-04-13 · 📡 eess.SY · cs.SY

Pith reviewed 2026-05-10 15:10 UTC · model grok-4.3

classification 📡 eess.SY cs.SY

keywords: anomaly detection, Gaussian processes, dynamical systems, model-free, nominal data, false positive rate, quality control, system degradation

The pith

A Gaussian process trained on nominal historical data detects anomalies in dynamical systems by checking new measurements against a false-positive-controlled threshold.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes training a Gaussian process offline on data from normal system operations. This model is then run online to test whether incoming state measurements remain compatible with the nominal behavior or show a deviation. The approach handles process and measurement noise by using a threshold chosen to deliver a specified false positive rate. It targets applications such as quality control for new equipment and spotting degradation or repairs. A reader would care because the method works without needing an explicit mathematical model of the system dynamics.

Core claim

We train a Gaussian process on historical nominal data offline. Online, the model assesses whether new measurements of the system state are consistent with nominal operations or indicate an anomaly, with the decision based on a threshold that ensures a chosen false positive rate even when process and measurement noise are present.

What carries the argument

An offline-trained Gaussian process that models nominal dynamics and supplies a probabilistic compatibility test for new measurements, decided by a threshold calibrated to a target false positive rate.
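
The offline-train, online-test pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `NominalGP`, `rbf_kernel`, `nlpd`, and `transitions`, the toy scalar dynamics, the fixed hyperparameters, and the choice to calibrate the threshold on a held-out nominal set are all hypothetical. Process and measurement noise are lumped into a single Gaussian noise term for simplicity.

```python
import numpy as np

def rbf_kernel(a, b, length=1.0, signal=1.0):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return signal**2 * np.exp(-0.5 * np.maximum(sq, 0.0) / length**2)

class NominalGP:
    """Zero-mean GP regression with fixed (not optimized) hyperparameters."""

    def __init__(self, x, y, noise=0.05, length=1.0, signal=1.0):
        self.x, self.noise, self.length, self.signal = x, noise, length, signal
        k = rbf_kernel(x, x, length, signal) + noise**2 * np.eye(len(x))
        self.chol = np.linalg.cholesky(k)
        self.weights = np.linalg.solve(self.chol.T, np.linalg.solve(self.chol, y))

    def predict(self, xs):
        """Predictive mean and variance of a noisy observation at inputs xs."""
        ks = rbf_kernel(xs, self.x, self.length, self.signal)
        v = np.linalg.solve(self.chol, ks.T)
        latent_var = np.maximum(self.signal**2 - np.sum(v**2, axis=0), 0.0)
        return ks @ self.weights, latent_var + self.noise**2

def nlpd(gp, xs, ys):
    """Negative log predictive density of each observed next state."""
    mean, var = gp.predict(xs)
    return 0.5 * np.log(2.0 * np.pi * var) + 0.5 * (ys - mean) ** 2 / var

def transitions(f, n, rng, noise=0.05):
    """Toy one-step data: states drawn uniformly, next state = f(state) + noise."""
    x = rng.uniform(-2.0, 2.0, size=(n, 1))
    return x, f(x[:, 0]) + noise * rng.standard_normal(n)

rng = np.random.default_rng(0)
nominal = lambda x: 0.9 * np.sin(x)          # hypothetical nominal dynamics
degraded = lambda x: 0.9 * np.sin(x) + 0.3   # hypothetical anomalous dynamics
alpha = 0.05                                 # target false positive rate

# Offline: fit on nominal data, then calibrate the threshold on held-out nominal data.
gp = NominalGP(*transitions(nominal, 200, rng))
threshold = np.quantile(nlpd(gp, *transitions(nominal, 500, rng)), 1.0 - alpha)

# Online: flag a measurement when its score exceeds the threshold.
fpr = np.mean(nlpd(gp, *transitions(nominal, 2000, rng)) > threshold)
detection_rate = np.mean(nlpd(gp, *transitions(degraded, 2000, rng)) > threshold)
print(f"realized FPR: {fpr:.3f}, detection rate: {detection_rate:.3f}")
```

In this toy setting the transitions are sampled independently, so the empirical quantile should land close to the target rate. Samples taken along a real trajectory are correlated, which is exactly where the calibration question becomes harder.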

If this is right

Newly manufactured systems can be tested for compliance with desired nominal performance without constructing an explicit model.
Changes in dynamics from degradation or repairs become detectable during operation.
Detection remains viable when both process noise and measurement noise are present.
The false positive rate can be set in advance to suit operational tolerance.
Performance is illustrated on two example dynamical systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Periodic retraining of the Gaussian process could allow the detector to track slowly varying nominal conditions without manual intervention.
The same offline-training and threshold-calibration pattern might transfer to other nonparametric regressors for anomaly detection in time-series data.
A direct test would be to measure how detection accuracy drops when the training set omits some normal operating regimes.
Industrial monitoring applications could use this to lower dependence on first-principles models for fault detection.

Load-bearing premise

The historical nominal data must represent all normal operating conditions, and the Gaussian process must capture the system dynamics accurately enough despite noise.

What would settle it

Apply the detector to a system where known anomalies have been deliberately introduced and check whether the observed detection rate falls short of what the calibrated threshold predicts or whether false positives on fresh nominal data exceed the design value.
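
One way to make the false-positive half of this check concrete is a one-sided binomial test on alarms raised over fresh nominal data. The sketch below is an assumption-laden illustration, not something the paper describes: the function names, the significance level, and the alarm logs are hypothetical, and the test treats alarms as independent, which consecutive samples from a single trajectory may violate.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P[X >= k] for X ~ Binomial(n, p); fine for n up to a few hundred."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))

def check_false_positive_rate(alarms, alpha, level=0.01):
    """Compare alarms raised on fresh nominal samples with the design rate alpha.

    Returns (realized_fpr, p_value, exceeds_design). Assumes the alarms are
    independent, which correlated samples from one trajectory may violate.
    """
    n, k = len(alarms), sum(alarms)
    p_value = binomial_upper_tail(k, n, alpha)
    return k / n, p_value, p_value < level

# Hypothetical alarm logs on 100 fresh nominal samples each.
as_designed = [True] * 5 + [False] * 95
miscalibrated = [True] * 20 + [False] * 80
print(check_false_positive_rate(as_designed, alpha=0.05))
print(check_false_positive_rate(miscalibrated, alpha=0.05))
```

A small p-value indicates that the realized false positive rate is unlikely to be as low as the design value, which would count against the calibration claim.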

Figures

Figures reproduced from arXiv: 2604.11629 by Alejandro Penacho Riveiros, Matthieu Barreau, Nicola Bastianello.

**Figure 3.** Setup for the Van der Pol example.

**Figure 4.** Results for the Van der Pol example.

Original abstract

In this paper we address the problem of detecting differences or anomalies in a dynamical system, based on historical data of nominal operations. This problem encompasses quality control, where newly manufactured systems are tested against desired nominal operations, and the detection of changes in the dynamics due to degradation or repairs. We propose a model free approach based on Gaussian processes (GPs). The idea is to train offline a GP based on nominal data, which is then deployed online to detect whether measurements of the system state are compatible with nominal operations or if they deviate. Detecting this deviation is made more challenging by the presence of process and measurement noise, which might obfuscate deviations in the dynamics. The detection then is based on a threshold that ensures a specific false positive rate. We showcase the promising performance of the proposed method with two systems, and highlight several interesting future research questions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a straightforward GP-based anomaly detector for dynamical systems, but the claim that its threshold controls false positives rests on shaky calibration assumptions.

The main takeaway is a model-free pipeline that fits a Gaussian process to nominal state trajectories offline and then flags online measurements as anomalous if they fall outside a threshold chosen to hit a target false-positive rate. They demonstrate it on two example systems and note some open questions for later work.

That setup is new enough in the anomaly-detection-for-dynamics niche, and it keeps the method simple by avoiding any explicit system model, which is genuinely useful when you only have historical good data from manufacturing or maintenance contexts. The offline-train, online-deploy split is clean and the noise discussion in the abstract shows they are aware of the practical difficulty. The experiments, limited as they are, at least give a concrete sense that the idea can be run end-to-end.

The soft spot is exactly the one the stress-test flags. Dynamical systems fold process and measurement noise through an unknown transition, so the GP predictive distribution is unlikely to be well-calibrated for residuals in the way a simple quantile threshold assumes. The abstract states that the threshold “ensures” a specific false-positive rate, yet offers no derivation, no finite-sample argument, and no mention of checking realized FPR on held-out nominal runs. If the full paper does not add those checks or discuss how the threshold is actually computed beyond the high-level description, that part of the claim is under-supported. The usual assumption that nominal data covers all normal regimes is also left implicit.

This is aimed at control or reliability engineers who need a quick non-parametric monitor rather than theorists looking for new GP results. A reader already working on industrial anomaly detection would find the concrete recipe and the two-system showcase worth skimming. It is solid enough on its own terms to deserve peer review; the experiments provide initial evidence and the idea is implementable, even if the calibration gap needs addressing in revision.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a model-free anomaly detection method for dynamical systems using Gaussian processes. A GP is trained offline on historical nominal data and then deployed online to classify new state measurements as consistent with nominal behavior or anomalous, with detection performed via comparison to a threshold selected to achieve a target false-positive rate in the presence of process and measurement noise. The approach is illustrated on two example systems.

Significance. If the central claims hold, the work offers a practical data-driven alternative to model-based monitoring for quality control and degradation detection. The model-free GP formulation is a strength for systems where explicit dynamics are unavailable or uncertain. However, the significance is limited by the absence of any derivation, finite-sample analysis, or calibration study for the threshold, which is load-bearing for the advertised FPR control.

major comments (2)

[Abstract] The statement that detection 'is based on a threshold that ensures a specific false positive rate' supplies no derivation, no expression for the threshold in terms of the GP posterior, and no argument that the predictive distribution remains calibrated when residuals are filtered through the unknown dynamics under combined process/measurement noise. This directly undermines the central claim.
[Method (presumed §3) and Experiments] No finite-sample guarantee, no held-out nominal calibration check, and no description of how the detection statistic is formed from the GP (e.g., predictive mean/variance, quantile, or log-likelihood) are provided. Without these, it is impossible to verify that the offline threshold delivers the advertised FPR online.

minor comments (2)

[Abstract] The abstract refers to 'two systems' without naming them or indicating whether they are linear/nonlinear, low/high-dimensional, or simulated/real.
[Abstract] Notation for the GP (kernel, noise model, training procedure) is not introduced in the high-level description, making the method difficult to reproduce from the abstract alone.

Simulated Authors' Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important aspects of rigor and clarity that we will address in the revision. Below we respond point by point to the major comments.

Referee: [Abstract] The statement that detection 'is based on a threshold that ensures a specific false positive rate' supplies no derivation, no expression for the threshold in terms of the GP posterior, and no argument that the predictive distribution remains calibrated when residuals are filtered through the unknown dynamics under combined process/measurement noise. This directly undermines the central claim.

Authors: We agree that the abstract is overly concise and does not convey the necessary technical details. In the revised manuscript we will expand the abstract to include a brief description of the threshold. The threshold is obtained as the (1 - α) quantile of the negative log predictive density under the GP posterior, where the predictive mean and variance are computed from the offline-trained GP on nominal data that already incorporates process and measurement noise. We will also add a short paragraph in the method section arguing that, because the GP is trained directly on the observed noisy trajectories, the predictive distribution is calibrated to the nominal measurement distribution; any online deviation is therefore detected relative to this empirical nominal law. A complete theoretical proof of calibration for arbitrary unknown dynamics is not supplied and will be noted as a limitation.

Revision: partial

Referee: [Method (presumed §3) and Experiments] No finite-sample guarantee, no held-out nominal calibration check, and no description of how the detection statistic is formed from the GP (e.g., predictive mean/variance, quantile, or log-likelihood) are provided. Without these, it is impossible to verify that the offline threshold delivers the advertised FPR online.

Authors: We acknowledge that the current text does not explicitly define the detection statistic or include a calibration verification step. In the revision we will (i) state that the detection statistic is the negative log predictive probability evaluated at the new measurement under the GP posterior, (ii) describe how the threshold is set offline as the empirical (1 - α) quantile of this statistic on the nominal training set, and (iii) add a held-out nominal calibration experiment that reports the realized false-positive rate on unseen nominal trajectories. A finite-sample guarantee is not derived; the method relies on the consistency of GP regression and empirical quantile estimation. We will add a brief discussion of this reliance and list a theoretical finite-sample analysis as future work.

Revision: yes
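
For readers who want the statistic from the simulated responses written out, one possible formalization for a scalar output is given below. The symbols μ, σ², s, and τ_α are illustrative assumptions introduced here; the abstract does not confirm that the paper uses this exact form.

```latex
% Scalar-output sketch; all symbols are illustrative assumptions.
% GP predictive distribution at input x, given nominal training data D:
%   y | x, D ~ N(mu(x), sigma^2(x))
\begin{align}
  s(x, y) &= -\log p(y \mid x, \mathcal{D})
           = \tfrac{1}{2}\log\!\bigl(2\pi\,\sigma^2(x)\bigr)
             + \frac{\bigl(y - \mu(x)\bigr)^2}{2\,\sigma^2(x)}, \\
  \tau_\alpha &= \text{empirical } (1-\alpha) \text{ quantile of }
                 \{ s(x_i, y_i) \}_{i=1}^{n} \text{ on nominal data}, \\
  \text{alarm} &\iff s(x_{\mathrm{new}}, y_{\mathrm{new}}) > \tau_\alpha .
\end{align}
```

Under this form, the first term penalizes inputs where the GP is uncertain and the second is a squared standardized residual, so the rule amounts to a variance-aware residual test.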

standing simulated objections not resolved

A rigorous finite-sample guarantee that the offline-chosen threshold controls the online false-positive rate for arbitrary unknown dynamics under combined process and measurement noise.
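
To see what such a guarantee could look like in the simplest case, the following is the standard split-conformal bound for a quantile threshold. It is offered only as a reference point and is not a result from the paper: it requires the calibration scores and the new nominal score to be exchangeable and the GP to be fitted on data disjoint from the calibration set. Time-correlated samples along a trajectory generally break exchangeability, which is the gap this objection points at.

```latex
% Assumption: s_1, ..., s_n (calibration) and s_{n+1} (new nominal score)
% are exchangeable, and the GP was fitted on data disjoint from them.
\begin{align}
  k &= \lceil (1-\alpha)(n+1) \rceil \le n, \qquad
  \tau = s_{(k)} \quad \text{($k$-th smallest calibration score)}, \\
  &\Pr\bigl[\, s_{n+1} > \tau \,\bigr] \le \alpha .
\end{align}
```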

Circularity Check

0 steps flagged

No circularity in GP training plus threshold-based anomaly detection

The paper describes an offline GP trained on nominal trajectories followed by online deployment with a threshold chosen to achieve a target false-positive rate. No derivation chain, equation, or self-citation reduces the detection rule to a fitted parameter by construction, renames a known result, or imports uniqueness from the authors' prior work. The approach remains a standard data-driven application of GP regression whose performance claims rest on external calibration properties of GPs rather than tautological re-use of the training data itself.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Only the abstract is available, so the ledger is necessarily incomplete and based on stated assumptions rather than detailed derivations.

free parameters (1)

false-positive-rate threshold
Chosen to achieve a target false-positive rate; no method for selecting or validating the threshold is given in the abstract.

axioms (1)

domain assumption: Gaussian processes trained on nominal data can represent the expected behavior of the dynamical system under noise
Invoked when the paper states that the GP is trained offline on nominal data and then used to check compatibility of new measurements.

pith-pipeline@v0.9.0 · 5447 in / 1257 out tokens · 57571 ms · 2026-05-10T15:10:27.320374+00:00 · methodology

Reference graph

Works this paper leans on

3 extracted references · 3 canonical work pages

[1] Beckers, T., Seidman, J., Perdikaris, P., & Pappas, G. J. (2022). Gaussian Process Port Hamiltonian Systems: Bayesian Learning with Physics Prior. 2022 IEEE 61st Conference on Decision and Control (CDC), 1447–

[2] Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 2(4), 303–314. Frank, P. M. (1990). Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy: A survey and some new results. Automatica, 26(3), 459–

[3] Latifi, A., & Scoglio, C. M. (2025). A Survey on Uncertainty Quantification in Dynamical Systems. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 55(8), 5137–5151. Liao, Y., Zhang, Y., You, J., Jin, L., Xu, Z., & Shen, Z. (2024). Uncertainty-Informed Threshold Assessment of Model-Based Fault Detection for Modular Multilevel Converters...
