Sampling pseudospectrum for data-driven matrices
Pith reviewed 2026-05-19 17:32 UTC · model grok-4.3
The pith
A sampling pseudospectrum estimator lets users test statistically whether eigenvalues from finite data are genuine or sampling artifacts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that the sampling pseudospectrum P(λ) provides probabilistic information on the behaviour of finite-data eigenvalues, and that the estimator ˆP(λ) computed by reprocessing the finite data sample enables statistical tests for the location of the true eigenvalues of the underlying infinite-data operator.
What carries the argument
The sampling pseudospectrum P(λ) and its estimator ˆP(λ), which together describe the distribution of eigenvalues arising from finite sampling of the true operator.
If this is right
- The estimator supplies an objective criterion for classifying peripheral eigenvalues as signal or noise in data-driven spectral methods.
- The approach applies to any matrix constructed by least-squares fitting from finite observations, including dynamic mode decomposition (DMD) and related algorithms.
- Because the estimator reuses the existing sample, it adds little computational cost to existing analysis pipelines.
- Persistent emergent patterns extracted from complex systems can be assessed for statistical significance with greater rigor.
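The least-squares construction that these points rely on can be sketched in a few lines. This is a minimal NumPy illustration, assuming a noisy linear system x_{k+1} = A x_k + noise as a stand-in for real data. The names `fit_dmd_matrix`, `simulate` and `A_true`, and all parameter values, are hypothetical and do not come from the paper.

```python
import numpy as np

def fit_dmd_matrix(snapshots):
    """Least-squares fit of A in x_{k+1} ~ A x_k from a (d, n+1) snapshot array."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    # Minimise ||A X - Y||_F over A using the pseudoinverse of X.
    return Y @ np.linalg.pinv(X)

def simulate(A_true, n_steps, noise_std, rng):
    """Noisy linear trajectory (assumed toy model, not the paper's setting)."""
    d = A_true.shape[0]
    x = np.empty((d, n_steps + 1))
    x[:, 0] = rng.standard_normal(d)
    for k in range(n_steps):
        x[:, k + 1] = A_true @ x[:, k] + noise_std * rng.standard_normal(d)
    return x

rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.2], [0.0, 0.5]])  # hypothetical "true operator"
data = simulate(A_true, n_steps=200, noise_std=0.1, rng=rng)

A_hat = fit_dmd_matrix(data)          # finite-data matrix
eigs_hat = np.linalg.eigvals(A_hat)   # finite-data eigenvalues
```

The eigenvalues in `eigs_hat` differ from those of `A_true` only because the sample is finite and noisy, which is the discrepancy the sampling pseudospectrum is meant to quantify.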
Where Pith is reading between the lines
- Integration into standard dynamic mode decomposition codebases would allow automatic flagging of likely noisy eigenvalues.
- The same reprocessing idea could be tested on synthetic datasets whose true spectra are known exactly.
- Analogous estimators might be developed for other data-driven constructions such as those arising in machine learning of dynamical systems.
- Connections to classical pseudospectral theory could yield explicit bounds on the estimator's accuracy.
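For comparison with the last point, the classical ε-pseudospectrum of a fixed matrix A is the set of complex z at which the smallest singular value of zI − A is below ε. The sketch below computes that classical object on a grid. It is not the paper's sampling pseudospectrum P(λ) or its estimator, and the function name `sigma_min_grid`, the example matrix and the value of `eps` are illustrative assumptions.

```python
import numpy as np

def sigma_min_grid(A, re_vals, im_vals):
    """Smallest singular value of (z I - A) at each grid point z = re + i*im."""
    n = A.shape[0]
    out = np.empty((len(im_vals), len(re_vals)))
    for i, y in enumerate(im_vals):
        for j, x in enumerate(re_vals):
            z = complex(x, y)
            # Singular values are returned in descending order; take the last.
            out[i, j] = np.linalg.svd(z * np.eye(n) - A, compute_uv=False)[-1]
    return out

A = np.array([[0.9, 5.0], [0.0, 0.5]])  # hypothetical non-normal example
re_vals = np.linspace(-0.5, 1.5, 81)
im_vals = np.linspace(-1.0, 1.0, 81)
smin = sigma_min_grid(A, re_vals, im_vals)

eps = 0.05
in_pseudospectrum = smin < eps  # boolean mask of the classical eps-pseudospectrum
```

For a normal matrix the smallest singular value equals the distance from z to the spectrum, so the mask is a union of discs around the eigenvalues; for non-normal matrices such as the example above it can be considerably larger.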
Load-bearing premise
Reprocessing the finite data sample yields an unbiased estimator for the sampling pseudospectrum of the underlying infinite-data operator.
What would settle it
A systematic mismatch between the values of the estimator ˆP(λ) and the actual distribution of eigenvalues obtained from many independent finite samples of the same true operator would falsify the claim.
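The "many independent finite samples" side of this comparison can be generated by Monte Carlo when the true operator is known. The sketch below assumes a toy noisy linear system and builds an empirical histogram of fitted eigenvalues in the complex plane. The estimator ˆP(λ) itself is not implemented here, and every name and parameter value (`fitted_eigenvalues`, `empirical_eigenvalue_density`, `A_true`, the grid edges) is hypothetical.

```python
import numpy as np

def fitted_eigenvalues(A_true, n_steps, noise_std, rng):
    """Eigenvalues of the least-squares matrix fitted to one independent sample."""
    d = A_true.shape[0]
    x = np.empty((d, n_steps + 1))
    x[:, 0] = rng.standard_normal(d)
    for k in range(n_steps):
        x[:, k + 1] = A_true @ x[:, k] + noise_std * rng.standard_normal(d)
    A_hat = x[:, 1:] @ np.linalg.pinv(x[:, :-1])
    return np.linalg.eigvals(A_hat)

def empirical_eigenvalue_density(eigs, re_edges, im_edges):
    """2D histogram of eigenvalues in the complex plane, normalised to sum to 1."""
    hist, _, _ = np.histogram2d(eigs.real, eigs.imag, bins=[re_edges, im_edges])
    return hist / hist.sum()

rng = np.random.default_rng(1)
A_true = np.array([[0.9, 0.2], [0.0, 0.5]])  # known spectrum: 0.9 and 0.5

# 500 independent finite samples, each giving two fitted eigenvalues.
samples = np.concatenate(
    [fitted_eigenvalues(A_true, 200, 0.1, rng) for _ in range(500)]
)

re_edges = np.linspace(-0.5, 1.5, 41)
im_edges = np.linspace(-1.0, 1.0, 41)
density = empirical_eigenvalue_density(samples, re_edges, im_edges)
```

A value of ˆP(λ) computed from a single sample could then be set against `density` on the same grid; persistent disagreement across many such single-sample estimates would be the mismatch described above.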
Original abstract
Many complex systems can be reduced to their key components through spectrally decomposing matrices that capture their dynamics. These matrices can in turn be constructed from data, often by least-squares fitting: examples of algorithms to do this include Dynamical Mode Decomposition and variants, subspace identification and eigenvalue realisation algorithms. Typical outputs of these algorithms include a range of isolated, peripheral eigenvalues capturing persistent emergent patterns in the system. However, there is no objective way to assess which of these discrete eigenvalues are artefacts of finite data error, and which are reflections of a fully sampled operator. n this paper, we present a sampling pseudospectrum $P(\lambda)$, that provides probabilistic information on the behaviour of finite-data eigenvalues in the complex plane, and an estimator $\hat P(\lambda)$, which can be obtained by reprocessing our finite data sample. The estimator, which is computationally efficient to implement, allows us to test statistically for the location of the true eigenvalues. This gives us a rigorous and very general way to assess whether the patterns we extract from finite data are likely to be signal or noise.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a sampling pseudospectrum P(λ) to characterize the probabilistic distribution of eigenvalues extracted from finite-data matrices obtained via least-squares fitting in data-driven methods such as DMD. It further defines an estimator ˆP(λ) computed by reprocessing the given finite sample, which is claimed to enable rigorous statistical tests for distinguishing true eigenvalues (signal) from finite-sample artefacts (noise).
Significance. If the unbiasedness of ˆP(λ) for P(λ) can be established under the stated conditions, the contribution would be significant: it supplies a computationally efficient, general-purpose statistical tool for validating extracted spectral features in dynamical systems identification, addressing a practical limitation in existing DMD and subspace methods.
major comments (1)
- Abstract: the claim that ˆP(λ) 'allows us to test statistically for the location of the true eigenvalues' and supplies a 'rigorous' assessment rests on the premise that reprocessing the finite sample yields an unbiased estimator of the sampling pseudospectrum P(λ). No derivation, expectation calculation, or error analysis establishing E[ˆP(λ)] = P(λ) is supplied in the abstract, and the text does not state the required conditions on sampling-error distributions or resolvent bounds; this is load-bearing for the central statistical-test claim.
minor comments (1)
- Abstract: the sentence beginning 'n this paper' contains a typographical error and should read 'In this paper'.
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for highlighting the need for greater clarity on the statistical foundations of the estimator. We address the major comment point by point below and propose targeted revisions to strengthen the presentation.
Point-by-point responses
- Referee: Abstract: the claim that ˆP(λ) 'allows us to test statistically for the location of the true eigenvalues' and supplies a 'rigorous' assessment rests on the premise that reprocessing the finite sample yields an unbiased estimator of the sampling pseudospectrum P(λ). No derivation, expectation calculation, or error analysis establishing E[ˆP(λ)] = P(λ) is supplied in the abstract, and the text does not state the required conditions on sampling-error distributions or resolvent bounds; this is load-bearing for the central statistical-test claim.
  Authors: We agree that the abstract, due to length constraints, does not contain the full derivation or an explicit list of conditions. The main text does derive the expectation E[ˆP(λ)] = P(λ) by direct calculation under the data-generating model (Section 3), where the sampling errors are taken to be zero-mean and independent across snapshots with finite second moments. The resolvent is assumed bounded on the relevant compact set in the complex plane to control the perturbation. We acknowledge that these modeling assumptions and the explicit statement E[ˆP(λ)] = P(λ) could be stated more prominently. In the revised manuscript we will (i) add a short sentence to the abstract referencing the unbiasedness result under the stated conditions and (ii) insert a dedicated paragraph immediately after the definition of ˆP(λ) that lists the precise assumptions on the error distribution and the resolvent bound, together with a pointer to the expectation calculation. These changes will make the load-bearing statistical claim fully transparent without altering the technical content.
  Revision: yes
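The conditions named in the rebuttal can be written compactly. The notation below is assumed for illustration and may not match the paper's own symbols: ε_k denotes the sampling error at snapshot k, K the relevant compact set in the complex plane, and R(λ) the resolvent. It restates the rebuttal's description and is not a reproduction of the paper's Section 3.

```latex
% Hypothetical notation restating the assumptions described in the rebuttal.
\begin{align*}
  &\mathbb{E}[\varepsilon_k] = 0, \qquad
   \mathbb{E}\,\|\varepsilon_k\|^2 < \infty, \qquad
   \varepsilon_j \text{ independent of } \varepsilon_k \ (j \neq k), \\
  &\sup_{\lambda \in K} \|R(\lambda)\| < \infty, \\
  &\mathbb{E}\bigl[\hat P(\lambda)\bigr] = P(\lambda)
   \quad \text{for } \lambda \in K .
\end{align*}
```

The first two lines are the stated assumptions; the third is the unbiasedness claim that the referee asks to see established.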
Circularity Check
No circularity: estimator obtained by independent data reprocessing
The paper defines the sampling pseudospectrum P(λ) for the infinite-data operator and introduces the estimator ˆP(λ) explicitly as obtained by reprocessing the finite data sample. This construction does not reduce to a self-definitional loop, a fitted parameter renamed as prediction, or any self-citation chain. The statistical test for eigenvalues follows from the reprocessing procedure rather than assuming the target result by construction, rendering the derivation self-contained.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Finite sampling of data introduces probabilistic perturbations to the eigenvalues of the fitted matrix relative to the true operator.
invented entities (1)
- Sampling pseudospectrum P(λ): no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  P(λ) = sup_{Q≻0} inf_{v≠0} ||Cλ v||_Q² / Vλ[Q]; 1/P(λ) is spectral radius of Sλ[Q] := Vλ[(Cλ^{-1})* Q Cλ^{-1}] (Thm 5.1, Prop 2.3)
- IndisputableMonolith/Foundation/RealityFromDistinction · reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Sλ preserves positive semi-definite cone; Perron-Frobenius-type results on cone-preserving operators
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.