
arXiv: 2605.02356 · v2 · submitted 2026-05-04 · 💻 cs.LG · cs.NA · math.NA

ZNO: Stable Rational Neural Operators in the Z-Domain for Discrete-Time Dynamics

Pith reviewed 2026-05-09 16:33 UTC · model grok-4.3

classification 💻 cs.LG · cs.NA · math.NA
keywords neural operator · discrete-time dynamics · rational filter · z-domain · system identification · stable poles · long memory · recurrent model

The pith

ZNO learns stable rational discrete-time filters by parameterizing poles directly in the z-domain.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ZNO to address the fact that most neural operators target continuous-time problems even though many system-identification tasks are inherently discrete. It builds each layer as a causal low-rank MIMO rational filter whose poles are kept inside the unit disk by a smooth reparameterization, making stability explicit and the poles readable. Low-rank channel mixing and an optional short FIR branch are added so the model can capture systems with lightly damped poles and memory lengths from roughly ten to two hundred steps. A reader would care because discrete-time models appear in control, signal processing, and time-series tasks where explicit stability and long-horizon accuracy matter. Experiments show that, with validation-selected configurations, the architecture records the lowest mean error on controlled tasks whose dynamics align with stable rational filters, and that it remains competitive on public benchmarks; under matched parameter budgets it is not uniformly dominant.

Core claim

ZNO constructs a neural operator whose layers are stable low-rank MIMO rational filters expressed directly in the z-plane. Stability is enforced by a unit-disk pole constraint, poles are made directly interpretable, and each layer combines causal recurrence, low-rank mixing, smooth pole reparameterization, and an optional short FIR branch. This design yields the lowest mean error across a five-bin sweep of near-unit-circle long-memory dynamics and the lowest mean error on public benchmarks whose underlying behavior matches stable rational discrete-time filters.

What carries the argument

The z-domain rational recurrent layer, which realizes the operator as a causal rational transfer function with low-rank channel mixing and reparameterized stable poles inside the unit disk.
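
To make the mechanism concrete, here is a minimal single-channel sketch of a reparameterized stable pole driving a causal recurrence with an optional FIR branch. It is an illustration under stated assumptions, not the paper's layer: the function names `stable_pole` and `rational_filter`, the sigmoid-style radius map, and the use of the real part of each pole-residue term are all hypothetical choices. The cap `rho_max=0.95` borrows the safe radius quoted in the Figure 9 caption, but how the paper applies that radius is not stated here. Low-rank channel mixing is omitted.

```python
import cmath
import math

def stable_pole(raw_radius, raw_angle, rho_max=0.95):
    """Map two unconstrained reals to a complex pole with |p| <= rho_max < 1.

    Assumed reparameterization (not taken from the paper): a logistic squash
    of the radius, written via tanh so large inputs cannot overflow.
    """
    radius = rho_max * 0.5 * (1.0 + math.tanh(0.5 * raw_radius))
    return cmath.rect(radius, raw_angle)

def rational_filter(u, poles, residues, fir=()):
    """Causal scan: x_k[n] = p_k * x_k[n-1] + u[n]; y[n] = sum_k Re(c_k * x_k[n]) + FIR."""
    states = [0j] * len(poles)
    outputs = []
    for n, u_n in enumerate(u):
        # One recurrence step per pole; cost is linear in the sequence length.
        states = [p * x + u_n for p, x in zip(poles, states)]
        y_n = sum((c * x).real for c, x in zip(residues, states))
        # Optional short FIR branch acting directly on past inputs.
        y_n += sum(b * u[n - j] for j, b in enumerate(fir) if n - j >= 0)
        outputs.append(y_n)
    return outputs

pole = stable_pole(raw_radius=3.0, raw_angle=0.3)   # hypothetical raw parameters
impulse = [1.0] + [0.0] * 49
response = rational_filter(impulse, [pole], [1.0 + 0j], fir=(0.1,))
print(abs(pole), response[:3])
```

Because the pole is produced by `stable_pole`, any gradient step on the raw parameters leaves it inside the disk, and the learned pole can be read off directly as a point in the z-plane.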

If this is right

  • ZNO achieves the lowest mean error across controlled discrete system-identification tasks when validation is used to select configurations.
  • Its advantage is clearest on dynamics with poles near the unit circle and memory lengths from roughly 10 to 100-200 steps.
  • On the five public nonlinear benchmarks, ZNO records the lowest mean error on those whose dynamics align with stable rational discrete-time filters.
  • Classical or state-space methods remain preferable on some systems that deviate from this form.
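
The link between pole radius and memory length in the list above can be illustrated with a short calculation. The sketch assumes one common definition of effective memory, the number of steps for the envelope `|p|**n` to fall to 1/e; the paper's own definition and bin edges are not given here, so the function names and the three radii below are purely illustrative.

```python
import math

def memory_length(radius):
    """Steps for the envelope radius**n to decay to 1/e (an assumed definition)."""
    if not 0.0 < radius < 1.0:
        raise ValueError("radius must lie strictly between 0 and 1")
    return -1.0 / math.log(radius)

def radius_for_memory(steps):
    """Inverse map: pole radius whose envelope decays to 1/e after `steps` samples."""
    return math.exp(-1.0 / steps)

for r in (0.90, 0.99, 0.995):  # illustrative radii, not the paper's bins
    print(f"|p| = {r}: about {memory_length(r):.1f} steps")
```

Under this definition, radii of 0.90, 0.99, and 0.995 correspond to roughly 9.5, 99.5, and 199.5 steps, which brackets the range quoted above and shows how quickly memory grows as poles approach the unit circle.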

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Direct access to learned poles could let users apply classical control-analysis tools to inspect or modify the trained model.
  • Architectural stability constraints may reduce reliance on post-training stabilization tricks in other recurrent architectures.
  • The same z-domain construction could be combined with continuous-time operators to handle hybrid or multi-rate problems.
  • Low-rank mixing indicates that high-dimensional discrete dynamics can often be captured without full dense MIMO parameterizations.

Load-bearing premise

The target dynamics can be well approximated by stable rational MIMO systems that fit inside the low-rank mixing and pole reparameterization without loss of needed expressiveness.

What would settle it

A controlled experiment on a ground-truth system whose dynamics require either unstable poles or non-rational behavior, in which ZNO produces higher error than an unconstrained baseline.

Figures

Figures reproduced from arXiv: 2605.02356 by Jia Yin, Xianli Zhu.

Figure 1. Spectral geometries of FNO, LNO and ZNO. (a) FNO parameterizes the transfer function at …
Figure 2. The ZNO architecture. (a) The full causal network: pointwise lift …
Figure 3. Discrete benchmark results: relative L2 test error (mean ± standard deviation, 5 seeds). (a) Matched-budget protocol: all models use 8.5–8.9k parameters on every task. (b) Tuned-best protocol. ZNO has the lowest mean error on the near-unit-circle resonant task and narrowly on NARX, but is not uniformly dominant under the matched budget.
Figure 4. Near-unit-circle difficulty sweep. The pole-radius intervals control the effective memory …
Figure 5. Long-horizon extrapolation at T ∈ {2048, 4096, 8192} (3 tasks, 5 seeds). ZNO and S4D are both essentially length-invariant because their layers are causal recurrences with length-independent parameter counts. ZNO retains the lowest absolute error on every task at every length. FNO error grows sharply at T = 8192 on every task, consistent with its frequency grid being tied to the training length.
Figure 6. 1D LNO ODE benchmark, relative L2 test error (mean ± standard deviation, 3 seeds). (a) Matched-budget protocol (∼7–9k params). (b) Tuned-best protocol. ZNO has the lowest error on Lorenz ρ = 5 and is competitive elsewhere; LNO and FNO each have the lowest error on some cases, as expected given that the benchmark was designed around continuous-time ODEs.
Figure 7. Internal ablation of ZNO on resonant ARMA and sixth-order IIR cascade (5 seeds). (a) …
Figure 8. Accuracy vs training time under the tuned-best protocol (5-seed mean; error bars: standard …
Figure 9. Learned ZNO poles on the z-plane per task. Color: layer depth; marker size: residue magnitude. Solid/dashed circles: stability boundary and safe radius ρsafe = 0.95. The maps show near-unit-circle oscillatory poles for resonant ARMA, later-layer IIR poles near the safe radius, and NARX poles near the real axis.
Figure 10. Qualitative comparison on one held-out test trajectory per task. Black dashed: ground …
Original abstract

We introduce the Z-Domain Neural Operator (ZNO), a causal neural operator whose layers are stable low-rank multiple-input multiple-output (MIMO) rational filters parameterized directly in the $z$-plane. ZNO addresses a limitation of existing operator learning methods, many of which are primarily tailored for continuous-time problems, while a large class of system-identification problems is intrinsically discrete-time. The $z$-domain form expresses stability as a unit-disk pole constraint and makes learned discrete-time poles directly readable. The model combines low-rank channel mixing, smooth stable pole reparameterization, causal recurrence, and an optional short finite impulse response (FIR) branch in a single $z$-domain rational recurrent layer. Across controlled discrete system-identification experiments, ZNO's advantage is most evident when the target dynamics are stable rational systems with lightly damped poles near the unit circle. Under matched parameter budgets, ZNO is not uniformly dominant; however, with validation-selected configurations, the same architecture can achieve the lowest mean error across the controlled tasks. A five-bin difficulty sweep over near-unit-circle / long-memory dynamics shows that ZNO has the lowest mean error across memory regimes, from short (approximately 10 steps) to long (approximately 100-200 steps). On five public nonlinear system-identification benchmarks, ZNO is competitive with neural operator and state-space baselines, achieving the lowest mean error on benchmarks whose dynamics align with stable rational discrete-time filters, while classical or state-space baselines remain preferable on some systems. These results position ZNO as a strong model for stable rational discrete-time dynamics, especially in near-unit-circle and long-memory regimes, but not as a universal replacement for specialized system-identification methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the Z-Domain Neural Operator (ZNO), a causal neural operator for discrete-time dynamics whose layers are implemented as stable low-rank MIMO rational filters parameterized directly in the z-plane. Stability is enforced via a unit-disk pole constraint with smooth reparameterization; the architecture combines this with low-rank channel mixing, causal recurrence, and an optional short FIR branch. On controlled discrete system-identification tasks, ZNO shows advantages for targets with lightly damped poles near the unit circle and long memory, achieving the lowest mean error in a five-bin difficulty sweep; on five public nonlinear benchmarks it is competitive overall and lowest on those whose dynamics align with stable rational discrete-time filters, though not uniformly dominant under matched budgets.

Significance. If the empirical claims hold under the stated modeling assumptions, ZNO supplies a targeted, stability-guaranteed architecture for discrete-time operator learning that makes poles directly interpretable and respects the unit-disk constraint by construction. This addresses a gap left by continuous-time-centric neural operators and could be useful in system-identification settings where long-memory dynamics dominate. The work also illustrates the value of embedding rational-filter inductive biases rather than relying solely on generic sequence models.

major comments (2)
  1. [Abstract and §4 (controlled tasks)] Abstract and controlled-experiments description: the headline claim that ZNO attains the lowest mean error across the five-bin sweep over near-unit-circle/long-memory dynamics is load-bearing on the untested assumption that the generated targets are well-approximated by low-rank stable rational MIMO systems whose cross-channel dependencies fit the low-rank mixing and whose poles remain expressible after the unit-disk-plus-smoothness reparameterization. Without an explicit check (e.g., pole locations of the synthetic systems, rank-ablation, or expressiveness comparison), the advantage may be confined to the narrow subclass the architecture can represent exactly rather than the broader regime advertised.
  2. [§5 (public benchmarks)] Results on public benchmarks and validation protocol: the statement that ZNO achieves lowest mean error on benchmarks whose dynamics align with stable rational filters, together with the use of validation-selected configurations to reach the reported numbers, risks post-hoc selection. The paper should either fix the hyper-parameter regime in advance, report performance for a single canonical configuration across all tasks, or supply statistical tests over multiple independent runs to substantiate the comparative claims.
minor comments (2)
  1. [Method description] The low-rank dimension for channel mixing and the FIR branch length are listed as free parameters; a short sensitivity table or default-value justification would improve reproducibility.
  2. [§3] Notation for the z-domain rational filter (poles, residues, low-rank factors) should be collected in a single table or equation block for quick reference.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, providing the strongest honest defense of the manuscript while agreeing to revisions that strengthen the claims without misrepresentation.

Point-by-point responses
  1. Referee: [Abstract and §4 (controlled tasks)] Abstract and controlled-experiments description: the headline claim that ZNO attains the lowest mean error across the five-bin sweep over near-unit-circle/long-memory dynamics is load-bearing on the untested assumption that the generated targets are well-approximated by low-rank stable rational MIMO systems whose cross-channel dependencies fit the low-rank mixing and whose poles remain expressible after the unit-disk-plus-smoothness reparameterization. Without an explicit check (e.g., pole locations of the synthetic systems, rank-ablation, or expressiveness comparison), the advantage may be confined to the narrow subclass the architecture can represent exactly rather than the broader regime advertised.

    Authors: We agree that explicit verification of the synthetic target properties would strengthen the interpretation. The controlled tasks generate stable discrete-time linear MIMO systems with poles deliberately sampled near the unit circle (light damping, long memory) to probe exactly the regime where rational filters excel, as stated in §4. In revision we will add a supplementary figure with the empirical distribution of pole magnitudes and angles across the five difficulty bins, confirming all poles lie inside the unit disk and match the targeted damping/memory characteristics. We will also include a low-rank ablation (varying the channel-mixing rank) showing that performance saturates at modest ranks consistent with the generated cross-channel dependencies, and a short discussion noting that the smooth reparameterization is bijective for all stable poles. These additions demonstrate that the observed advantage arises from the architecture's inductive bias rather than an exact representational match. revision: yes

  2. Referee: [§5 (public benchmarks)] Results on public benchmarks and validation protocol: the statement that ZNO achieves lowest mean error on benchmarks whose dynamics align with stable rational filters, together with the use of validation-selected configurations to reach the reported numbers, risks post-hoc selection. The paper should either fix the hyper-parameter regime in advance, report performance for a single canonical configuration across all tasks, or supply statistical tests over multiple independent runs to substantiate the comparative claims.

    Authors: We accept that validation-selected configurations introduce a risk of post-hoc selection. In the revised manuscript we will adopt a single fixed hyper-parameter regime (chosen once via a preliminary study on a single representative benchmark and then frozen) and report its performance uniformly across all five public tasks. We will additionally supply mean and standard deviation over five independent random seeds for each method, providing statistical support for the comparative statements. This protocol removes per-task tuning while preserving the observation that ZNO is strongest on benchmarks whose dynamics are well-aligned with stable rational discrete-time filters. revision: yes
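
The seed-averaged reporting described in the second response could be computed along the following lines. This is a sketch only: the figure captions name the metric as relative L2 test error with mean ± standard deviation over seeds, but the helper names, the toy data, and the choice of the sample (n − 1) standard deviation are assumptions.

```python
import math
import statistics

def relative_l2(pred, target):
    """Relative L2 error ||pred - target||_2 / ||target||_2 for equal-length sequences."""
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))
    den = math.sqrt(sum(t ** 2 for t in target))
    return num / den

def summarize(errors):
    """Mean and sample standard deviation across independent seeds."""
    return statistics.mean(errors), statistics.stdev(errors)

# Toy stand-in data: one target trajectory and one prediction per seed.
target = [1.0, 2.0, 2.0]
seed_predictions = [
    [1.1, 2.0, 2.0],
    [1.0, 1.9, 2.1],
    [0.9, 2.1, 2.0],
    [1.0, 2.0, 2.2],
    [1.2, 2.0, 1.9],
]
errors = [relative_l2(pred, target) for pred in seed_predictions]
mean_err, std_err = summarize(errors)
print(f"relative L2: {mean_err:.4f} ± {std_err:.4f} over {len(errors)} seeds")
```

Reporting this pair for one frozen configuration per method, rather than the best per-task configuration, is what removes the post-hoc selection concern raised by the referee.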

Circularity Check

0 steps flagged

No circularity: architecture and empirical results are independent of reported metrics

Full rationale

The paper defines the ZNO architecture, its z-domain rational filter parameterization, unit-disk stability constraint, low-rank mixing, and causal recurrence independently of any performance numbers. All reported results (five-bin sweep, benchmark errors) are obtained from separate training and evaluation runs on controlled tasks and public datasets. No equation reduces a claimed prediction to a fitted input by construction, no uniqueness theorem is imported from self-citation to force the form, and no ansatz is smuggled via prior work. The derivation chain consists of standard neural operator design choices followed by experimental validation; the central claims therefore remain falsifiable outside the fitted values.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 1 invented entity

The model rests on standard discrete-time stability theory and the assumption that low-rank rational filters suffice for the targeted dynamics; no new physical entities are postulated.

free parameters (2)
  • low-rank dimension for channel mixing
    Hyperparameter controlling the rank of the MIMO mixing matrices; chosen per experiment.
  • FIR branch length
    Optional design parameter for the short finite impulse response component.
axioms (2)
  • domain assumption: Poles of the rational filter must lie strictly inside the unit disk to guarantee BIBO stability
    Invoked to justify the pole reparameterization technique.
  • domain assumption: The target discrete-time dynamics admit a low-rank rational representation
    Central modeling assumption for the architecture to be expressive enough.
invented entities (1)
  • ZNO recurrent layer (no independent evidence)
    purpose: To combine causal recurrence, stable pole parameterization, and low-rank mixing in one z-domain block
    New architectural component introduced by the paper; no independent evidence outside this work.
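
The first axiom in the ledger follows from a standard geometric-series bound. The derivation below assumes a causal pole-residue form with simple poles $p_k$ and residues $c_k$; these symbols are chosen for illustration and may not match the paper's notation.

```latex
\begin{aligned}
H(z) &= \sum_{k=1}^{K} \frac{c_k}{1 - p_k z^{-1}}
\quad\Longrightarrow\quad
h[n] = \sum_{k=1}^{K} c_k\, p_k^{\,n}, \qquad n \ge 0, \\
\sum_{n=0}^{\infty} \bigl| h[n] \bigr|
&\le \sum_{k=1}^{K} |c_k| \sum_{n=0}^{\infty} |p_k|^{n}
= \sum_{k=1}^{K} \frac{|c_k|}{1 - |p_k|} < \infty
\quad \text{whenever } |p_k| < 1 \text{ for all } k .
\end{aligned}
```

An absolutely summable impulse response is equivalent to BIBO stability for a linear time-invariant system, and the bound grows like $1/(1 - |p_k|)$ as a pole approaches the unit circle, which is why lightly damped targets are the hard regime.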



Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages

  1. Gerben I. Beintema and Maarten Schoukens. nonlinear-benchmarks: The official dataloader of nonlinearbenchmark.org, 2025. Python package, version 1.0.1.

  2. Qianying Cao, Somdatta Goswami, and George Em Karniadakis. Laplace neural operator for solving differential equations. Nature Machine Intelligence, 6(6):631–640, 2024.

  3. Tianping Chen and Hong Chen. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Transactions on Neural Networks, 6(4):911–917, 1995.

  4. Albert Gu, Karan Goel, Ankit Gupta, and Christopher Ré. On the parameterization and initialization of diagonal state space models. In Advances in Neural Information Processing Systems (NeurIPS), volume 35, pages 35971–35983, 2022.

  5. Albert Gu, Karan Goel, and Christopher Ré. Efficiently modeling long sequences with structured state spaces. In International Conference on Learning Representations (ICLR), 2022.

  6. Alexandre Janot, Maxime Gautier, and Mathieu Brunot. Data set and reference models of EMPS. In 2019 Workshop on Nonlinear System Identification Benchmarks, Eindhoven, The Netherlands, April 10–12, 2019.

  7. Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, and Animashree Anandkumar. Neural operator: Learning maps between function spaces with applications to PDEs. Journal of Machine Learning Research, 24(89):1–97, 2023.

  8. Wilson Rocha Lacerda, Luan Pascoal Costa da Andrade, Samuel Carlos Pessoa Oliveira, and Samir Angelo Milani Martins. SysIdentPy: A Python package for system identification using NARMAX models. Journal of Open Source Software, 5(54):2384, 2020.

  9. Zongyi Li, Nikola B. Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Animashree Anandkumar. Fourier neural operator for parametric partial differential equations. In International Conference on Learning Representations (ICLR), 2021.

  10. Lennart Ljung. System Identification: Theory for the User. Prentice Hall, Upper Saddle River, NJ, 2nd edition, 1999.

  11. Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3:218–229, 2021.

  12. Alan V. Oppenheim, Ronald W. Schafer, and John R. Buck. Discrete-Time Signal Processing. Prentice Hall, Upper Saddle River, NJ, 2nd edition, 1999.

  13. Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-perfo...

  14. Johan Schoukens, Johan Suykens, and Lennart Ljung. Wiener–Hammerstein benchmark. In 15th IFAC Symposium on System Identification (SYSID 2009), St. Malo, France, July 6–8, 2009.

  15. Maarten Schoukens and Jean-Philippe Noël. Three benchmarks addressing open challenges in nonlinear system identification. IFAC-PapersOnLine, 50(1):446–451, 2017.

  16. Philippe Tillet, Hsiang-Tsung Kung, and David Cox. Triton: An intermediate language and compiler for tiled neural network computations. In Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, pages 10–19, 2019.

  17. Torbjörn Wigren and Johan Schoukens. Three free data sets for development and benchmarking in nonlinear system identification. In 2013 European Control Conference (ECC), pages 2933–

  18. Torbjörn Wigren and Maarten Schoukens. Coupled electric drives data set and reference models. Technical Report 2017-024, Department of Information Technology, Uppsala University, 2017.

    A. Parameter count of the ZNO layer. For the even pole counts used by the reported configurations, layer $\ell$ has $rw$ (in-projection) $+\, rw$ (out-projection) $+\, w^2 + w$ (skip) $+\, Kr$ (com...

  19. Direct z-plane parameterization. For a continuous-time pole $\mu$, the discrete pole is $p = \exp(\mu \Delta t)$, so an s-plane implementation must choose a sampling interval $\Delta t$ before evaluating the discrete recurrence. For native discrete-time data this interval is an interface choice rather than a physical modeling variable. ZNO parameterizes $p$ directly in the unit dis...

  20. Low-rank MIMO rational layer. The reference LNO layer applies a dense per-channel multiplication in the Laplace domain. The ZNO dynamic branch (3.2)–(3.3) is a low-rank factorization of a MIMO rational transfer matrix with $r \ll w$ state channels, and the full residual layer adds a dense pointwise skip. Under matched parameter budget this shifts capacity from s...

  21. Fused causal recurrent implementation. The reference LNO evaluates pole-residue terms using FFTs and dense frequency-by-pole contractions over the time grid and retained modes. In contrast, ZNO evaluates fixed-rank, fixed-pole filters by a causal recurrent scan whose cost is linear in $T$ for fixed $r$ and $K$.

    C. Training protocol and per-task configurations. Hyper-pa...
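
The s-plane to z-plane relation quoted in item 19, p = exp(µ∆t), can be explored with a short sketch. It illustrates why a model parameterized in the s-plane must commit to a sampling interval before its discrete pole is defined. The function name and the example pole and intervals are hypothetical values chosen for illustration.

```python
import cmath

def discretize_pole(mu, dt):
    """Map a continuous-time pole mu to its discrete counterpart p = exp(mu * dt)."""
    return cmath.exp(mu * dt)

mu = complex(-0.5, 20.0)        # hypothetical s-plane pole: decay rate 0.5, 20 rad/s
for dt in (0.001, 0.01, 0.1):   # hypothetical sampling intervals
    p = discretize_pole(mu, dt)
    # |p| = exp(Re(mu) * dt), so a left-half-plane pole lands inside the unit disk,
    # but its radius and angle both change with the chosen dt.
    print(f"dt={dt}: |p|={abs(p):.4f}, angle={cmath.phase(p):.4f} rad")
```

The same continuous pole yields a different discrete pole for each interval, whereas a direct z-plane parameterization stores the discrete pole itself and needs no such choice.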