ZNO: Stable Rational Neural Operators in the Z-Domain for Discrete-Time Dynamics
Pith reviewed 2026-05-09 16:33 UTC · model grok-4.3
The pith
ZNO learns stable rational discrete-time filters by parameterizing poles directly in the z-domain.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ZNO constructs a neural operator whose layers are stable low-rank MIMO rational filters expressed directly in the z-plane. Stability is enforced by a unit-disk pole constraint, poles are made directly interpretable, and each layer combines causal recurrence, low-rank mixing, smooth pole reparameterization, and an optional short FIR branch. This design yields the lowest mean error across a five-bin sweep of near-unit-circle long-memory dynamics and the lowest mean error on public benchmarks whose underlying behavior matches stable rational discrete-time filters.
What carries the argument
The z-domain rational recurrent layer, which realizes the operator as a causal rational transfer function with low-rank channel mixing and reparameterized stable poles inside the unit disk.
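To make the pieces of that layer concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a sigmoid map for the pole radius (the source only says the reparameterization is smooth and keeps poles in the unit disk), takes the real part of the complex state instead of enforcing conjugate pole pairs, and omits the skip path and the FIR branch. The names `stable_pole`, `rational_layer`, `w_in`, `w_out`, and `residues` are hypothetical.

```python
import cmath
import math

def stable_pole(raw_radius, raw_angle):
    """Map two unconstrained reals to a complex pole with magnitude in (0, 1).

    Assumption: a sigmoid on the radius; the paper's exact map is not given here.
    """
    radius = 1.0 / (1.0 + math.exp(-raw_radius))
    return cmath.rect(radius, raw_angle)

def rational_layer(x, w_in, w_out, poles, residues):
    """Causal low-rank rational filter evaluated as a recurrence.

    x:        list of T input vectors, each of length w
    w_in:     r rows of length w (projects width w down to rank r)
    w_out:    w rows of length r (projects rank r back to width w)
    poles:    K complex poles shared by all rank channels
    residues: r rows of K complex residues
    """
    r, K = len(w_in), len(poles)
    state = [[0j] * K for _ in range(r)]
    outputs = []
    for x_t in x:
        # Low-rank input mixing: u has r entries.
        u = [sum(a * b for a, b in zip(row, x_t)) for row in w_in]
        # One-pole recurrences s_t = p * s_{t-1} + u_t, one per (channel, pole).
        for j in range(r):
            for k in range(K):
                state[j][k] = poles[k] * state[j][k] + u[j]
        # Residue-weighted sum of pole states; keep the real part only.
        v = [sum(c * s for c, s in zip(residues[j], state[j])).real
             for j in range(r)]
        # Low-rank output mixing back to width w.
        outputs.append([sum(a * b for a, b in zip(row, v)) for row in w_out])
    return outputs
```

Because each output at step t depends only on inputs up to t, the sketch is causal by construction, and the cost grows linearly with the sequence length for fixed r and K.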
If this is right
- With validation-selected configurations, ZNO achieves the lowest mean error across the controlled discrete system-identification tasks; under matched parameter budgets it is not uniformly dominant.
- Its advantage is clearest on stable rational dynamics with lightly damped poles near the unit circle, and it holds across memory lengths from roughly 10 steps to roughly 100-200 steps.
- On public nonlinear benchmarks ZNO records the lowest mean error on the benchmarks whose dynamics align with stable rational discrete-time filters.
- Classical or state-space methods remain preferable on some systems that deviate from this form.
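The link between "near the unit circle" and "long memory" can be made quantitative with a short calculation. The sketch below assumes memory length means the e-folding time of a single pole, that is, the number of steps n at which |p|^n has decayed to 1/e. The paper may define memory length differently, so the function names and the definition are illustrative only.

```python
import math

def efold_memory(radius):
    """Number of steps for radius**n to decay to 1/e (assumed memory definition)."""
    return -1.0 / math.log(radius)

def radius_for_memory(steps):
    """Pole radius whose e-folding time equals the given number of steps."""
    return math.exp(-1.0 / steps)

for steps in (10, 100, 200):
    radius = radius_for_memory(steps)
    print(f"{steps:>3} steps: |p| = {radius:.4f}, gap to unit circle = {1 - radius:.4f}")
```

Under this definition a 10-step memory corresponds to a radius of about 0.905 and a 200-step memory to about 0.995, so the long-memory end of the sweep sits within roughly half a percent of the stability boundary.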
Where Pith is reading between the lines
- Direct access to learned poles could let users apply classical control-analysis tools to inspect or modify the trained model.
- Architectural stability constraints may reduce reliance on post-training stabilization tricks in other recurrent architectures.
- The same z-domain construction could be combined with continuous-time operators to handle hybrid or multi-rate problems.
- Low-rank mixing indicates that high-dimensional discrete dynamics can often be captured without full dense MIMO parameterizations.
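As an illustration of the first point, a learned discrete-time pole can be translated into the natural frequency and damping ratio used in classical analysis. This is a hedged sketch rather than anything the paper provides: `pole_to_modal` is a hypothetical helper, and `dt` is an assumed sample interval (for native discrete-time data, `dt=1.0` simply expresses frequency per sample).

```python
import cmath
import math

def pole_to_modal(pole, dt=1.0):
    """Return (natural frequency, damping ratio) for a discrete-time pole.

    Uses the equivalent continuous-time pole s = log(pole) / dt on the
    principal branch, then wn = |s| and zeta = -Re(s) / |s|.
    """
    s = cmath.log(pole) / dt
    wn = abs(s)
    if wn == 0.0:
        # pole == 1: a pure integrator, damping ratio is undefined.
        return 0.0, math.nan
    return wn, -s.real / wn

# Example: a lightly damped pole close to the unit circle.
wn, zeta = pole_to_modal(cmath.rect(0.99, 0.3))
print(f"natural frequency {wn:.4f} rad/sample, damping ratio {zeta:.4f}")
```

A small damping ratio here corresponds to the lightly damped, near-unit-circle regime that the page identifies as ZNO's strongest setting.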
Load-bearing premise
The target dynamics can be well approximated by stable rational MIMO systems that fit inside the low-rank mixing and pole reparameterization without loss of needed expressiveness.
What would settle it
A controlled experiment on a ground-truth system whose dynamics require either unstable poles or non-rational behavior, in which ZNO produces higher error than an unconstrained baseline.
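A toy version of such an experiment can be sketched with a scalar one-pole system. This is not ZNO and not the paper's protocol: the ground-truth pole of 1.02, the least-squares fit, and the clipping to 0.999 (a crude stand-in for a unit-disk constraint) are all assumptions chosen for illustration.

```python
import random

def simulate(pole, u):
    """Free-run the recurrence y_t = pole * y_{t-1} + u_t starting from zero."""
    y, prev = [], 0.0
    for u_t in u:
        prev = pole * prev + u_t
        y.append(prev)
    return y

def fit_pole(u, y):
    """Least-squares estimate of the pole from input/output data."""
    num = sum((y[t] - u[t]) * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

random.seed(0)
u = [random.gauss(0.0, 1.0) for _ in range(200)]
y_true = simulate(1.02, u)            # ground truth with a pole outside the unit disk

pole_free = fit_pole(u, y_true)       # unconstrained baseline
pole_stable = min(pole_free, 0.999)   # stand-in for a stability constraint

err_free = mse(simulate(pole_free, u), y_true)
err_stable = mse(simulate(pole_stable, u), y_true)
print(f"unconstrained pole {pole_free:.4f}, error {err_free:.3e}")
print(f"stable pole        {pole_stable:.4f}, error {err_stable:.3e}")
```

The unconstrained fit recovers the unstable pole while the constrained model cannot, which is the qualitative outcome the settling experiment would look for at scale.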
Original abstract
We introduce the Z-Domain Neural Operator (ZNO), a causal neural operator whose layers are stable low-rank multiple-input multiple-output (MIMO) rational filters parameterized directly in the $z$-plane. ZNO addresses a limitation of existing operator learning methods, many of which are primarily tailored for continuous-time problems, while a large class of system-identification problems is intrinsically discrete-time. The $z$-domain form expresses stability as a unit-disk pole constraint and makes learned discrete-time poles directly readable. The model combines low-rank channel mixing, smooth stable pole reparameterization, causal recurrence, and an optional short finite impulse response (FIR) branch in a single $z$-domain rational recurrent layer. Across controlled discrete system-identification experiments, ZNO's advantage is most evident when the target dynamics are stable rational systems with lightly damped poles near the unit circle. Under matched parameter budgets, ZNO is not uniformly dominant; however, with validation-selected configurations, the same architecture can achieve the lowest mean error across the controlled tasks. A five-bin difficulty sweep over near-unit-circle / long-memory dynamics shows that ZNO has the lowest mean error across memory regimes, from short (approximately 10 steps) to long (approximately 100-200 steps). On five public nonlinear system-identification benchmarks, ZNO is competitive with neural operator and state-space baselines, achieving the lowest mean error on benchmarks whose dynamics align with stable rational discrete-time filters, while classical or state-space baselines remain preferable on some systems. These results position ZNO as a strong model for stable rational discrete-time dynamics, especially in near-unit-circle and long-memory regimes, but not as a universal replacement for specialized system-identification methods.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Z-Domain Neural Operator (ZNO), a causal neural operator for discrete-time dynamics whose layers are implemented as stable low-rank MIMO rational filters parameterized directly in the z-plane. Stability is enforced via a unit-disk pole constraint with smooth reparameterization; the architecture combines this with low-rank channel mixing, causal recurrence, and an optional short FIR branch. On controlled discrete system-identification tasks, ZNO shows advantages for targets with lightly damped poles near the unit circle and long memory, achieving the lowest mean error in a five-bin difficulty sweep; on five public nonlinear benchmarks it is competitive overall and lowest on those whose dynamics align with stable rational discrete-time filters, though not uniformly dominant under matched budgets.
Significance. If the empirical claims hold under the stated modeling assumptions, ZNO supplies a targeted, stability-guaranteed architecture for discrete-time operator learning that makes poles directly interpretable and respects the unit-disk constraint by construction. This addresses a gap left by continuous-time-centric neural operators and could be useful in system-identification settings where long-memory dynamics dominate. The work also illustrates the value of embedding rational-filter inductive biases rather than relying solely on generic sequence models.
major comments (2)
- [Abstract and §4 (controlled tasks)] Abstract and controlled-experiments description: the headline claim that ZNO attains the lowest mean error across the five-bin sweep over near-unit-circle/long-memory dynamics is load-bearing on the untested assumption that the generated targets are well-approximated by low-rank stable rational MIMO systems whose cross-channel dependencies fit the low-rank mixing and whose poles remain expressible after the unit-disk-plus-smoothness reparameterization. Without an explicit check (e.g., pole locations of the synthetic systems, rank-ablation, or expressiveness comparison), the advantage may be confined to the narrow subclass the architecture can represent exactly rather than the broader regime advertised.
- [§5 (public benchmarks)] Results on public benchmarks and validation protocol: the statement that ZNO achieves lowest mean error on benchmarks whose dynamics align with stable rational filters, together with the use of validation-selected configurations to reach the reported numbers, risks post-hoc selection. The paper should either fix the hyper-parameter regime in advance, report performance for a single canonical configuration across all tasks, or supply statistical tests over multiple independent runs to substantiate the comparative claims.
minor comments (2)
- [Method description] The low-rank dimension for channel mixing and the FIR branch length are listed as free parameters; a short sensitivity table or default-value justification would improve reproducibility.
- [§3] Notation for the z-domain rational filter (poles, residues, low-rank factors) should be collected in a single table or equation block for quick reference.
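One possible shape for such a reference block is sketched below in assumed notation. The paper's own symbols from §3 are not reproduced on this page, so $W_{\mathrm{in}}$, $W_{\mathrm{out}}$, $W_{\mathrm{skip}}$, $g_j$, and $c_{jk}$ are placeholders; $w$, $r$, and $K$ are read as layer width, mixing rank, and pole count, and the optional FIR branch is left out.

```latex
\begin{aligned}
H(z) &= W_{\mathrm{out}}\,\operatorname{diag}\bigl(g_1(z),\dots,g_r(z)\bigr)\,W_{\mathrm{in}} + W_{\mathrm{skip}}, \\
g_j(z) &= \sum_{k=1}^{K} \frac{c_{jk}}{1 - p_k z^{-1}}, \qquad |p_k| < 1, \\
W_{\mathrm{in}} &\in \mathbb{R}^{r \times w}, \quad
W_{\mathrm{out}} \in \mathbb{R}^{w \times r}, \quad
W_{\mathrm{skip}} \in \mathbb{R}^{w \times w}, \quad r \ll w .
\end{aligned}
```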
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, providing the strongest honest defense of the manuscript while agreeing to revisions that strengthen the claims without misrepresentation.
- Referee: [Abstract and §4 (controlled tasks)] Abstract and controlled-experiments description: the headline claim that ZNO attains the lowest mean error across the five-bin sweep over near-unit-circle/long-memory dynamics is load-bearing on the untested assumption that the generated targets are well-approximated by low-rank stable rational MIMO systems whose cross-channel dependencies fit the low-rank mixing and whose poles remain expressible after the unit-disk-plus-smoothness reparameterization. Without an explicit check (e.g., pole locations of the synthetic systems, rank-ablation, or expressiveness comparison), the advantage may be confined to the narrow subclass the architecture can represent exactly rather than the broader regime advertised.
  Authors: We agree that explicit verification of the synthetic target properties would strengthen the interpretation. The controlled tasks generate stable discrete-time linear MIMO systems with poles deliberately sampled near the unit circle (light damping, long memory) to probe exactly the regime where rational filters excel, as stated in §4. In revision we will add a supplementary figure with the empirical distribution of pole magnitudes and angles across the five difficulty bins, confirming all poles lie inside the unit disk and match the targeted damping/memory characteristics. We will also include a low-rank ablation (varying the channel-mixing rank) showing that performance saturates at modest ranks consistent with the generated cross-channel dependencies, and a short discussion noting that the smooth reparameterization is bijective for all stable poles. These additions demonstrate that the observed advantage arises from the architecture's inductive bias rather than an exact representational match.
  revision: yes
- Referee: [§5 (public benchmarks)] Results on public benchmarks and validation protocol: the statement that ZNO achieves lowest mean error on benchmarks whose dynamics align with stable rational filters, together with the use of validation-selected configurations to reach the reported numbers, risks post-hoc selection. The paper should either fix the hyper-parameter regime in advance, report performance for a single canonical configuration across all tasks, or supply statistical tests over multiple independent runs to substantiate the comparative claims.
  Authors: We accept that validation-selected configurations introduce a risk of post-hoc selection. In the revised manuscript we will adopt a single fixed hyper-parameter regime (chosen once via a preliminary study on a single representative benchmark and then frozen) and report its performance uniformly across all five public tasks. We will additionally supply mean and standard deviation over five independent random seeds for each method, providing statistical support for the comparative statements. This protocol removes per-task tuning while preserving the observation that ZNO is strongest on benchmarks whose dynamics are well-aligned with stable rational discrete-time filters.
  revision: yes
Circularity Check
No circularity: architecture and empirical results are independent of reported metrics
The paper defines the ZNO architecture, its z-domain rational filter parameterization, unit-disk stability constraint, low-rank mixing, and causal recurrence independently of any performance numbers. All reported results (five-bin sweep, benchmark errors) are obtained from separate training and evaluation runs on controlled tasks and public datasets. No equation reduces a claimed prediction to a fitted input by construction, no uniqueness theorem is imported from self-citation to force the form, and no ansatz is smuggled via prior work. The derivation chain consists of standard neural operator design choices followed by experimental validation; the central claims therefore remain falsifiable outside the fitted values.
Axiom & Free-Parameter Ledger
free parameters (2)
- low-rank dimension for channel mixing
- FIR branch length
axioms (2)
- domain assumption: Poles of the rational filter must lie strictly inside the unit disk to guarantee BIBO stability
- domain assumption: The target discrete-time dynamics admit a low-rank rational representation
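The first axiom follows from a standard bound, shown here as a worked step for a causal filter with simple poles $p_k$ and residues $c_k$ (generic symbols, not the paper's notation). Absolute summability of the impulse response is what gives bounded-input bounded-output stability, since $|y[n]| \le \|h\|_1 \, \sup_m |u[m]|$.

```latex
h[n] = \sum_{k=1}^{K} c_k\, p_k^{\,n} \quad (n \ge 0)
\qquad\Longrightarrow\qquad
\sum_{n=0}^{\infty} \bigl|h[n]\bigr|
\;\le\; \sum_{k=1}^{K} |c_k| \sum_{n=0}^{\infty} |p_k|^{n}
\;=\; \sum_{k=1}^{K} \frac{|c_k|}{1 - |p_k|}
\;<\; \infty
\quad \text{whenever } |p_k| < 1 \text{ for all } k .
```

The bound grows without limit as any $|p_k|$ approaches 1, which is why the near-unit-circle regime is both the hardest to fit and the one where the constraint matters most.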
invented entities (1)
- ZNO recurrent layer: no independent evidence
Reference graph
Works this paper leans on
[1] Gerben I. Beintema and Maarten Schoukens. nonlinear-benchmarks: The official dataloader of nonlinearbenchmark.org, 2025. Python package, version 1.0.1.
[2] Qianying Cao, Somdatta Goswami, and George Em Karniadakis. Laplace neural operator for solving differential equations. Nature Machine Intelligence, 6(6):631–640, 2024.
[3] Tianping Chen and Hong Chen. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Transactions on Neural Networks, 6(4):911–917, 1995.
[4] Albert Gu, Karan Goel, Ankit Gupta, and Christopher Ré. On the parameterization and initialization of diagonal state space models. In Advances in Neural Information Processing Systems (NeurIPS), volume 35, pages 35971–35983, 2022.
[5] Albert Gu, Karan Goel, and Christopher Ré. Efficiently modeling long sequences with structured state spaces. In International Conference on Learning Representations (ICLR), 2022.
[6] Alexandre Janot, Maxime Gautier, and Mathieu Brunot. Data set and reference models of EMPS. In 2019 Workshop on Nonlinear System Identification Benchmarks, Eindhoven, The Netherlands, April 10–12, 2019.
[7] Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, and Animashree Anandkumar. Neural operator: Learning maps between function spaces with applications to PDEs. Journal of Machine Learning Research, 24(89):1–97, 2023.
[8] Wilson Rocha Lacerda, Luan Pascoal Costa da Andrade, Samuel Carlos Pessoa Oliveira, and Samir Angelo Milani Martins. SysIdentPy: A Python package for system identification using NARMAX models. Journal of Open Source Software, 5(54):2384, 2020.
[9] Zongyi Li, Nikola B. Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Animashree Anandkumar. Fourier neural operator for parametric partial differential equations. In International Conference on Learning Representations (ICLR), 2021.
[10] Lennart Ljung. System Identification: Theory for the User. Prentice Hall, Upper Saddle River, NJ, 2nd edition, 1999.
[11] Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3:218–229, 2021.
[12] Alan V. Oppenheim, Ronald W. Schafer, and John R. Buck. Discrete-Time Signal Processing. Prentice Hall, Upper Saddle River, NJ, 2nd edition, 1999.
[13] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-perfo...
[14] Johan Schoukens, Johan Suykens, and Lennart Ljung. Wiener–Hammerstein benchmark. In 15th IFAC Symposium on System Identification (SYSID 2009), St. Malo, France, July 6–8, 2009.
[15] Maarten Schoukens and Jean-Philippe Noël. Three benchmarks addressing open challenges in nonlinear system identification. IFAC-PapersOnLine, 50(1):446–451, 2017.
[16] Philippe Tillet, Hsiang-Tsung Kung, and David Cox. Triton: An intermediate language and compiler for tiled neural network computations. In Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, pages 10–19, 2019.
[17] Torbjörn Wigren and Johan Schoukens. Three free data sets for development and benchmarking in nonlinear system identification. In 2013 European Control Conference (ECC), pages 2933–
[18] Torbjörn Wigren and Maarten Schoukens. Coupled electric drives data set and reference models. Technical Report 2017-024, Department of Information Technology, Uppsala University, 2017.
A Parameter count of the ZNO layer
For the even pole counts used by the reported configurations, layer ℓ has rw (in-projection) + rw (out-projection) + $w^2$ + w (skip) + Kr (com...
[19]
Direct z-plane parameterization.For a continuous-time pole µ, the discrete pole is p= exp(µ∆t) , so an s-plane implementation must choose a sampling interval ∆t before evaluating the discrete recurrence. For native discrete-time data this interval is an interface choice rather than a physical modeling variable. ZNO parameterizes p directly in the unit dis...
-
[20]
Low-rank MIMO rational layer.The reference LNO layer applies a dense per-channel multiplication in the Laplace domain. The ZNO dynamic branch (3.2)–(3.3) is a low-rank factorization of a MIMO rational transfer matrix with r≪w state channels, and the full residual layer adds a dense pointwise skip. Under matched parameter budget this shifts capacity from s...
-
[21]
Fused causal recurrent implementation.The reference LNO evaluates pole-residue terms using FFTs and dense frequency-by-pole contractions over the time grid and retained modes. In contrast, ZNO evaluates fixed-rank, fixed-pole filters by a causal recurrent scan whose cost is linear inTfor fixedrandK. C Training protocol and per-task configurations Hyper-pa...
work page 2048