Model-Free Neural Filtering: A Comparison with Classical Filters in Nonlinear Systems
Pith reviewed 2026-05-16 10:01 UTC · model grok-4.3
The pith
Structured state-space models match strong classical filters in nonlinear systems without needing explicit models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Among neural estimators, structured state-space models (SSMs), in particular Mamba and Mamba-2, are consistently strong. They approach strong classical filters in several nonlinear systems and outperform weaker classical baselines without access to system models, while achieving substantially higher inference throughput. The relative strength is attributed to filtering-aligned inductive bias from recursive latent-state updates.
What carries the argument
Filtering-aligned inductive bias from recursive latent-state updates in structured state-space models, which makes them structurally closer to classical filters under fixed parameter budgets, finite data, and long-horizon evaluation.
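The structural closeness claimed here can be illustrated in the simplest possible setting. The sketch below is not taken from the paper: it assumes a scalar linear system with observation y = x + v, hypothetical parameter values `a`, `q`, `r`, and a Kalman filter run at its steady-state gain. Under those assumptions the filter's predict/correct step collapses to the same recursive form as a latent-state update h_t = A h_{t-1} + B y_t.

```python
import numpy as np

def steady_state_gain(a, q, r, iters=500):
    """Iterate the scalar Riccati recursion until the Kalman gain settles."""
    p, k = 1.0, 0.0
    for _ in range(iters):
        p_pred = a * a * p + q        # predicted error variance
        k = p_pred / (p_pred + r)     # gain for the observation y = x + v
        p = (1.0 - k) * p_pred        # posterior error variance
    return k

def kalman_fixed_gain(ys, a, k):
    """Predict/correct form of the filter with a fixed gain."""
    x_hat, out = 0.0, []
    for y in ys:
        x_pred = a * x_hat
        x_hat = x_pred + k * (y - x_pred)
        out.append(x_hat)
    return np.array(out)

def linear_recurrence(ys, A, B):
    """Generic recursive latent-state update h_t = A h_{t-1} + B y_t."""
    h, out = 0.0, []
    for y in ys:
        h = A * h + B * y
        out.append(h)
    return np.array(out)

rng = np.random.default_rng(0)
a, q, r = 0.95, 0.1, 0.5              # hypothetical system parameters
ys = rng.normal(size=200)             # any observation sequence will do here
k = steady_state_gain(a, q, r)
kf = kalman_fixed_gain(ys, a, k)
ssm = linear_recurrence(ys, A=(1.0 - k) * a, B=k)
max_gap = float(np.max(np.abs(kf - ssm)))   # differs only by rounding error
```

This covers only the linear, fixed-gain case. The nonlinear systems in the paper have no such closed form, so the sketch shows where the shared recursive structure comes from, not why it helps there.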
If this is right
- Neural estimators without system models can outperform weaker classical filters in nonlinear scenarios.
- Structured SSMs achieve substantially higher inference throughput than classical methods on tested hardware.
- Accurate model-based filters still dominate when their assumptions match the true system dynamics well.
- Recursive latent-state updates provide an inductive bias suited to filtering under fixed budgets and long horizons.
Where Pith is reading between the lines
- Neural filters could support state estimation in black-box environments where deriving explicit dynamics is impractical.
- The throughput advantage may enable real-time applications in resource-constrained settings.
- Extensions could test performance when training data includes noise levels that mismatch classical assumptions.
Load-bearing premise
The neural estimators can be trained purely from data in a manner that allows fair comparison to classical filters across multiple nonlinear scenarios without access to system models for the neural side.
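A minimal sketch of what "trained purely from data" could mean in practice is given below. Everything in it is an assumption made for illustration: the simulator, its noise levels `q` and `r`, the window length `w`, and especially the estimator, which is a least-squares linear map standing in for a neural network. The point is the information flow: the learner sees only observation windows and ground-truth state targets, never the transition function or the noise parameters.

```python
import numpy as np

def simulate(n_traj, T, rng, q=0.1, r=0.5):
    """Hypothetical simulator: returns true states and noisy observations."""
    x = np.zeros((n_traj, T))
    y = np.zeros((n_traj, T))
    x_prev = rng.normal(size=n_traj)
    for t in range(T):
        x_t = 0.9 * x_prev + 0.5 * np.sin(x_prev) + np.sqrt(q) * rng.normal(size=n_traj)
        x[:, t] = x_t
        y[:, t] = x_t + np.sqrt(r) * rng.normal(size=n_traj)
        x_prev = x_t
    return x, y

def windows(x, y, w):
    """Pair each causal window y[t-w+1..t] with the target state x[t]."""
    feats, targets = [], []
    for t in range(w - 1, y.shape[1]):
        feats.append(y[:, t - w + 1 : t + 1])
        targets.append(x[:, t])
    return np.concatenate(feats), np.concatenate(targets)

def add_bias(F):
    """Append a constant column so the linear map has an intercept."""
    return np.hstack([F, np.ones((F.shape[0], 1))])

rng = np.random.default_rng(0)
x_train, y_train = simulate(200, 50, rng)
x_test, y_test = simulate(100, 50, rng)

w = 8
F_train, t_train = windows(x_train, y_train, w)
F_test, t_test = windows(x_test, y_test, w)

# Least squares minimises the state MSE on the training pairs.
theta, *_ = np.linalg.lstsq(add_bias(F_train), t_train, rcond=None)

mse_learned = float(np.mean((add_bias(F_test) @ theta - t_test) ** 2))
mse_raw = float(np.mean((F_test[:, -1] - t_test) ** 2))  # y_t used directly as the estimate
```

Note that the ground-truth states still come from a simulator, so the premise depends on how that simulator relates to the one given to the classical filters.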
What would settle it
Measure the estimation error of Mamba-based estimators against particle filters on a new nonlinear system whose model is known to the particle filter, while the neural estimator is trained only on a limited set of trajectories.
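The classical half of such a test could look like the sketch below: a bootstrap particle filter scored by RMSE against the true states. The system, the noise levels, the particle count, and the function names are all hypothetical choices for illustration and are not the paper's benchmarks. A neural estimator would be scored with the same `rmse` function on the same trajectories.

```python
import numpy as np

def f(x):
    """Hypothetical transition function, known to the particle filter only."""
    return 0.9 * x + 0.5 * np.sin(x)

def simulate(T, rng, q=0.1, r=0.5):
    """One trajectory of true states and noisy observations y = x + v."""
    x, xs, ys = rng.normal(), [], []
    for _ in range(T):
        x = f(x) + np.sqrt(q) * rng.normal()
        xs.append(x)
        ys.append(x + np.sqrt(r) * rng.normal())
    return np.array(xs), np.array(ys)

def bootstrap_pf(ys, rng, n_particles=500, q=0.1, r=0.5):
    """Bootstrap particle filter returning the posterior-mean estimate per step."""
    particles = rng.normal(size=n_particles)   # samples from the N(0, 1) prior
    estimates = np.empty(len(ys))
    for t, y in enumerate(ys):
        # Propagate through the known dynamics with process noise.
        particles = f(particles) + np.sqrt(q) * rng.normal(size=n_particles)
        # Weight by the Gaussian observation likelihood, in log space for stability.
        log_w = -0.5 * (y - particles) ** 2 / r
        weights = np.exp(log_w - log_w.max())
        weights /= weights.sum()
        estimates[t] = np.sum(weights * particles)
        # Multinomial resampling at every step.
        particles = particles[rng.choice(n_particles, size=n_particles, p=weights)]
    return estimates

def rmse(est, truth):
    return float(np.sqrt(np.mean((est - truth) ** 2)))

rng = np.random.default_rng(0)
xs, ys = simulate(200, rng)
pf_rmse = rmse(bootstrap_pf(ys, rng), xs)
obs_rmse = rmse(ys, xs)   # error of using the raw observation as the estimate
```

Repeating the comparison while shrinking the neural training set would show at what data budget the model-based filter's advantage becomes decisive.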
Original abstract
Neural network models are increasingly used for state estimation in control and decision-making, yet it remains unclear to what extent they behave as principled filters in nonlinear dynamical systems. Unlike classical filters, which rely on explicit dynamics and noise models, neural estimators can be trained purely from data. We present a systematic comparison between model-free neural estimators and classical filtering methods across multiple nonlinear scenarios. On the neural side, we evaluate Transformer-based models, recurrent neural networks, and state-space models; on the classical side, we compare against particle filters and nonlinear Kalman filters. Results show that structured state-space models (SSMs), in particular Mamba and Mamba-2, are consistently strong among neural estimators. They approach strong classical filters in several nonlinear systems and outperform weaker classical baselines without access to system models, while the evaluated neural implementations achieve substantially higher inference throughput on the tested hardware. Accurate model-based filters can still dominate when their assumptions are well matched. We attribute the relative strength of SSMs to filtering-aligned inductive bias: recursive latent-state updates make them structurally closer to classical filters under fixed parameter budgets, finite data, and long-horizon evaluation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper conducts a systematic empirical comparison of model-free neural state estimators—including Transformers, RNNs, and structured state-space models (SSMs) such as Mamba and Mamba-2—against classical model-based filters (particle filters and nonlinear Kalman filters) across multiple nonlinear dynamical systems. It claims that SSMs are the strongest neural performers, approaching the accuracy of strong classical filters while outperforming weaker baselines, without access to system models, and with substantially higher inference throughput; the relative strength is attributed to recursive latent-state updates providing filtering-aligned inductive bias under fixed parameter budgets and long-horizon evaluation.
Significance. If the results hold under strictly model-free training, the work would offer practical guidance on neural architecture choice for sequential estimation tasks in control and decision-making, highlighting SSMs as efficient alternatives to classical methods when explicit models are unavailable. The throughput comparison adds engineering relevance for real-time deployment.
Major comments (2)
- [§4 and §4.1] §4 (Experimental Setup) and §4.1 (Data Generation): The protocol for generating training sequences and the supervision signal (state estimation MSE versus observation likelihood only) is not specified. This leaves open whether neural models receive ground-truth states from the identical simulator used to instantiate the classical filters, which would undermine the central claim that the neural side operates 'purely from data' without system-model access while classical filters receive models only at inference.
- [§5 and tables] §5 (Results) and associated tables: No error bars, number of independent runs, or statistical significance tests are reported for the performance metrics. Without these, it is impossible to determine whether the reported outperformance of Mamba/Mamba-2 over weaker classical baselines or their approach to strong classical filters is reliable rather than within-run variance.
Minor comments (2)
- [Abstract] Abstract: The claim of 'substantially higher inference throughput' is not quantified (e.g., no speedup factor or hardware specification), reducing clarity for readers interested in practical deployment.
- [§3] §3 (Methods): Acronyms such as SSM are used before explicit definition; add a brief expansion on first use for accessibility.
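One way to address the throughput comment is to report steps per second together with a speedup factor and the platform it was measured on. The sketch below is an illustration only: `cheap_step` and `costly_step` are hypothetical stand-ins for a single estimator update and are not the paper's models, and the loop counts are arbitrary.

```python
import platform
import time

def throughput(step_fn, n_steps=10_000, warmup=100):
    """Filtering steps per second for a single-step callable."""
    for _ in range(warmup):          # warm-up calls are excluded from timing
        step_fn()
    start = time.perf_counter()
    for _ in range(n_steps):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_steps / elapsed

# Hypothetical stand-ins for one estimator step each.
def cheap_step():
    return sum(range(10))

def costly_step():
    return sum(range(200))

fast = throughput(cheap_step)
slow = throughput(costly_step)
report = {
    "machine": platform.machine(),
    "python": platform.python_version(),
    "fast_steps_per_s": fast,
    "slow_steps_per_s": slow,
    "speedup": fast / slow,
}
```

For GPU-backed models the timed region would also need explicit device synchronization, since kernel launches return before the work completes.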
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help improve the clarity and rigor of our work. We address the major concerns below and will make the corresponding revisions to the manuscript.
- Referee: [§4 and §4.1] §4 (Experimental Setup) and §4.1 (Data Generation): The protocol for generating training sequences and the supervision signal (state estimation MSE versus observation likelihood only) is not specified. This leaves open whether neural models receive ground-truth states from the identical simulator used to instantiate the classical filters, which would undermine the central claim that the neural side operates 'purely from data' without system-model access while classical filters receive models only at inference.
  Authors: We thank the referee for pointing this out. The neural estimators are trained in a supervised manner on simulated trajectories, using ground-truth states as targets for the MSE loss; this is the standard protocol for learning model-free filters from data. Critically, the neural models receive no explicit dynamics functions, noise parameters, or other system-model information at training or inference time; apart from the ground-truth state targets used in training, they operate solely on the observed sequences. Classical filters, by contrast, are instantiated with the full model at test time. To eliminate ambiguity we will expand §4.1 with an explicit description of the data-generation procedure, the precise supervision signal (state MSE), and a statement confirming that no model information is supplied to the neural side beyond the raw training sequences.
  Revision: yes
- Referee: [§5 and tables] §5 (Results) and associated tables: No error bars, number of independent runs, or statistical significance tests are reported for the performance metrics. Without these, it is impossible to determine whether the reported outperformance of Mamba/Mamba-2 over weaker classical baselines or their approach to strong classical filters is reliable rather than within-run variance.
  Authors: We agree that the absence of error bars and statistical analysis weakens the current presentation. We will rerun all experiments with at least five independent random seeds per configuration, report mean performance together with standard deviations in the revised tables, and add paired statistical significance tests (e.g., t-tests) for the key comparisons between Mamba/Mamba-2 and the classical baselines.
  Revision: yes
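The reporting promised in the second response can be sketched with the standard library alone. The per-seed RMSE values below are placeholders invented for illustration and are not results from the paper. The pairing assumes that seed i uses the same test trajectories for both methods, which the rebuttal does not state.

```python
import math
import statistics

def summarize(errors):
    """Mean and sample standard deviation across seeds."""
    return statistics.mean(errors), statistics.stdev(errors)

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1

# Placeholder per-seed RMSE values, invented for illustration only.
neural_rmse = [0.31, 0.33, 0.30, 0.34, 0.32]
baseline_rmse = [0.36, 0.37, 0.33, 0.39, 0.35]

neural_mean, neural_std = summarize(neural_rmse)
baseline_mean, baseline_std = summarize(baseline_rmse)
t_stat, dof = paired_t(neural_rmse, baseline_rmse)

# Two-sided 5% critical value of Student's t with 4 degrees of freedom.
T_CRIT_DF4 = 2.776
significant = abs(t_stat) > T_CRIT_DF4
```

With only five seeds the test has few degrees of freedom, so reporting the per-seed differences alongside the statistic would let readers judge the effect size directly.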
Circularity Check
No circularity in empirical comparison of neural and classical filters
The paper reports an empirical study comparing model-free neural estimators (Transformers, RNNs, SSMs including Mamba) to classical particle and nonlinear Kalman filters across nonlinear dynamical systems. No derivation chain, first-principles predictions, or fitted parameters are claimed; performance results are obtained from direct experiments on simulated data. The attribution of SSM strength to 'filtering-aligned inductive bias' is an interpretive remark after the fact and does not reduce any quantitative claim to a self-defined quantity or self-citation. Training is described as 'purely from data' with no equations that would make the reported metrics tautological by construction. The comparison is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: neural estimators can be trained purely from data without access to the underlying system equations
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: structured state-space models (SSMs), in particular Mamba and Mamba-2
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- On the Generalization Properties of Selective State-Space Models for Filtering Tasks for Unknown Systems
  Selective state-space models achieve online filtering for unknown systems from the same class with generalization bounds derived under appropriate assumptions.