Natural Gradient Bayesian Filtering: Geometry-Aware Filter for Dynamical Systems
Pith reviewed 2026-05-08 18:37 UTC · model grok-4.3
The pith
Natural gradient descent on the Gaussian manifold recovers the Kalman measurement update exactly in the linear case and extends geometrically to nonlinear filtering.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The NANO filter performs Bayesian filtering by iteratively applying natural gradient steps on the statistical manifold of multivariate Gaussians, using the Fisher information metric to update both mean and covariance parameters while guaranteeing that the covariance remains positive definite. In the linear-Gaussian case this procedure coincides exactly with the Kalman measurement update after one step; in nonlinear cases it supplies an iterative refinement of the Gaussian posterior approximation.
What carries the argument
Natural gradient descent on the statistical manifold of Gaussian distributions, where the Fisher-Rao metric defines the direction of steepest descent for the negative log-likelihood or KL divergence objective.
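For orientation, one common way to write a natural-gradient step for a Gaussian q = N(μ, Σ) is sketched below. This is the standard form from the natural-gradient variational inference literature and is not necessarily the exact parameterization used in the paper; the symbols f, β, μ_k, and Σ_k are assumptions introduced here for illustration.

```latex
% Assumed objective: J(\mu,\Sigma) = \mathrm{KL}\big(q \,\|\, p(x \mid y)\big) up to a constant,
% with f(x) = -\log p(y \mid x) - \log p(x) and q_k = \mathcal{N}(\mu_k, \Sigma_k).
% \beta \in (0, 1] is a step size.
\begin{aligned}
\Sigma_{k+1}^{-1} &= (1-\beta)\,\Sigma_k^{-1} + \beta\,\mathbb{E}_{q_k}\!\left[\nabla_x^2 f(x)\right], \\
\mu_{k+1} &= \mu_k - \beta\,\Sigma_{k+1}\,\mathbb{E}_{q_k}\!\left[\nabla_x f(x)\right].
\end{aligned}
```

Under these assumptions, setting β = 1 with a linear measurement y = Hx + v, v ~ N(0, R), and a Gaussian prior N(μ⁻, P⁻) makes f quadratic. The expected Hessian is then HᵀR⁻¹H + (P⁻)⁻¹ and the step lands on the information-form Kalman update, which is consistent with the core claim above.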
If this is right
- A single natural-gradient step recovers the exact Kalman measurement update when the system and measurement models are linear and the noise is Gaussian.
- Covariance matrices remain positive definite by construction throughout the iterative updates.
- The filter can be applied directly to nonlinear estimation tasks such as satellite attitude determination, SLAM, and state estimation on quadruped and humanoid robots.
- Prediction and measurement updates are treated uniformly as optimization steps on the same Gaussian manifold.
Where Pith is reading between the lines
- The geometric view may allow systematic incorporation of constraints or additional manifold structure (for example, on rotation groups) into the filter without ad-hoc projections.
- Because each step is an optimization move, the method could be combined with line-search or adaptive step-size rules to improve robustness in strongly nonlinear regimes.
- The exact recovery of Kalman in the linear case suggests that differences in performance on nonlinear problems arise purely from how the geometry-aware steps handle higher-order effects compared with linearization or sigma-point sampling.
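The step-size idea in the list above can be sketched as a damped update. This is a minimal illustration under stated assumptions, not the paper's algorithm: the range measurement `h`, the Monte Carlo approximation of the expectations, the Gauss-Newton curvature, and all function and variable names are hypothetical choices made here. With this curvature the new precision is a convex combination of positive definite matrices, so the covariance stays positive definite for any β in (0, 1]; the paper's own positive-definiteness mechanism may differ.

```python
import numpy as np

def h(x):
    """Hypothetical range measurement: distance from the origin."""
    return np.linalg.norm(x, axis=-1)

def jac_h(x):
    """Jacobian of h, one row per sample."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def damped_step(mu, Sigma, mu_prior, P_prior_inv, y, r, beta, rng, n_samples=2000):
    """One damped step with Monte Carlo expectations and Gauss-Newton curvature."""
    xs = rng.multivariate_normal(mu, Sigma, size=n_samples)
    J = jac_h(xs)                      # shape (n_samples, n)
    resid = y - h(xs)                  # shape (n_samples,)
    # Sample average of the gradient of f(x) = (y - h(x))^2 / (2 r) + prior term.
    grad = (-(J * resid[:, None]) / r + (xs - mu_prior) @ P_prior_inv).mean(axis=0)
    # Gauss-Newton curvature: PSD measurement term plus PD prior precision.
    curv = (J.T @ J) / (r * n_samples) + P_prior_inv
    # Convex combination of two positive definite precision matrices.
    prec = (1.0 - beta) * np.linalg.inv(Sigma) + beta * curv
    Sigma_new = np.linalg.inv(prec)
    Sigma_new = 0.5 * (Sigma_new + Sigma_new.T)  # remove round-off asymmetry
    mu_new = mu - beta * Sigma_new @ grad
    return mu_new, Sigma_new

rng = np.random.default_rng(1)
mu_prior = np.array([2.0, 1.0])
P_prior = np.diag([0.5, 0.5])
P_prior_inv = np.linalg.inv(P_prior)
y, r = 2.5, 0.01                       # measured range and its noise variance

mu, Sigma = mu_prior.copy(), P_prior.copy()
min_eigs = []
for _ in range(20):
    mu, Sigma = damped_step(mu, Sigma, mu_prior, P_prior_inv, y, r, beta=0.5, rng=rng)
    min_eigs.append(np.linalg.eigvalsh(Sigma).min())

print("final mean:", mu)
print("smallest covariance eigenvalue seen:", min(min_eigs))
```

A line search could replace the fixed `beta=0.5` by trying several values per iteration and keeping the one with the lowest estimated objective; that extension is left out to keep the sketch short.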
Load-bearing premise
That natural gradient steps on the Gaussian manifold produce a valid approximation to the true posterior in nonlinear systems without introducing new instabilities or biases beyond those already present in standard Gaussian filters.
What would settle it
Apply the NANO filter to a linear-Gaussian problem and check whether the mean and covariance after one step match the closed-form Kalman update to machine precision; or apply it to a nonlinear benchmark such as satellite attitude estimation and verify whether the covariance matrices remain positive definite while the estimation error stays below that of an unscented Kalman filter baseline.
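The linear-Gaussian half of this check can be sketched in a few lines. The code below is an illustration under assumptions, not the authors' implementation: it uses the standard natural-gradient update for a Gaussian with step size 1, and the function names, dimensions, and random test problem are all hypothetical. It compares one such step, started from the prior, with the closed-form Kalman measurement update.

```python
import numpy as np

def kalman_update(mu_prior, P_prior, H, R, y):
    """Closed-form Kalman measurement update (covariance form)."""
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)
    mu_post = mu_prior + K @ (y - H @ mu_prior)
    P_post = (np.eye(len(mu_prior)) - K @ H) @ P_prior
    return mu_post, P_post

def natural_gradient_step(mu, Sigma, mu_prior, P_prior, H, R, y, beta=1.0):
    """One natural-gradient step for a linear-Gaussian model (assumed standard form)."""
    R_inv = np.linalg.inv(R)
    P_prior_inv = np.linalg.inv(P_prior)
    # f is quadratic here, so the expected Hessian is constant and the
    # expected gradient equals the gradient evaluated at the current mean.
    hess = H.T @ R_inv @ H + P_prior_inv
    grad = -H.T @ R_inv @ (y - H @ mu) + P_prior_inv @ (mu - mu_prior)
    prec_new = (1.0 - beta) * np.linalg.inv(Sigma) + beta * hess
    Sigma_new = np.linalg.inv(prec_new)
    mu_new = mu - beta * Sigma_new @ grad
    return mu_new, Sigma_new

# Random, well-conditioned test problem (hypothetical sizes).
rng = np.random.default_rng(0)
n, m = 4, 2
A = rng.standard_normal((n, n))
P_prior = A @ A.T + n * np.eye(n)
B = rng.standard_normal((m, m))
R = B @ B.T + m * np.eye(m)
H = rng.standard_normal((m, n))
mu_prior = rng.standard_normal(n)
y = rng.standard_normal(m)

mu_kf, P_kf = kalman_update(mu_prior, P_prior, H, R, y)
mu_ng, P_ng = natural_gradient_step(mu_prior, P_prior, mu_prior, P_prior, H, R, y)

mean_gap = np.max(np.abs(mu_kf - mu_ng))
cov_gap = np.max(np.abs(P_kf - P_ng))
min_eig = np.linalg.eigvalsh(P_ng).min()
print("max mean difference:", mean_gap)
print("max covariance difference:", cov_gap)
print("smallest eigenvalue of updated covariance:", min_eig)
```

The nonlinear half of the check (positive definiteness and error relative to an unscented Kalman filter) needs the paper's benchmark models and is not attempted here.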
Original abstract
Bayesian filtering is a cornerstone of state estimation in complex systems such as aerospace systems, yet exact solutions are available only for linear Gaussian models. In practice,nonlinear systems are handled through tractable approximations,with Gaussian filters such as the extended and unscented Kalman filters being among the most widely used methods. This tutorial revisits Gaussian filtering from an information-geometric perspective, viewing the prediction and measurement update steps as inference procedures over state distributions. Within this framework, we introduce a geometry-aware Gaussian filtering approach that leverages natural gradient descent on the statistical manifold of Gaussian distributions. The resulting Natural Gradient Gaussian Approximation (NANO) filter iteratively refines the posterior mean and covariance while respecting the intrinsic geometry of the Gaussian family and preserving the positive definiteness of the covariance matrix. We further highlight fundamental connections to the classical Kalman filtering, showing that a single natural-gradient step exactly recovers the Kalman measurement update in the linear-Gaussian case. The practical implications of the proposed framework are illustrated through case studies in representative nonlinear estimation problems,including satellite attitude estimation, simultaneous localization and mapping, and state estimation for robotic systems including quadruped and humanoid robots.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a tutorial on Gaussian Bayesian filtering from an information-geometric viewpoint. It introduces the Natural Gradient Gaussian Approximation (NANO) filter, which performs iterative natural-gradient descent on the statistical manifold of Gaussian distributions to refine posterior mean and covariance. A central claim is that a single natural-gradient step exactly recovers the classical Kalman measurement update in the linear-Gaussian case; the method is shown to preserve positive definiteness by construction. The framework is illustrated via case studies on nonlinear problems including satellite attitude estimation, SLAM, and state estimation for quadruped and humanoid robots.
Significance. If the derivations are correct, the work supplies a geometrically principled reinterpretation of Gaussian filtering that recovers the Kalman filter as a special case and automatically respects the positive-definiteness constraint. The exact recovery result anchors the approach in classical theory and may help researchers design manifold-aware updates for robotics and aerospace applications. The tutorial format usefully connects optimization on statistical manifolds with filtering practice.
minor comments (2)
- Abstract: missing space in 'In practice,nonlinear systems'.
- Abstract: the list of case studies repeats 'including' ('including satellite attitude estimation, simultaneous localization and mapping, and state estimation for robotic systems including quadruped...').
Simulated Author's Rebuttal
We thank the referee for the careful reading and positive assessment of the manuscript. The summary accurately reflects the tutorial's focus on information geometry for Gaussian filtering, the introduction of the NANO filter, and the exact recovery of the Kalman update as a special case. We appreciate the recommendation for minor revision and the recognition of potential utility in robotics and aerospace applications. No specific major comments were listed in the report.
Circularity Check
No significant circularity detected
The paper's derivation chain is self-contained. The central result—that one natural-gradient step on the Gaussian statistical manifold recovers the Kalman measurement update—follows directly from the known coincidence between the Fisher information metric and the information-form update under linear-Gaussian assumptions; this is a standard geometric identity, not a redefinition or fit of the target quantity. The NANO filter's iterative refinement is defined as natural-gradient descent on the manifold, which by construction stays within the positive-definite cone; this is an intrinsic property of the chosen geometry rather than an input smuggled back as output. No self-citations are used as load-bearing premises for uniqueness or ansatz, and the abstract plus skeptic analysis show no equations that reduce the claimed prediction to a fitted parameter or prior result by construction. The argument therefore does not collapse to its own inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Gaussian distributions form a suitable statistical manifold for approximating posteriors in the considered dynamical systems
- domain assumption: Natural gradient descent on this manifold yields a valid refinement of the posterior
Lean theorems connected to this paper
- Cost / FunctionalEquation (J(x)=½(x+x⁻¹)−1 uniqueness), theorem washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "the parameter space of probability distributions [is] a Riemannian manifold, where the Riemannian metric is defined by Fisher information matrix ... I_ij(θ) = ∂²ψ/∂θ_i∂θ_j"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.