arXiv: 2409.14585 · v3 · submitted 2024-09-22 · math.NA · cs.NA · math.PR · stat.CO · stat.ML

A convergent scheme for the Bayesian filtering problem based on the Fokker–Planck equation and deep splitting

Pith reviewed 2026-05-23 20:58 UTC · model grok-4.3

classification: math.NA · cs.NA · math.PR · stat.CO · stat.ML
keywords: nonlinear filtering · Fokker-Planck equation · deep splitting · Bayesian filtering · convergence rate · Feynman-Kac representation · Hörmander condition · numerical approximation

The pith

A deep splitting scheme approximates the nonlinear filtering density by solving the Fokker-Planck equation and converges under the parabolic Hörmander condition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a numerical method for nonlinear filtering that approximates the evolution of the filtering density between observations by solving the Fokker-Planck equation with a deep splitting technique. At each measurement time, it applies an exact Bayesian update using the new observation. The method is designed to work online after an initial training phase and uses sampling to address high-dimensional problems. Theoretical convergence is established when the diffusion satisfies the parabolic Hörmander condition, with supporting numerical evidence in a ten-dimensional example.
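The exact update step can be illustrated on a one-dimensional grid. The following is a minimal sketch, assuming a gridded predicted density and a Gaussian observation model y = x + noise; the function name `bayes_update`, the grid, and the noise level are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def bayes_update(prior, likelihood, dx):
    """Pointwise Bayes update on a uniform grid: posterior ∝ likelihood × prior."""
    unnormalized = likelihood * prior
    return unnormalized / (unnormalized.sum() * dx)  # Riemann-sum normalization

# Hypothetical setup: predicted density N(0, 1), observation y = x + N(0, 1) noise.
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
prior = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
y = 1.0
likelihood = np.exp(-0.5 * (y - x) ** 2)  # constant factors cancel on normalization

posterior = bayes_update(prior, likelihood, dx)
posterior_mean = (x * posterior).sum() * dx
posterior_var = ((x - posterior_mean) ** 2 * posterior).sum() * dx
```

In this conjugate Gaussian case the posterior is N(0.5, 0.5), which the grid computation reproduces up to quadrature error.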

Core claim

The central claim is that the proposed prediction-update algorithm, where the prediction step employs a deep splitting scheme based on the Feynman-Kac representation to approximate the Fokker-Planck equation and the update step uses Bayes' formula, converges to the true nonlinear filtering density at a rate determined by the approximation error of the deep splitting method under the parabolic Hörmander condition.
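For orientation, the two steps can be written out in generic notation. The symbols below (drift \mu, diffusion coefficient \sigma, observation likelihood L, observation times t_k, observations y_k) are assumptions chosen for illustration; the paper's own notation and precise setting may differ.

```latex
% Signal: dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t, observed at times t_1 < t_2 < \dots
\begin{align*}
  \text{Prediction, } t \in (t_{k-1}, t_k):\quad
    \partial_t p(t,x) &= -\sum_{i=1}^{d} \partial_{x_i}\bigl(\mu_i(x)\,p(t,x)\bigr)
      + \frac{1}{2}\sum_{i,j=1}^{d} \partial_{x_i}\partial_{x_j}
        \bigl((\sigma\sigma^{\top})_{ij}(x)\,p(t,x)\bigr), \\
  \text{Update at } t_k:\quad
    p(t_k,x) &= \frac{L(y_k \mid x)\,p(t_k^{-},x)}
                     {\int_{\mathbb{R}^d} L(y_k \mid z)\,p(t_k^{-},z)\,dz}.
\end{align*}
```

Here p(t_k^-, x) denotes the predicted density just before the k-th observation is incorporated.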

What carries the argument

The deep splitting scheme for approximating solutions to the Fokker-Planck equation via a sampling-based Feynman-Kac approach, which enables the prediction step in the filtering algorithm.
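To make the Feynman-Kac idea concrete, consider one dimension with constant diffusion \sigma. The Fokker-Planck equation can then be rewritten as \partial_t p = (\sigma^2/2) p'' - \mu p' - \mu' p, which gives p(t, x) = E[p_0(Z_t) exp(-\int_0^t \mu'(Z_s) ds)] with dZ = -\mu(Z) ds + \sigma dW and Z_0 = x. The sketch below evaluates this expectation by plain Monte Carlo with Euler-Maruyama steps. It is not the paper's deep splitting scheme (there is no neural network and no splitting in time), and the function names and the Ornstein-Uhlenbeck test case are hypothetical.

```python
import numpy as np

def fp_density_feynman_kac(x, t, p0, mu, dmu, sigma,
                           n_paths=100_000, n_steps=100, seed=0):
    """Monte Carlo estimate of the Fokker-Planck solution p(t, x) in 1D.

    Uses p(t, x) = E[p0(Z_t) * exp(-int_0^t mu'(Z_s) ds)], where
    dZ = -mu(Z) ds + sigma dW and Z_0 = x (constant sigma assumed).
    """
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    z = np.full(n_paths, float(x))
    log_weight = np.zeros(n_paths)
    for _ in range(n_steps):
        log_weight -= dmu(z) * dt  # left-point rule for the exponent
        # Euler-Maruyama step for the auxiliary process Z
        z = z - mu(z) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return float(np.mean(np.exp(log_weight) * p0(z)))

def gaussian_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# Hypothetical test case: Ornstein-Uhlenbeck signal dX = -theta X dt + sigma dW
# with initial density N(m0, s0^2), whose density at time t is known in closed form.
theta, sigma, m0, s0, t, x = 1.0, 1.0, 1.0, 0.5, 0.5, 0.5
estimate = fp_density_feynman_kac(
    x, t,
    p0=lambda z: gaussian_pdf(z, m0, s0**2),
    mu=lambda z: -theta * z,
    dmu=lambda z: -theta * np.ones_like(z),
    sigma=sigma,
)
exact_var = (s0**2 * np.exp(-2 * theta * t)
             + sigma**2 * (1 - np.exp(-2 * theta * t)) / (2 * theta))
exact = gaussian_pdf(x, m0 * np.exp(-theta * t), exact_var)
```

Each evaluation point x needs its own batch of paths, so the cost grows with the number of samples rather than with a spatial grid; this is the sense in which sampling-based representations avoid meshing the state space.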

If this is right

  • The filtering algorithm operates online for new observations after training.
  • The same convergence rate applies to approximating the Fokker-Planck equation independently.
  • The sampling approach helps mitigate the curse of dimensionality in high-dimensional settings.
  • Numerical robustness is demonstrated in a 10-dimensional nonlinear example.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method may be extended to other stochastic processes where the Fokker-Planck equation governs the density evolution.
  • Similar deep splitting techniques could be applied to related problems in stochastic differential equations without the filtering context.
  • The empirical performance in high dimensions suggests potential for real-world applications in signal processing or data assimilation where analytical solutions are unavailable.

Load-bearing premise

The diffusion process underlying the signal must satisfy the parabolic Hörmander condition for the theoretical convergence rate to hold.
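For readers unfamiliar with the condition, one standard time-homogeneous formulation is recalled below. It assumes the signal is written in Stratonovich form with smooth vector fields V_0, ..., V_m; this notation is an assumption for illustration, and the paper may work with a different (for example time-dependent) variant.

```latex
% Signal in Stratonovich form: dX_t = V_0(X_t)\,dt + \sum_{i=1}^{m} V_i(X_t)\circ dW_t^i
\begin{align*}
  \mathcal{V}_0 &= \{V_i : 1 \le i \le m\}, \qquad
  \mathcal{V}_{k+1} = \mathcal{V}_k \cup
    \bigl\{[U, V_j] : U \in \mathcal{V}_k,\ 0 \le j \le m\bigr\}, \\
  &\operatorname{span}\Bigl\{V(x) : V \in \bigcup_{k \ge 0} \mathcal{V}_k\Bigr\} = \mathbb{R}^d
  \quad \text{for every } x \in \mathbb{R}^d,
\end{align*}
% where [U, V] denotes the Lie bracket of the vector fields U and V.
```

The drift field V_0 enters only through Lie brackets. Uniformly elliptic diffusions satisfy the condition already at level k = 0, since the diffusion fields alone span the space.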

What would settle it

Numerical experiments on a diffusion that violates the parabolic Hörmander condition, checking whether the observed convergence rate matches the theoretical prediction or degrades.
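Such a check could be carried out by fitting the empirical order from errors measured at several discretization levels. The following is a minimal sketch, assuming the errors behave like C·N^(-q); the function name and the synthetic error values are hypothetical and do not come from the paper.

```python
import numpy as np

def empirical_order(n_values, errors):
    """Least-squares slope of log(error) against log(N), returned as a positive order."""
    slope, _ = np.polyfit(np.log(n_values), np.log(errors), 1)
    return -slope

# Synthetic errors decaying like N^(-1/2), for illustration only.
n_values = np.array([1, 2, 4, 8, 16], dtype=float)
errors = 0.8 * n_values ** -0.5
order = empirical_order(n_values, errors)
```

A fitted order clearly below the theoretical one, on a diffusion that violates the condition, would indicate that the rate degrades.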

Figures

Figures reproduced from arXiv: 2409.14585 by Adam Andersson, Filip Rydin, Kasper Bågmark, Stig Larsson.

Figure 1. The figure illustrates the L^2(Ω; L^∞(R^d; R))-error over time for five different discretizations, averaged over 10 instances. To the left we see the error for the drifted Brownian motion example and to the right we see the error for the example with the bistable process.
Figure 2. The figure presents the convergence of the numerical scheme for 10 individual instances in red, their average in blue, and, in black, reference lines of order 1 and 1/2 respectively. To the left we have the errors corresponding to the drifted Brownian motion example and to the right the example with the bistable process. For transparency, we also remark that if we let M and L stay const…

Original abstract

A numerical scheme for approximating the nonlinear filtering density is introduced and its convergence rate is established, theoretically under a parabolic Hörmander condition, and empirically in numerical examples. In a prediction step, between the noisy and partial measurements at discrete times, the scheme approximates the Fokker–Planck equation with a deep splitting scheme, followed by an exact update through Bayes' formula. This results in a classical prediction-update filtering algorithm that operates online for new observation sequences post-training. The algorithm employs a sampling-based Feynman–Kac approach, designed to mitigate the curse of dimensionality. As a corollary we obtain the convergence rate for the approximation of the Fokker–Planck equation alone, disconnected from the filtering problem. The convergence analysis is complemented by a nonlinear 10-dimensional numerical example demonstrating the robustness of the method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces a numerical scheme for the Bayesian filtering problem that approximates the nonlinear filtering density via a deep splitting scheme for the Fokker-Planck prediction step between discrete noisy measurements, followed by an exact Bayes update. It establishes a theoretical convergence rate under the parabolic Hörmander condition on the underlying diffusion, supplies a corollary for the standalone Fokker-Planck equation, employs a sampling-based Feynman-Kac representation to mitigate the curse of dimensionality, and demonstrates empirical performance in a 10-dimensional nonlinear example. The resulting algorithm is online after training.

Significance. If the convergence analysis is rigorous, the work supplies a theoretically grounded, high-dimensional method for nonlinear filtering that combines PDE approximation with exact updates and provides both a filtering result and an independent Fokker-Planck corollary; this is a meaningful contribution to numerical analysis of stochastic filtering problems where dimensionality is a central obstacle.

major comments (1)
  1. The central claim asserts a convergence rate under the parabolic Hörmander condition, yet the abstract and available description do not state the explicit rate (e.g., dependence on time-step size, network width, or sampling error); without this quantitative statement the strength of the result cannot be assessed and the claim remains load-bearing but underspecified.
minor comments (2)
  1. The abstract refers to 'a nonlinear 10-dimensional numerical example' without naming the SDE or observation model; a one-sentence description of the test problem would aid reproducibility and allow readers to judge the relevance of the Hörmander condition.
  2. The distinction between the filtering algorithm and the standalone Fokker-Planck corollary should be emphasized with a dedicated statement or remark early in the introduction to clarify the scope of each result.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment and the recommendation of minor revision. We address the single major comment below.

Point-by-point responses
  1. Referee: The central claim asserts a convergence rate under the parabolic Hörmander condition, yet the abstract and available description do not state the explicit rate (e.g., dependence on time-step size, network width, or sampling error); without this quantitative statement the strength of the result cannot be assessed and the claim remains load-bearing but underspecified.

    Authors: We agree that the abstract would benefit from an explicit quantitative statement of the rate. The body of the manuscript (Theorem 3.4 and the subsequent filtering result) establishes the rate under the parabolic Hörmander condition, with the leading term controlled by the time-step size together with network approximation and Monte-Carlo sampling errors. We will revise the abstract to include this dependence.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity

Full rationale

The derivation establishes a convergence rate for the deep splitting scheme applied to the Fokker-Planck prediction step (with exact Bayes update) under the standard parabolic Hörmander condition on the diffusion; this is an external hypothesis, not derived from or equivalent to the scheme itself. The corollary for the standalone Fokker-Planck problem is explicitly independent of the filtering application. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the stated claims, and the numerical example provides separate empirical support. The argument is self-contained against the stated assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the parabolic Hörmander condition for the diffusion process and on standard assumptions underlying deep neural network approximation of PDEs; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • Domain assumption: parabolic Hörmander condition on the underlying stochastic process.
    Invoked to obtain the theoretical convergence rate of the deep splitting scheme for the Fokker-Planck equation.

pith-pipeline@v0.9.0 · 5695 in / 1100 out tokens · 25654 ms · 2026-05-23T20:58:49.002437+00:00 · methodology

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Nonlinear filtering based on density approximation and deep BSDE prediction

    math.NA 2025-08 conditional novelty 6.0

    A deep BSDE neural network method approximates unnormalized filtering densities for nonlinear Bayesian filtering, trained offline and applied online, with a hybrid a priori-a posteriori error bound proved under the pa...

  2. High-dimensional Bayesian filtering through deep density approximation

    math.NA 2025-11 unverdicted novelty 5.0

    The logarithmic deep backward SDE filter succeeds in a 100-dimensional Lorenz-96 model where particle and ensemble Kalman filters fail, while cutting inference time by two to five orders of magnitude.

Reference graph

Works this paper leans on

53 extracted references · 53 canonical work pages · cited by 2 Pith papers · 1 internal anchor

  1. [1]

    K. Andersson, A. Andersson, and C. W. Oosterlee. Convergence of a robust deep FBSDE method for stochastic control. SIAM J. Sci. Comput., 45:A226–A255, 2023

  2. [2]

    A. Apte, C. K. R. T. Jones, A. M. Stuart, and J. Voss. Data assimilation: Mathematical and statistical perspectives. Int. J. Numer. Methods Fluids, 56:1033–1046, 2008

  3. [3]

    K. Bågmark, A. Andersson, and S. Larsson. An energy-based deep splitting method for the nonlinear filtering problem. Partial Differ. Equ. Appl., 4, 2023

  4. [4]

    C. Beck, S. Becker, P. Cheridito, A. Jentzen, and A. Neufeld. Deep learning based numerical approximation algorithms for stochastic partial differential equations and high-dimensional nonlinear filtering problems. arXiv:2012.01194, 2020

  5. [5]

    C. Beck, S. Becker, P. Cheridito, A. Jentzen, and A. Neufeld. Deep splitting method for parabolic PDEs. SIAM J. Sci. Comput., 43:A3135–A3154, 2021

  6. [6]

    C. Beck, S. Becker, P. Grohs, N. Jaafari, and A. Jentzen. Solving the Kolmogorov PDE by means of deep learning. arXiv:1806.00421v2, 2021

  7. [7]

    S. S. Blackman and R. Popoli. Design and Analysis of Modern Tracking Systems. Artech House Publishers, 1999

  8. [8]

    S. C. Brenner and L. R. Scott. The Mathematical Theory of Finite Element Methods, volume 15 of Texts in Applied Mathematics. Springer, New York, third edition, 2008

  9. [9]

    F. Cassola and M. Burlando. Wind speed and wind energy forecast through Kalman filtering of numerical weather prediction model output. Appl. Energy, 99:154–166, 2012

  10. [10]

    P. Cattiaux and L. Mesnager. Hypoelliptic non-homogeneous diffusions. Probab. Theory Related Fields, 123:453–483, 2002

  11. [11]

    S. Challa and Y. Bar-Shalom. Nonlinear filter design using Fokker-Planck-Kolmogorov probability density evolutions. IEEE Trans. Aerosp. Electron. Syst., 36:309–315, 2000

  12. [12]

    A. Corenflos and A. Finke. Particle-MALA and Particle-mGRAD: Gradient-based MCMC methods for high-dimensional state-space models. arXiv:2401.14868, 2024

  13. [13]

    A. Corenflos, Z. Zhao, S. Särkkä, J. Sjölund, and T. B. Schön. Conditioning diffusion models by explicit forward-backward bridging. arXiv:2405.13794, 2024

  14. [14]

    D. Crisan, A. Lobbe, and S. Ortiz-Latorre. An application of the splitting-up method for the computation of a neural network representation for the solution for the filtering equations. Stoch. Partial Differ. Equ.: Anal. Comput., 10:1050–1081, 2022

  15. [15]

    N. Cui, L. Hong, and J. R. Layne. A comparison of nonlinear filtering approaches with an application to ground target tracking. Signal Processing, 85:1469–1492, 2005

  16. [16]

    G. Da Prato. Introduction to Stochastic Analysis and Malliavin Calculus, volume 13 of Lecture Notes. Scuola Normale Superiore di Pisa (New Series). Edizioni della Normale, Pisa, third edition, 2014

  17. [17]

    P. Date and K. Ponomareva. Linear and non-linear filtering in mathematical finance: a review. IMA J. Manag. Math., 22:195–211, 2011

  18. [18]

    B. Demissie, M. A. Khan, and F. Govaers. Nonlinear filter design using Fokker-Planck propagator in Kronecker tensor format. In 2016 19th International Conference on Information Fusion (FUSION), pages 1–8. IEEE, 2016

  19. [19]

    L. Duc, T. Kuroda, K. Saito, and T. Fujita. Ensemble Kalman filter data assimilation and storm surge experiments of tropical cyclone Nargis. Tellus A, 67:25941, 2015

  20. [20]

    W. E, J. Han, and A. Jentzen. Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Commun. Math. Stat., 5:349–380, Nov. 2017

  21. [21]

    W. E and B. Yu. The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems. Commun. Math. Stat., 1:1–12, 2018

  22. [22]

    A. Finke and A. H. Thiery. Conditional sequential Monte Carlo in high dimensions. Ann. Statist., 51:437–463, 2023

  23. [23]

    R. Frey and V. Köck. Convergence analysis of the deep splitting scheme: the case of partial integro-differential equations and the associated FBSDEs with jumps. arXiv:2206.01597, 2022

  24. [24]

    G. Galanis, P. Louka, P. Katsafados, I. Pytharoulis, and G. Kallos. Applications of Kalman filters based on non-linear functions to numerical weather predictions. Ann. Geophys., 24:1–10, 2006

  25. [25]

    E. Gobet. Monte-Carlo Methods and Stochastic Processes. CRC Press, Boca Raton, FL, 2016

  26. [26]

    I. R. Goodman, R. P. S. Mahler, and H. T. Nguyen. Mathematics of Data Fusion, volume 37 of Theory and Decision Library. Series B: Mathematical and Statistical Methods. Kluwer Academic Publishers Group, Dordrecht, 1997

  27. [27]

    M. Hairer. On Malliavin’s proof of Hörmander’s theorem. B. Sci. Math., 135:650–666, 2011

  28. [28]

    M. Hairer, M. Hutzenthaler, and A. Jentzen. Loss of regularity for Kolmogorov equations. Ann. Probab., 43:468–527, 2015

  29. [29]

    J. Han, W. Hu, J. Long, and Y. Zhao. Deep Picard iteration for high-dimensional nonlinear PDEs. arXiv:2409.08526, 2024

  30. [30]

    Z. Hu, Z. Zhang, G. E. Karniadakis, and K. Kawaguchi. Score-based physics-informed neural networks for high-dimensional Fokker–Planck equations. arXiv:2402.07465, 2024

  31. [31]

    R. E. Kalman and R. S. Bucy. New results in linear filtering and prediction theory. J. Basic Eng., 83:95–108, 1961

  32. [32]

    F. C. Klebaner. Introduction to Stochastic Calculus with Applications. Imperial College Press, London, third edition, 2012

  33. [33]

    A. Klenke. Probability Theory. Universitext. Springer, London, second edition, 2014

  34. [34]

    P. E. Kloeden and E. Platen. Numerical Solution of Stochastic Differential Equations, volume 23 of Applications of Mathematics (New York). Springer-Verlag, Berlin, 1992

  35. [35]

    M. Kunze. An introduction to Malliavin calculus. Lecture notes, 2013. https://www.uni-ulm.de/fileadmin/website_uni_ulm/mawi.inst.020/kunze/malliavin/Malliavin_skript.pdf

  36. [36]

    H. J. Kushner. On the differential equations satisfied by conditional probability densities of Markov processes, with applications. J. Soc. Industrial Appl. Math., Series A: Control, 2:106–119, 1964

  37. [37]

    A. Lobbe. Deep learning for the Benes filter. Stochastic Transport in Upper Ocean Dynamics, 10:195–210, 2023

  38. [38]

    L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat. Mach. Intell., 3:218–229, 2021

  39. [39]

    E. Luk, E. Bach, R. Baptista, and A. Stuart. Learning optimal filters using variational inference. arXiv:2406.18066, 2024

  40. [40]

    C. A. Naesseth, F. Lindsten, and T. B. Schön. High-dimensional filtering using nested sequential Monte Carlo. IEEE Trans. Signal Process., 67:4177–4188, 2019

  41. [41]

    D. Nualart. The Malliavin Calculus and Related Topics. Probability and its Applications (New York). Springer-Verlag, Berlin, second edition, 2006

  42. [42]

    J. Quinn. A high-dimensional particle filter algorithm. arXiv:1901.10543, 2019

  43. [43]

    M. Raissi, P. Perdikaris, and G. E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys., 378:686–707, 2019

  44. [44]

    P. Rebeschini and R. van Handel. Can local particle filters beat the curse of dimensionality? Ann. Appl. Probab., 25:2809–2866, 2015

  45. [45]

    W. Rutzler. Nonlinear and adaptive parameter estimation methods for tubular reactors. Ind. Eng. Chem. Res., 26:325–333, 1987

  46. [46]

    S. Särkkä and L. Svensson. Bayesian Filtering and Smoothing, volume 17 of Institute of Mathematical Statistics Textbooks. Cambridge University Press, Cambridge, second edition, 2023

  47. [47]

    J. W. Siegel. Optimal approximation rates for deep ReLU neural networks on Sobolev and Besov spaces. J. Mach. Learn. Res., 24:1–52, 2023

  48. [48]

    C. Snyder. Particle filters, the “optimal” proposal and high-dimensional systems. In Proceedings of the ECMWF Seminar on Data Assimilation for atmosphere and ocean , pages 1–10, 2011

  49. [49]

    C. Snyder, T. Bengtsson, and M. Morzfeld. Performance bounds for particle filters using the optimal proposal. Mon. Weather Rev., 143:4750–4761, 2015

  50. [50]

    F. van der Meulen and M. Schauer. Automatic backward filtering forward guiding for Markov processes and graphical models. arXiv:2010.03509, 2020

  51. [51]

    M. Zakai. On the optimal filtering of diffusion processes. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 11:230–243, 1969

  52. [52]

    Y. Zeng and S. Wu, editors. State-Space Models. Statistics and Econometrics for Finance. Springer, New York, 2013

  53. [53]

    Z. Zhao, Z. Luo, J. Sjölund, and T. B. Schön. Conditional sampling within generative diffusion models. arXiv:2409.09650, 2024