On the Forgetting of Particle Filters

Anthony Lee; Joona Karjalainen; Matti Vihola; Sumeetpal S. Singh

arxiv: 2309.08517 · v3 · submitted 2023-09-15 · 🧮 math.PR · stat.CO

On the Forgetting of Particle Filters

Joona Karjalainen , Anthony Lee , Sumeetpal S. Singh , Matti Vihola This is my paper

Pith reviewed 2026-05-24 06:43 UTC · model grok-4.3

classification 🧮 math.PR stat.CO

keywords particle filtersforgettingexponential mixingFeynman-Kac modelsconditional particle filterpropagation of chaos

0 comments

The pith

Under a mixing condition the particle filter forgets its initial state after O(log N) steps.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper treats the collection of N particles in a particle filter as the state of a Markov chain and shows that, when the underlying Feynman-Kac model satisfies a strong mixing assumption, this chain is exponentially ergodic. Consequently the distribution of the particle system converges to its unique stationary distribution at a rate that makes the dependence on the arbitrary initial condition decay to a negligible level after only O(log N) algorithm steps. An explicit example demonstrates that the logarithmic rate cannot be improved in general. The same exponential mixing result is proved for the conditional particle filter, supported by new time-uniform L^p error bounds.

Core claim

When the Feynman-Kac model satisfies a strong mixing assumption, the particle filter viewed as a Markov chain on the space of N-particle configurations is exponentially mixing and therefore forgets its initial configuration after O(log N) resampling-mutation steps; the same conclusion holds for the conditional particle filter.

What carries the argument

the particle filter regarded as a Markov chain whose transition kernel inherits exponential contraction from the strong mixing assumption on the underlying Feynman-Kac model

If this is right

The number of steps needed to erase initialization dependence scales only logarithmically rather than exponentially in the number of particles.
The conditional particle filter likewise forgets its starting state in O(log N) steps once equipped with the new uniform error estimates.
Propagation-of-chaos bounds obtained via the same contraction arguments become time-uniform.
The improved mixing rate supports more efficient coupling constructions between independent particle filters.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The O(log N) forgetting time may allow practitioners to restart or reinitialize particle filters more frequently without paying an exponential penalty in computation.
The contraction technique could be tested on other resampling schemes such as systematic or stratified resampling to see whether the same logarithmic rate holds.
Because the result is sharp, it supplies a concrete benchmark against which future analyses of particle-filter stability can be compared.

Load-bearing premise

The underlying Feynman-Kac model must satisfy a strong mixing condition that produces uniform contraction in total variation or a suitable Wasserstein distance.

What would settle it

Construct a Feynman-Kac model that satisfies the strong mixing assumption yet requires more than c log N steps for the particle system to reach a fixed total-variation distance to stationarity for some constant c independent of N.

Figures

Figures reproduced from arXiv: 2309.08517 by Anthony Lee, Joona Karjalainen, Matti Vihola, Sumeetpal S. Singh.

**Figure 1.** Figure 1: Propagation of chaos total variation distances (13) as a function of q/N in a discrete state-space model example for times k = 4 and k = 20. The lines illustrate the mentioned orders of decay. behaviour for a range of q, and the total variations can be upper and lower bounded between Θ(q/N) curves, but these may be specific to the example. The propagation of chaos can be combined with forgetting, leading t… view at source ↗

read the original abstract

We study the forgetting properties of the particle filter when its state - the collection of particles - is regarded as a Markov chain. Under a strong mixing assumption on the particle filter's underlying Feynman-Kac model, we find that the particle filter is exponentially mixing, and forgets its initial state in $O(\log N )$ 'time', where $N$ is the number of particles and time refers to the number of particle filter algorithm steps, each comprising a selection (or resampling) and mutation (or prediction) operation. We present an example which shows that this rate is optimal. In contrast to our result, available results to-date are extremely conservative, suggesting $O(\alpha^N)$ time steps are needed, for some $\alpha>1$, for the particle filter to forget its initialisation. We also study the conditional particle filter (CPF) and extend our forgetting result to this context. We establish a similar conclusion, namely, CPF is exponentially mixing and forgets its initial state in $O(\log N )$ time. To support this analysis, we establish new time-uniform $L^p$ error estimates for CPF, which can be of independent interest. We also establish new propagation of chaos type results using our proof techniques, discuss implications to couplings of particle filters and an application to processing out-of-sequence measurements.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gets an O(log N) forgetting time for the particle filter under a strong mixing assumption on the Feynman-Kac model, shows the rate is sharp via an example, and extends the result to the conditional particle filter with new L^p bounds.

read the letter

The core result is that the N-particle system, viewed as a Markov chain, contracts geometrically with a rate independent of N, so the total variation distance to stationarity drops below any fixed level after O(log N) steps. They also prove the same for the conditional particle filter and supply fresh time-uniform L^p error bounds that look reusable elsewhere. An explicit example demonstrates that O(log N) cannot be improved in general. These claims sit in direct contrast to the O(α^N) literature bounds mentioned in the abstract. The proofs rely on the standard strong mixing condition that gives a uniform minorization constant on the underlying kernels; the paper then verifies that the composed resampling-plus-mutation kernel inherits an N-uniform Dobrushin coefficient. That step addresses the obvious worry that multinomial weights could degrade the contraction by a factor of N. The optimality example is concrete and rules out faster rates, which strengthens the contribution. The new propagation-of-chaos statements and the out-of-sequence measurement application are presented as corollaries rather than the main point. The assumption remains strong, so the result applies cleanly only when the underlying model satisfies uniform ergodicity; outside that regime the bounds do not hold. No other load-bearing gaps appear in the argument. This work is aimed at researchers who already use or analyze sequential Monte Carlo methods and want sharper mixing-time control. It is technically grounded enough and improves on cited results enough to merit a full referee report rather than a desk rejection.

Referee Report

2 major / 2 minor

Summary. The manuscript studies the forgetting properties of the particle filter by regarding the N-particle configuration as a Markov chain on the product space. Under a strong mixing assumption on the underlying Feynman-Kac model, it claims that the N-particle kernel is geometrically ergodic with rate independent of N, so that the filter forgets its initial state in O(log N) steps; an example establishes optimality of this rate. The analysis is extended to the conditional particle filter, yielding new time-uniform L^p error bounds, propagation-of-chaos results, and applications to couplings and out-of-sequence measurements.

Significance. If the central uniformity claim holds, the work replaces the extremely conservative O(α^N) mixing-time bounds in the literature with the tight and optimal O(log N) rate. The optimality example, the new CPF error estimates, and the propagation-of-chaos arguments are concrete strengths that would be of independent interest to the sequential Monte Carlo community.

major comments (2)

[§3, Theorem 3.1] §3, Theorem 3.1 and the subsequent contraction argument: the O(log N) forgetting time requires that the one-step Dobrushin coefficient r_N of the full N-particle kernel (resampling composed with mutation) satisfies r_N ≤ ρ < 1 with ρ independent of N. The strong-mixing assumption supplies a uniform minorization on the underlying FK kernels, but the manuscript must explicitly verify that the multinomial resampling step does not introduce a 1/N deterioration in the joint contraction coefficient; without this uniform bound the mixing time could become Θ(N).
[§5] §5, the optimality example: while the lower bound construction is welcome, it only shows that some initial configurations require Ω(log N) steps; the manuscript should confirm that the same example does not force a worse (e.g., polynomial) upper bound under the stated mixing assumption.

minor comments (2)

[§2] The definition of the N-particle state space and the precise form of the resampling kernel could be stated more explicitly in §2 to make the subsequent Dobrushin-coefficient calculations easier to follow.
A short remark comparing the obtained O(log N) rate with the existing O(α^N) literature bounds would help readers locate the improvement.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and the constructive comments on the forgetting properties of particle filters. We address each major comment below.

read point-by-point responses

Referee: [§3, Theorem 3.1] §3, Theorem 3.1 and the subsequent contraction argument: the O(log N) forgetting time requires that the one-step Dobrushin coefficient r_N of the full N-particle kernel (resampling composed with mutation) satisfies r_N ≤ ρ < 1 with ρ independent of N. The strong-mixing assumption supplies a uniform minorization on the underlying FK kernels, but the manuscript must explicitly verify that the multinomial resampling step does not introduce a 1/N deterioration in the joint contraction coefficient; without this uniform bound the mixing time could become Θ(N).

Authors: We agree that an explicit verification of the uniform bound on r_N is valuable for clarity. In the proof of Theorem 3.1 the Dobrushin coefficient of the composed kernel is controlled by first applying the contraction of the mutation step (which inherits the N-independent minorization constant from the strong-mixing assumption on the Feynman-Kac kernels) and then noting that multinomial resampling is a convex combination of the particles; as a stochastic matrix it cannot increase the Dobrushin coefficient. Consequently r_N ≤ ρ < 1 holds uniformly in N. We will insert a short remark immediately after the statement of Theorem 3.1 making this preservation property explicit. revision: yes
Referee: [§5] §5, the optimality example: while the lower bound construction is welcome, it only shows that some initial configurations require Ω(log N) steps; the manuscript should confirm that the same example does not force a worse (e.g., polynomial) upper bound under the stated mixing assumption.

Authors: The example in §5 is constructed under exactly the same strong-mixing assumption used for the upper bound. Theorem 3.1 therefore applies directly to the example and guarantees an O(log N) upper bound for every initial configuration, including those used in the lower-bound construction. The example saturates the rate but does not violate the general upper bound; no polynomial (or worse) deterioration is forced. revision: no

Circularity Check

0 steps flagged

No circularity; derivation rests on external mixing assumption

full rationale

The paper's central result (exponential mixing of the N-particle filter with O(log N) forgetting time) is derived from an explicitly stated strong mixing assumption on the underlying Feynman-Kac model. This assumption is external and not shown to reduce to any self-referential definition, fitted parameter renamed as prediction, or self-citation chain within the paper. The optimality example is presented separately and does not close a definitional loop. No load-bearing step equates the claimed contraction rate or mixing time to its own inputs by construction. The analysis therefore remains self-contained against the stated external hypothesis.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the strong mixing assumption for the Feynman-Kac model; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption strong mixing assumption on the particle filter's underlying Feynman-Kac model
Invoked to establish that the particle system Markov chain is exponentially mixing.

pith-pipeline@v0.9.0 · 5770 in / 1233 out tokens · 20172 ms · 2026-05-24T06:43:21.538218+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · 1 internal anchor

[1]

Andrieu, A

C. Andrieu, A. Doucet, and R. Holenstein. Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B Stat. Methodol. , 72(3):269–342, 2010

work page 2010
[2]

Capp´ e, E

O. Capp´ e, E. Moulines, and T. Ryd´ en.Inference in Hidden Markov Models . Springer Series in Statistics. Springer, 2005

work page 2005
[3]

Cardoso, Y

G. Cardoso, Y. Janati El Idrissi, S. Le Corff, E. Moulines, and J. Olsson. State and parameter learning with PARIS particle Gibbs. In Proceedings of the 40th International Conference on Machine Learning , pages 3625–3675, 2023

work page 2023
[4]

C´ erou, P

F. C´ erou, P. Del Moral, and A. Guyader. A nonasymptotic theorem for un- normalized Feynman–Kac particle models. Ann. Inst. Henri Poincar´ e Probab. Stat., 47(3):629–649, 2011

work page 2011
[5]

H. P. Chan and T. L. Lai. A general theory of particle filters in hidden Markov models and some applications. Ann. Statist., 41(6):2877–2904, 2013

work page 2013
[6]

N. Chopin. Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. Ann. Statist., 32(6):2385 – 2411, 2004

work page 2004
[7]

Chopin and O

N. Chopin and O. Papaspiliopoulos. An introduction to sequential Monte Carlo, volume 4 of Springer Series in Statistics . Springer, 2020

work page 2020
[8]

Chopin and S

N. Chopin and S. S. Singh. On particle Gibbs sampling. Bernoulli, 21(3): 1855–1883, 2015. 26 JOONA KARJALAINEN, ANTHONY LEE, SUMEETPAL S. SINGH, AND MATTI VIHOLA

work page 2015
[9]

Dau and N

H.-D. Dau and N. Chopin. On backward smoothing algorithms. The Annals of Statistics, 51(5):2145–2169, 2023

work page 2023
[10]

Del Moral

P. Del Moral. Feynman-Kac formulae: Genealogical and Interacting Particle Systems with Applications , volume 88. Springer, 2004

work page 2004
[11]

Del Moral and A

P. Del Moral and A. Guionnet. Large deviations for interacting particle sys- tems: Applications to non-linear filtering. Stochastic Process. Appl. , 78(1): 69–95, 1998

work page 1998
[12]

Del Moral and A

P. Del Moral and A. Guionnet. Central limit theorem for nonlinear filtering and interacting particle systems. Ann. Appl. Probab., 9(2):275 – 297, 1999

work page 1999
[13]

Del Moral, A

P. Del Moral, A. Doucet, and G. W. Peters. Sharp propagation of chaos estimates for Feynman–Kac particle models. Theory Probab. Appl. , 51(3): 459–485, 2007

work page 2007
[14]

Del Moral, A

P. Del Moral, A. Doucet, and S. S. Singh. A backward particle interpretation of Feynman–Kac formulae. ESAIM Math. Model. Numer. Anal. , 44(5):947–975, 2010

work page 2010
[15]

Del Moral, F

P. Del Moral, F. Patras, and S. Rubenthaler. Convergence of U-statistics for interacting particle systems. J. Theoret. Probab., 24:1002–1027, 2011

work page 2011
[16]

Del Moral, A

P. Del Moral, A. Doucet, and S. S. Singh. Uniform stability of a particle approximation of the optimal filter derivative. SIAM J. Control Optim. , 53 (3):1278–1304, 2015

work page 2015
[17]

Douc and E

R. Douc and E. Moulines. Limit theorems for weighted samples with appli- cations to sequential Monte Carlo methods. Ann. Statist., 36(5):2344 – 2376, 2008

work page 2008
[18]

R. Douc, A. Garivier, E. Moulines, and J. Olsson. Sequential Monte Carlo smoothing for general state space hidden Markov models. Ann. Appl. Probab., 21(6):2109–2145, 2011

work page 2011
[19]

R. Douc, A. Garivier, E. Moulines, and J. Olsson. Sequential Monte Carlo smoothing for general state space hidden Markov models. The Annals of Ap- plied Probability, 21(6):2109–2145, 2011

work page 2011
[20]

R. Douc, E. Moulines, P. Priouret, and P. Soulier. Markov chains. Springer, 2018

work page 2018
[21]

S. J. Godsill, A. Doucet, and M. West. Monte Carlo smoothing for nonlinear time series. Journal of the American Statistical Association , 99(465):156–168, 2004

work page 2004
[22]

N. J. Gordon, D. J. Salmond, and A. F. M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings-F, 140 (2):107–113, 1993

work page 1993
[23]

Gregory, C

A. Gregory, C. J. Cotter, and S. Reich. Multilevel ensemble transform particle filtering. SIAM J. Sci. Comput. , 38(3):A1317–A1338, 2016

work page 2016
[24]

Hoeffding and J

W. Hoeffding and J. Wolfowitz. Distinguishability of sets of distributions. The Annals of Mathematical Statistics , 29(3):700–718, 1958

work page 1958
[25]

P. E. Jacob, F. Lindsten, and T. B. Sch¨ on. Smoothing with couplings of conditional particle filters. J. Amer. Statist. Assoc. , 115(530):721–729, 2020

work page 2020
[26]

Jasra and F

A. Jasra and F. Yu. Central limit theorems for coupled particle filters. Adv. in Appl. Probab., 52:942–1001, 2020

work page 2020
[27]

Jasra, K

A. Jasra, K. Kamatani, K. J. Law, and Y. Zhou. Multilevel particle filters. SIAM J. Numer. Anal. , 55(6):3068–3096, 2017

work page 2017
[28]

Kantas, A

N. Kantas, A. Doucet, S. S. Singh, J. Maciejowski, and N. Chopin. On particle methods for parameter estimation in state-space models. Statist. Sci. , 30(3): 328–351, 2015

work page 2015
[29]

Karjalainen, A

J. Karjalainen, A. Lee, S. S. Singh, and M. Vihola. Mixing time of the condi- tional backward sampling particle filter. Preprint arXiv:2312.17572, 2023. ON THE FORGETTING OF PARTICLE FILTERS 27

work page arXiv 2023
[30]

Stability of Conditional Sequential Monte Carlo

B. Kuhlenschmidt and S. S. Singh. Stability of conditional sequential Monte Carlo. Preprint arXiv:1806.06520, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[31]

Lee and N

A. Lee and N. Whiteley. Variance estimation in the particle filter. Biometrika, 105(3):609–625, 2018

work page 2018
[32]

A. Lee, S. S. Singh, and M. Vihola. Coupled conditional backward sampling particle filter. Ann. Statist., 48(5):3066–3089, 2020

work page 2020
[33]

Lindvall

T. Lindvall. Lectures on the coupling method . Wiley, 1992. Reprint: Dover paperback edition, 2002

work page 1992
[34]

Mastrototaro and J

A. Mastrototaro and J. Olsson. Adaptive online variance estimation in particle filters: the ALVar estimator. Statist. Comput., 33(4):1–26, 2023

work page 2023
[35]

Olsson and R

J. Olsson and R. Douc. Numerically stable online estimation of variance in particle filters. Bernoulli, 25(2):1504–1535, 2019

work page 2019
[36]

Orguner and F

U. Orguner and F. Gustafsson. Storage efficient particle filters for the out of sequence measurement problem. In 2008 11th International Conference on Information Fusion, pages 1–8. IEEE, 2008

work page 2008
[37]

Orton and A

M. Orton and A. Marrs. Particle filters for tracking with out-of-sequence measurements. IEEE Transactions on Aerospace and Electronic Systems , 41 (2):693–702, 2005

work page 2005
[38]

M. K. Pitt and N. Shephard. Filtering via simulation: Auxiliary particle filters. J. Amer. Statist. Assoc. , 94(446):590–599, 1999

work page 1999
[39]

Poyiadjis, A

G. Poyiadjis, A. Doucety, and S. S. Singh. Particle approximations of the score and observed information matrix in state space models with application to parameter estimation. Biometrika, 98(1):65–80, 2011

work page 2011
[40]

T. B. Sch¨ on, A. Wills, and B. Ninness. System identification of nonlinear state-space models. Automatica, 47(1):39–49, 2011

work page 2011
[41]

D. Sen, A. H. Thiery, and A. Jasra. On coupling particle filter trajectories. Statist. Comput., 28(2):461–475, 2018

work page 2018
[42]

A. N. Shiryaev. Probability. Springer-Verlag, New York, second edition, 1996. ISBN 0-387-94549-0

work page 1996
[43]

V. Z. Tadi´ c and A. Doucet. Asymptotic properties of recursive particle max- imum likelihood estimation. IEEE Trans. Inform. Theory , 67(3):1825–1848, 2021

work page 2021
[44]

Thorisson

H. Thorisson. Coupling, Stationarity, and Regeneration . Springer, 2000

work page 2000
[45]

M. J. Wainwright. High-dimensional statistics: A non-asymptotic viewpoint , volume 48. Cambridge university press, 2019

work page 2019
[46]

Whiteley

N. Whiteley. Stability properties of some particle filters. Ann. Appl. Probab., 23(6):2500–2537, 2013. 28 JOONA KARJALAINEN, ANTHONY LEE, SUMEETPAL S. SINGH, AND MATTI VIHOLA Appendix A. Technical lemmas Lemma 30. For any probability measures µ and ν on E it holds that ∥µ − ν∥2 TV ≤ 1 − (1 − H2(µ, ν))2. Proof. Denote by H2(µ ∥ ν) := 2H2(µ, ν) the (unnorma...

work page 2013
[47]

Under (A2), we obtain a lower bound for the densities of the ideal filters for large enough time indices, not depending on the initial distribution η0

give analogues of Lemmas 1 and 2 under (A2): sup µ,ν ∥Φn,n+k(µ) − Φn,n+k(ν)∥TV ≤ β⌊k/m⌋ m ,(19) sup n≥0 sup osc(ϕ)≤1 ∥ηN n (ϕ) − ηn(ϕ)∥p ≤ cp,m√ N ,(20) for some βm < 1, cp,m < ∞, which only depend on p, m, and the bounds on G and Mt:t+m. Under (A2), we obtain a lower bound for the densities of the ideal filters for large enough time indices, not dependin...

work page

[1] [1]

Andrieu, A

C. Andrieu, A. Doucet, and R. Holenstein. Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B Stat. Methodol. , 72(3):269–342, 2010

work page 2010

[2] [2]

Capp´ e, E

O. Capp´ e, E. Moulines, and T. Ryd´ en.Inference in Hidden Markov Models . Springer Series in Statistics. Springer, 2005

work page 2005

[3] [3]

Cardoso, Y

G. Cardoso, Y. Janati El Idrissi, S. Le Corff, E. Moulines, and J. Olsson. State and parameter learning with PARIS particle Gibbs. In Proceedings of the 40th International Conference on Machine Learning , pages 3625–3675, 2023

work page 2023

[4] [4]

C´ erou, P

F. C´ erou, P. Del Moral, and A. Guyader. A nonasymptotic theorem for un- normalized Feynman–Kac particle models. Ann. Inst. Henri Poincar´ e Probab. Stat., 47(3):629–649, 2011

work page 2011

[5] [5]

H. P. Chan and T. L. Lai. A general theory of particle filters in hidden Markov models and some applications. Ann. Statist., 41(6):2877–2904, 2013

work page 2013

[6] [6]

N. Chopin. Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. Ann. Statist., 32(6):2385 – 2411, 2004

work page 2004

[7] [7]

Chopin and O

N. Chopin and O. Papaspiliopoulos. An introduction to sequential Monte Carlo, volume 4 of Springer Series in Statistics . Springer, 2020

work page 2020

[8] [8]

Chopin and S

N. Chopin and S. S. Singh. On particle Gibbs sampling. Bernoulli, 21(3): 1855–1883, 2015. 26 JOONA KARJALAINEN, ANTHONY LEE, SUMEETPAL S. SINGH, AND MATTI VIHOLA

work page 2015

[9] [9]

Dau and N

H.-D. Dau and N. Chopin. On backward smoothing algorithms. The Annals of Statistics, 51(5):2145–2169, 2023

work page 2023

[10] [10]

Del Moral

P. Del Moral. Feynman-Kac formulae: Genealogical and Interacting Particle Systems with Applications , volume 88. Springer, 2004

work page 2004

[11] [11]

Del Moral and A

P. Del Moral and A. Guionnet. Large deviations for interacting particle sys- tems: Applications to non-linear filtering. Stochastic Process. Appl. , 78(1): 69–95, 1998

work page 1998

[12] [12]

Del Moral and A

P. Del Moral and A. Guionnet. Central limit theorem for nonlinear filtering and interacting particle systems. Ann. Appl. Probab., 9(2):275 – 297, 1999

work page 1999

[13] [13]

Del Moral, A

P. Del Moral, A. Doucet, and G. W. Peters. Sharp propagation of chaos estimates for Feynman–Kac particle models. Theory Probab. Appl. , 51(3): 459–485, 2007

work page 2007

[14] [14]

Del Moral, A

P. Del Moral, A. Doucet, and S. S. Singh. A backward particle interpretation of Feynman–Kac formulae. ESAIM Math. Model. Numer. Anal. , 44(5):947–975, 2010

work page 2010

[15] [15]

Del Moral, F

P. Del Moral, F. Patras, and S. Rubenthaler. Convergence of U-statistics for interacting particle systems. J. Theoret. Probab., 24:1002–1027, 2011

work page 2011

[16] [16]

Del Moral, A

P. Del Moral, A. Doucet, and S. S. Singh. Uniform stability of a particle approximation of the optimal filter derivative. SIAM J. Control Optim. , 53 (3):1278–1304, 2015

work page 2015

[17] [17]

Douc and E

R. Douc and E. Moulines. Limit theorems for weighted samples with appli- cations to sequential Monte Carlo methods. Ann. Statist., 36(5):2344 – 2376, 2008

work page 2008

[18] [18]

R. Douc, A. Garivier, E. Moulines, and J. Olsson. Sequential Monte Carlo smoothing for general state space hidden Markov models. Ann. Appl. Probab., 21(6):2109–2145, 2011

work page 2011

[19] [19]

R. Douc, A. Garivier, E. Moulines, and J. Olsson. Sequential Monte Carlo smoothing for general state space hidden Markov models. The Annals of Ap- plied Probability, 21(6):2109–2145, 2011

work page 2011

[20] [20]

R. Douc, E. Moulines, P. Priouret, and P. Soulier. Markov chains. Springer, 2018

work page 2018

[21] [21]

S. J. Godsill, A. Doucet, and M. West. Monte Carlo smoothing for nonlinear time series. Journal of the American Statistical Association , 99(465):156–168, 2004

work page 2004

[22] [22]

N. J. Gordon, D. J. Salmond, and A. F. M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings-F, 140 (2):107–113, 1993

work page 1993

[23] [23]

Gregory, C

A. Gregory, C. J. Cotter, and S. Reich. Multilevel ensemble transform particle filtering. SIAM J. Sci. Comput. , 38(3):A1317–A1338, 2016

work page 2016

[24] [24]

Hoeffding and J

W. Hoeffding and J. Wolfowitz. Distinguishability of sets of distributions. The Annals of Mathematical Statistics , 29(3):700–718, 1958

work page 1958

[25] [25]

P. E. Jacob, F. Lindsten, and T. B. Sch¨ on. Smoothing with couplings of conditional particle filters. J. Amer. Statist. Assoc. , 115(530):721–729, 2020

work page 2020

[26] [26]

Jasra and F

A. Jasra and F. Yu. Central limit theorems for coupled particle filters. Adv. in Appl. Probab., 52:942–1001, 2020

work page 2020

[27] [27]

Jasra, K

A. Jasra, K. Kamatani, K. J. Law, and Y. Zhou. Multilevel particle filters. SIAM J. Numer. Anal. , 55(6):3068–3096, 2017

work page 2017

[28] [28]

Kantas, A

N. Kantas, A. Doucet, S. S. Singh, J. Maciejowski, and N. Chopin. On particle methods for parameter estimation in state-space models. Statist. Sci. , 30(3): 328–351, 2015

work page 2015

[29] [29]

Karjalainen, A

J. Karjalainen, A. Lee, S. S. Singh, and M. Vihola. Mixing time of the condi- tional backward sampling particle filter. Preprint arXiv:2312.17572, 2023. ON THE FORGETTING OF PARTICLE FILTERS 27

work page arXiv 2023

[30] [30]

Stability of Conditional Sequential Monte Carlo

B. Kuhlenschmidt and S. S. Singh. Stability of conditional sequential Monte Carlo. Preprint arXiv:1806.06520, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[31] [31]

Lee and N

A. Lee and N. Whiteley. Variance estimation in the particle filter. Biometrika, 105(3):609–625, 2018

work page 2018

[32] [32]

A. Lee, S. S. Singh, and M. Vihola. Coupled conditional backward sampling particle filter. Ann. Statist., 48(5):3066–3089, 2020

work page 2020

[33] [33]

Lindvall

T. Lindvall. Lectures on the coupling method . Wiley, 1992. Reprint: Dover paperback edition, 2002

work page 1992

[34] [34]

Mastrototaro and J

A. Mastrototaro and J. Olsson. Adaptive online variance estimation in particle filters: the ALVar estimator. Statist. Comput., 33(4):1–26, 2023

work page 2023

[35] [35]

Olsson and R

J. Olsson and R. Douc. Numerically stable online estimation of variance in particle filters. Bernoulli, 25(2):1504–1535, 2019

work page 2019

[36] [36]

Orguner and F

U. Orguner and F. Gustafsson. Storage efficient particle filters for the out of sequence measurement problem. In 2008 11th International Conference on Information Fusion, pages 1–8. IEEE, 2008

work page 2008

[37] [37]

Orton and A

M. Orton and A. Marrs. Particle filters for tracking with out-of-sequence measurements. IEEE Transactions on Aerospace and Electronic Systems , 41 (2):693–702, 2005

work page 2005

[38] [38]

M. K. Pitt and N. Shephard. Filtering via simulation: Auxiliary particle filters. J. Amer. Statist. Assoc. , 94(446):590–599, 1999

work page 1999

[39] [39]

Poyiadjis, A

G. Poyiadjis, A. Doucety, and S. S. Singh. Particle approximations of the score and observed information matrix in state space models with application to parameter estimation. Biometrika, 98(1):65–80, 2011

work page 2011

[40] [40]

T. B. Sch¨ on, A. Wills, and B. Ninness. System identification of nonlinear state-space models. Automatica, 47(1):39–49, 2011

work page 2011

[41] [41]

D. Sen, A. H. Thiery, and A. Jasra. On coupling particle filter trajectories. Statist. Comput., 28(2):461–475, 2018

work page 2018

[42] [42]

A. N. Shiryaev. Probability. Springer-Verlag, New York, second edition, 1996. ISBN 0-387-94549-0

work page 1996

[43] [43]

V. Z. Tadi´ c and A. Doucet. Asymptotic properties of recursive particle max- imum likelihood estimation. IEEE Trans. Inform. Theory , 67(3):1825–1848, 2021

work page 2021

[44] [44]

Thorisson

H. Thorisson. Coupling, Stationarity, and Regeneration . Springer, 2000

work page 2000

[45] [45]

M. J. Wainwright. High-dimensional statistics: A non-asymptotic viewpoint , volume 48. Cambridge university press, 2019

work page 2019

[46] [46]

Whiteley

N. Whiteley. Stability properties of some particle filters. Ann. Appl. Probab., 23(6):2500–2537, 2013. 28 JOONA KARJALAINEN, ANTHONY LEE, SUMEETPAL S. SINGH, AND MATTI VIHOLA Appendix A. Technical lemmas Lemma 30. For any probability measures µ and ν on E it holds that ∥µ − ν∥2 TV ≤ 1 − (1 − H2(µ, ν))2. Proof. Denote by H2(µ ∥ ν) := 2H2(µ, ν) the (unnorma...

work page 2013

[47] [47]

Under (A2), we obtain a lower bound for the densities of the ideal filters for large enough time indices, not depending on the initial distribution η0

give analogues of Lemmas 1 and 2 under (A2): sup µ,ν ∥Φn,n+k(µ) − Φn,n+k(ν)∥TV ≤ β⌊k/m⌋ m ,(19) sup n≥0 sup osc(ϕ)≤1 ∥ηN n (ϕ) − ηn(ϕ)∥p ≤ cp,m√ N ,(20) for some βm < 1, cp,m < ∞, which only depend on p, m, and the bounds on G and Mt:t+m. Under (A2), we obtain a lower bound for the densities of the ideal filters for large enough time indices, not dependin...

work page