Observed Fisher Information in hidden Markov models - Application to a noisy Gaussian random walk

Alexandra Lefebvre; Grégory Nuel (LPSM (UMR 8001))

arxiv: 2606.02118 · v1 · pith:CPGD6OYCnew · submitted 2026-06-01 · 📊 stat.CO

Pith reviewed 2026-06-28 11:51 UTC · model grok-4.3

classification 📊 stat.CO

keywords observed Fisher information · hidden Markov models · Gaussian random walk · Oakes identity · forward-backward algorithm · score function · Newton-Raphson · confidence intervals

The pith

Closed-form expressions yield the exact score and observed Fisher information for a noisy Gaussian random walk.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper derives analytical closed-form expressions for the exact computation of the score and the observed Fisher information matrix in a hidden Markov model where a Gaussian random walk is observed through Gaussian noise. The derivation uses Oakes' identity together with the forward-backward algorithm to achieve linear time complexity in the sequence length. A sympathetic reader would care because the resulting exact observed information supports direct construction of confidence intervals and Newton-Raphson parameter estimation without numerical differentiation or Monte Carlo approximations. The method is demonstrated on simulated data to produce both point estimates and intervals.

Core claim

We provide analytical, closed-form expressions for the exact computation of the score and the observed Fisher information matrix in a Gaussian random walk observed through Gaussian noise. Our method is based on Oakes' identity and, like the computation of the log-likelihood, runs in time linear in the length of the sequence using the forward-backward (or Baum-Welch) algorithm. We illustrate the method in various simulation studies and provide parameter estimates computed with the Newton-Raphson algorithm, along with confidence intervals.

What carries the argument

Oakes' identity applied to forward-backward recursions to obtain the observed Fisher information from the score in the Gaussian HMM
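
For reference, the identity can be written as follows. This is a sketch in generic EM notation rather than the paper's own; the symbols $\ell$, $Q$, $x$, $y$ and $\theta$ are assumptions introduced here.

```latex
% Assumed notation: y observed, x hidden, \ell(\theta) = \log p(y;\theta),
% Q(\theta' \mid \theta) = E_\theta[\log p(x, y; \theta') \mid y].
\begin{align*}
\nabla_\theta \ell(\theta)
  &= \left.\frac{\partial Q(\theta' \mid \theta)}{\partial \theta'}\right|_{\theta'=\theta}
  && \text{(Fisher's identity, score)} \\
\nabla^2_\theta \ell(\theta)
  &= \left[\frac{\partial^2 Q(\theta' \mid \theta)}{\partial \theta'\,\partial \theta'^{\top}}
     + \frac{\partial^2 Q(\theta' \mid \theta)}{\partial \theta'\,\partial \theta^{\top}}\right]_{\theta'=\theta}
  && \text{(Oakes' identity)}
\end{align*}
```

The observed Fisher information is then $J(\theta) = -\nabla^2_\theta \ell(\theta)$. The second term of Oakes' identity differentiates through the conditioning distribution, which is where the smoothed moments from forward-backward enter.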

If this is right

The score and observed Fisher information matrix can be obtained exactly with the same linear-time forward-backward procedure used for the log-likelihood.
Newton-Raphson maximization of the likelihood becomes practical because the exact observed information supplies the exact negative Hessian of the log-likelihood rather than a numerical approximation of it.
Asymptotic confidence intervals follow directly from the inverse of the computed observed information matrix.
The approach is illustrated on simulated sequences, where it produces parameter estimates and confidence intervals in practice.
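
The second and third consequences correspond to the standard formulas below. This is a sketch in assumed notation ($S$ for the score, $J$ for the observed information, $z_{1-\alpha/2}$ for the standard normal quantile); the paper's own notation may differ.

```latex
% Assumed notation: S(\theta) = \nabla_\theta \ell(\theta) is the score,
% J(\theta) = -\nabla^2_\theta \ell(\theta) the observed information.
\begin{align*}
\theta^{(k+1)} &= \theta^{(k)} + J\big(\theta^{(k)}\big)^{-1} S\big(\theta^{(k)}\big)
  && \text{(Newton-Raphson update)} \\
\hat\theta_i &\pm z_{1-\alpha/2}\,\sqrt{\big[J(\hat\theta)^{-1}\big]_{ii}}
  && \text{(asymptotic } 1-\alpha \text{ interval)}
\end{align*}
```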

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same Oakes-plus-forward-backward strategy might be adapted to other linear-Gaussian state-space models that admit closed-form complete-data moments.
Exact observed information could reduce the variance of uncertainty estimates in downstream tasks such as change-point detection on noisy random-walk trajectories.
One could check whether the closed forms remain tractable when the observation noise variance is allowed to vary across time steps.

Load-bearing premise

Oakes' identity applies directly to this Gaussian hidden Markov model to obtain the observed Fisher information from the score without requiring additional approximations beyond the forward-backward recursions.

What would settle it

Numerical comparison of the analytical observed Fisher information against a finite-difference approximation of the Hessian of the log-likelihood on a fixed short simulated sequence; any systematic discrepancy would falsify the closed-form expressions.
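
A check of this kind could look like the sketch below. It does not implement the paper's closed forms, which are not reproduced on this page. As a stand-in for the analytical side it uses the dense Gaussian marginal of the observations, which costs O(n³) rather than linear time. The parameterization is an assumption: `q` is the random-walk increment variance, `r` the observation noise variance, and the walk starts at a known `X_0 = 0` with no drift. All function names are hypothetical.

```python
import numpy as np

def dense_cov(q, r, n):
    # Assumed model: Cov(Y_s, Y_t) = q * min(s, t) + r * 1{s = t}, with X_0 = 0.
    idx = np.arange(1, n + 1)
    A = np.minimum.outer(idx, idx).astype(float)
    return q * A + r * np.eye(n), A

def loglik(theta, y):
    # Log-density of y ~ N(0, q * A + r * I).
    q, r = theta
    n = len(y)
    Sigma, _ = dense_cov(q, r, n)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + y @ np.linalg.solve(Sigma, y))

def observed_information(theta, y):
    # Analytic negative Hessian in (q, r); Sigma is linear in both parameters.
    q, r = theta
    n = len(y)
    Sigma, A = dense_cov(q, r, n)
    Sinv = np.linalg.inv(Sigma)
    dS = [A, np.eye(n)]  # dSigma/dq and dSigma/dr
    u = Sinv @ y
    J = np.empty((2, 2))
    for i in range(2):
        for j in range(2):
            trace_term = 0.5 * np.trace(Sinv @ dS[j] @ Sinv @ dS[i])
            quad_term = u @ dS[i] @ Sinv @ dS[j] @ u
            J[i, j] = quad_term - trace_term
    return J

def fd_hessian(f, theta, h=1e-4):
    # Central finite-difference approximation of the Hessian of f at theta.
    theta = np.asarray(theta, dtype=float)
    p = len(theta)
    H = np.empty((p, p))
    for i in range(p):
        for j in range(p):
            ei = np.zeros(p); ei[i] = h
            ej = np.zeros(p); ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4.0 * h * h)
    return H

# Fixed short simulated sequence, as the check above suggests.
rng = np.random.default_rng(0)
n, q_true, r_true = 40, 1.0, 0.5
x = np.cumsum(rng.normal(0.0, np.sqrt(q_true), n))
y = x + rng.normal(0.0, np.sqrt(r_true), n)

theta = np.array([q_true, r_true])
J = observed_information(theta, y)
H_fd = fd_hessian(lambda th: loglik(th, y), theta)
print("max abs difference:", np.max(np.abs(J + H_fd)))
```

The same comparison would apply to the paper's expressions by swapping `observed_information` for an implementation of them; agreement only up to finite-difference error is the expected outcome.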

Figures

Figures reproduced from arXiv: 2606.02118 by Alexandra Lefebvre, Grégory Nuel (LPSM (UMR 8001)).

Original abstract

In this work we provide analytical and closed-form expressions for the exact computation of the score and the observed Fisher information matrix in a Gaussian random walk observed through Gaussian noise. Our method is based on the Oakes' identity and, as for the computation of the log-likelihood, its complexity in time is linear in the length of the sequence with the forward-backward (or Baum-Welch) algorithm. We illustrate the method over various simulation studies and provide parameter estimates computed with the Newton-Raphson algorithm along with confidence intervals.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Closed-form expressions for the score and observed Fisher information in this noisy Gaussian random walk HMM, obtained exactly via Oakes' identity inside forward-backward.

The paper supplies closed-form expressions for the score and the observed Fisher information in a Gaussian random walk observed through Gaussian noise. These come from Oakes' identity applied inside the forward-backward algorithm, keeping everything exact and linear in the length of the sequence.

What is new is the explicit derivation for this specific HMM. The model is linear Gaussian, so the conditional expectations and variances required by Oakes' formula are available in closed form from the recursions. This lets the authors run Newton-Raphson with proper standard errors without numerical differentiation. The approach is a direct extension of standard HMM tools rather than a new algorithm.
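
To see which conditional moments are involved, one can write the complete-data log-likelihood under an assumed parameterization. In this sketch the walk starts at a known $x_0 = 0$, has no drift, and has increment variance $q$ and observation noise variance $r$; the paper's own parameterization may differ.

```latex
% Assumed model: x_0 = 0, x_t = x_{t-1} + \varepsilon_t, \varepsilon_t \sim N(0, q),
%                y_t = x_t + \eta_t, \eta_t \sim N(0, r), for t = 1, \dots, n.
\begin{align*}
\log p(x, y; q, r)
  &= -\frac{n}{2}\log(2\pi q) - \frac{1}{2q}\sum_{t=1}^{n} (x_t - x_{t-1})^2
     -\frac{n}{2}\log(2\pi r) - \frac{1}{2r}\sum_{t=1}^{n} (y_t - x_t)^2
\end{align*}
```

Taking the conditional expectation given $y$ therefore requires only $E[x_t \mid y]$, $E[x_t^2 \mid y]$ and $E[x_t x_{t-1} \mid y]$, which are Gaussian smoothing moments.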

The abstract mentions simulation studies and parameter estimates with confidence intervals, but gives no quantitative details on accuracy or speed. That makes it hard to judge how much of an improvement this is over existing methods for linear state-space models, where a Kalman filter would produce equivalent results. If the full paper includes verification against known cases or timing comparisons, that would address the gap.
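
The Kalman-filter baseline mentioned here can be sketched in a few lines. The code below computes the log-likelihood of the noisy random walk in linear time and checks it against the dense multivariate Gaussian density. It covers only the likelihood, not the score or the observed information. The model details are assumptions: known start `X_0 = 0`, no drift, increment variance `q` and observation noise variance `r`. The function names are hypothetical.

```python
import numpy as np

def kalman_loglik(y, q, r):
    # Scalar Kalman filter for x_t = x_{t-1} + N(0, q), y_t = x_t + N(0, r), x_0 = 0.
    m, P = 0.0, 0.0  # filtered mean and variance of the hidden state
    ll = 0.0
    for obs in y:
        P_pred = P + q          # predicted state variance
        S = P_pred + r          # innovation variance
        innov = obs - m         # innovation (predicted mean equals m)
        ll += -0.5 * (np.log(2.0 * np.pi * S) + innov**2 / S)
        K = P_pred / S          # Kalman gain
        m = m + K * innov
        P = (1.0 - K) * P_pred
    return ll

def dense_loglik(y, q, r):
    # Reference: y ~ N(0, q * min(s, t) + r * I), evaluated in O(n^3).
    n = len(y)
    idx = np.arange(1, n + 1)
    Sigma = q * np.minimum.outer(idx, idx) + r * np.eye(n)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + y @ np.linalg.solve(Sigma, y))

rng = np.random.default_rng(1)
n, q, r = 50, 1.0, 0.5
y = np.cumsum(rng.normal(0.0, np.sqrt(q), n)) + rng.normal(0.0, np.sqrt(r), n)
print(kalman_loglik(y, q, r), dense_loglik(y, q, r))
```

The two values agree up to floating-point error because the filter factorizes the same joint Gaussian density into one-step-ahead predictive densities.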

This work is aimed at statisticians who fit HMMs to sequential data and need reliable observed information matrices. A reader already comfortable with Oakes' identity and Baum-Welch will pick it up quickly. The central claim holds up on the description given, with no obvious circularity or unstated approximations. It deserves a serious referee because the contribution is technically grounded even if the scope is narrow.

Referee Report

2 major / 2 minor

Summary. The manuscript claims to derive analytical closed-form expressions for the score and observed Fisher information matrix of a Gaussian random walk observed in Gaussian noise. The approach relies on Oakes' identity combined with the forward-backward algorithm to achieve exact computation in linear time; the method is illustrated via simulation studies and used to obtain Newton-Raphson parameter estimates together with confidence intervals.

Significance. If the derivations are correct, the work supplies an exact, non-approximate route to the observed information matrix for this linear-Gaussian state-space model, enabling reliable likelihood-based inference and standard-error computation without numerical differentiation. The linear complexity matches that of the likelihood itself, which is a practical advantage. The stress-test concern regarding direct applicability of Oakes' identity does not materialize: for this model the required conditional moments remain Gaussian and are furnished exactly by the same Kalman-type recursions already employed for the likelihood.

major comments (2)

The central claim of closed-form expressions is load-bearing, yet the manuscript provides no explicit derivation, error analysis, or verification steps showing how the conditional expectations and variances in Oakes' identity are obtained from the forward-backward quantities. This omission prevents assessment of whether the claimed exactness holds without hidden approximations.
Simulation studies are invoked to illustrate the method, but no quantitative results, error metrics, or comparison baselines against numerical differentiation or other information estimators are reported, leaving the practical performance of the closed forms unverified.

minor comments (2)

Notation for the state and observation processes should be introduced with explicit definitions of all parameters before the application of Oakes' identity.
The abstract states linear complexity but does not contrast it with naive finite-difference Hessian evaluation, which needs a number of likelihood evaluations that grows quadratically with the number of parameters; stating this would clarify the computational gain.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and for highlighting areas where the manuscript can be strengthened. We address the two major comments below and will revise the manuscript to incorporate the requested clarifications and quantitative results.

Referee: The central claim of closed-form expressions is load-bearing, yet the manuscript provides no explicit derivation, error analysis, or verification steps showing how the conditional expectations and variances in Oakes' identity are obtained from the forward-backward quantities. This omission prevents assessment of whether the claimed exactness holds without hidden approximations.

Authors: We agree that an explicit derivation is necessary for full assessment. In the revised manuscript we will add a dedicated subsection that starts from the forward-backward recursions, derives the required conditional means and variances under the linear-Gaussian structure, substitutes them into Oakes' identity, and verifies that all quantities remain exact (no hidden approximations or numerical integration). We will also include a short error-analysis paragraph confirming that the resulting expressions for the score and observed information are algebraically closed-form.

revision: yes

Referee: Simulation studies are invoked to illustrate the method, but no quantitative results, error metrics, or comparison baselines against numerical differentiation or other information estimators are reported, leaving the practical performance of the closed forms unverified.

Authors: We accept that the current simulation section lacks the quantitative benchmarks needed to demonstrate practical performance. In the revision we will expand the numerical experiments to report (i) the element-wise absolute and relative differences between the analytical observed information matrix and a high-precision finite-difference reference, (ii) wall-clock timings, and (iii) coverage rates of the resulting confidence intervals, all averaged over repeated Monte Carlo replications and across a range of sequence lengths and signal-to-noise ratios.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation applies external identity to new model

The paper derives closed-form expressions for the score and observed Fisher information by applying Oakes' identity (external prior literature) together with the standard forward-backward algorithm to the specific linear-Gaussian HMM. No step reduces a claimed prediction or result to a fitted parameter or self-citation by construction; the central contribution is the model-specific analytic application rather than a re-derivation of inputs. The method remains self-contained against external benchmarks such as the cited Oakes identity and the well-known Baum-Welch recursions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the applicability of Oakes' identity to this HMM and the existence of closed-form recursions via forward-backward; no free parameters or new entities are introduced in the abstract.

axioms (1)

domain assumption: Oakes' identity can be applied to compute the observed Fisher information matrix exactly from the score in this Gaussian HMM.
The method is explicitly based on Oakes' identity as stated in the abstract.

Reference graph

Works this paper leans on

18 extracted references · 11 canonical work pages

[1]

L. E. Baum, T. Petrie, Statistical Inference for Probabilistic Functions of Finite State Markov Chains, The Annals of Mathematical Statistics 37 (1966) 1554–1563. URL: http://www.jstor.org/stable/2238772. doi: https://doi.org/10.1214/aoms/1177699147

[2]

L. E. Baum, T. Petrie, G. Soules, N. Weiss, A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains, The Annals of Mathematical Statistics 41 (1970) 164–171. URL: http://www.jstor.org/stable/2239727. doi: https://doi.org/10.1214/aoms/1177697196

[3]

A. P. Dempster, N. M. Laird, D. B. Rubin, Maximum Likelihood from Incomplete Data via the EM Algorithm, Journal of the Royal Statistical Society. Series B (Methodological) 39 (1977) 1–38. URL: http://www.jstor.org/stable/2984875. doi: https://doi.org/10.1111/j.2517-6161.1977.tb01600.x

[4]

T. Orchard, M. A. Woodbury, A missing information principle: theory and applications, in: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Theory of Statistics, volume 6, University of California Press, 1972, pp. 697–716. doi: https://doi.org/10.1525/9780520325883-036

[5]

T. A. Louis, Finding the Observed Information Matrix when Using the EM Algorithm, Journal of the Royal Statistical Society. Series B (Methodological) 44 (1982) 226–233. URL: http://www.jstor.org/stable/2345828. doi: https://doi.org/10.1111/j.2517-6161.1982.tb01203.x

[6]

D. Oakes, Direct calculation of the information matrix via the EM algorithm, Journal of the Royal Statistical Society. Series B (Statistical Methodology) 61 (1999) 479–

[7]

URL: http://www.jstor.org/stable/2680653. doi: https://doi.org/10.1111/1467-9868.00188

[8]

T. R. Turner, M. A. Cameron, P. J. Thomson, Hidden Markov chains in generalized linear models, The Canadian Journal of Statistics / La Revue Canadienne de Statistique 26 (1998) 107–125. URL: http://www.jstor.org/stable/3315677. doi: https://doi.org/10.2307/3315677

[9]

B. Delyon, M. Lavielle, E. Moulines, Convergence of a stochastic approximation version of the EM algorithm, The Annals of Statistics 27 (1999) 94–128. URL: http://www.jstor.org/stable/120120. doi: https://doi.org/10.1214/aos/1018031103

[10]

R. P. Chalmers, Numerical approximation of the observed information matrix with Oakes' identity, British Journal of Mathematical and Statistical Psychology 71 (2018) 415–436. URL: https://bpspsychub.onlinelibrary.wiley.com/doi/abs/10.1111/bmsp.12127. doi: https://doi.org/10.1111/bmsp.12127

[11]

O. Cappé, E. Moulines, Recursive computation of the score and observed information matrix in hidden Markov models, in: IEEE/SP 13th Workshop on Statistical Signal Processing, 2005, IEEE, 2005, pp. 703–708. doi: 10.1109/SSP.2005.1628685

[12]

F. Bartolucci, A. Farcomeni, Information matrix for hidden Markov models with covariates, Statistics and Computing 25 (2014) 515–526. URL: https://link.springer.com/article/10.1007/s11222-014-9450-8. doi: 10.1007/s11222-014-9450-8

[13]

T. C. Lystig, J. P. Hughes, Exact computation of the observed information matrix for hidden Markov models, Journal of Computational and Graphical Statistics 11 (2002) 678–689. URL: http://www.jstor.org/stable/1391119. doi: 10.1198/106186002402

[14]

A. Lefebvre, G. Nuel, A sum-product algorithm with polynomials for computing exact derivatives of the likelihood in Bayesian networks, in: V. Kratochvíl, M. Studený (Eds.), Proceedings of the Ninth International Conference on Probabilistic Graphical Models, volume 72 of Proceedings of Machine Learning Research, PMLR, 2018, pp. 201–212. URL: https://p...

[15]

L. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE 77 (1989) 257–286. doi: 10.1109/5.18626

[16]

O. Cappé, E. Moulines, T. Rydén, Inference in hidden Markov models, Springer, 2005. doi: 10.1007/0-387-28982-8

[17]

P. A. Bromiley, Products and Convolutions of Gaussian Probability Density Functions, Internal Report TINA Memo No. 2003-003, University of Manchester, 2003

[18]

R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2024. URL: https://www.R-project.org/
