arxiv: 2605.29208 · v1 · pith:6CCK26GYnew · submitted 2026-05-28 · 💻 cs.MS · cs.LG

libhmm: A Modern C++20 Library for Hidden Markov Models with Correct MLE Emission M-Steps

Pith reviewed 2026-06-29 00:09 UTC · model grok-4.3

classification 💻 cs.MS cs.LG
keywords Hidden Markov Models · Baum-Welch algorithm · Maximum likelihood estimation · Emission distributions · C++ library · Student-t distribution · Newton-Raphson · Viterbi algorithm

The pith

libhmm supplies correct maximum likelihood estimators for sixteen HMM emission distributions instead of method-of-moments approximations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper describes a C++20 library for Hidden Markov Models focused on accurate parameter estimation in the emission M-step of the Baum-Welch algorithm. Existing software commonly substitutes method-of-moments approximations, but libhmm implements exact maximum likelihood estimators for sixteen distributions. These include an ECME algorithm for the location-scale Student-t, Newton-Raphson maximization for Gamma, Beta, Weibull, and Negative Binomial, and the von Mises distribution for circular observations. Forward-backward and Viterbi routines run entirely in log space with SIMD dispatch, and the library is benchmarked against other C, C++, and R packages on five real datasets.
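To make the log-space claim concrete, the following is a minimal C++ sketch of a log-space forward pass for a discrete-emission HMM. It is an illustration under stated assumptions, not libhmm's API: the names `log_sum_exp`, `forward_log_likelihood`, and `Matrix` are hypothetical, and the plain scalar loops leave out the SIMD dispatch the paper describes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch (not a libhmm type).
using Matrix = std::vector<std::vector<double>>;

// Numerically stable log(sum_i exp(v[i])).
// Returns -inf when every term is -inf (an impossible event).
inline double log_sum_exp(const std::vector<double>& v) {
    assert(!v.empty());
    const double m = *std::max_element(v.begin(), v.end());
    if (!std::isfinite(m)) return m;
    double s = 0.0;
    for (double x : v) s += std::exp(x - m);  // every exponent is <= 0
    return m + std::log(s);
}

// Log-space forward pass.
//   log_pi[i]   : log initial probability of state i
//   log_A[i][j] : log transition probability i -> j
//   log_B[j][o] : log probability that state j emits symbol o
// Returns log P(obs | model) without ever leaving log space.
inline double forward_log_likelihood(const std::vector<double>& log_pi,
                                     const Matrix& log_A,
                                     const Matrix& log_B,
                                     const std::vector<std::size_t>& obs) {
    assert(!obs.empty() && !log_pi.empty());
    const std::size_t n = log_pi.size();
    std::vector<double> alpha(n), next(n), terms(n);

    // Initialisation: alpha_0(j) = log pi_j + log b_j(o_0).
    for (std::size_t j = 0; j < n; ++j)
        alpha[j] = log_pi[j] + log_B[j][obs[0]];

    // Recursion: alpha_t(j) = logsumexp_i(alpha_{t-1}(i) + log a_ij) + log b_j(o_t).
    for (std::size_t t = 1; t < obs.size(); ++t) {
        for (std::size_t j = 0; j < n; ++j) {
            for (std::size_t i = 0; i < n; ++i)
                terms[i] = alpha[i] + log_A[i][j];
            next[j] = log_sum_exp(terms) + log_B[j][obs[t]];
        }
        alpha.swap(next);
    }
    return log_sum_exp(alpha);
}
```

Because every intermediate quantity is a log-probability, a sequence long enough for the linear-space likelihood to underflow to zero still yields a finite log-likelihood here.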

Core claim

libhmm implements correct maximum likelihood estimators for sixteen continuous and discrete emission distributions, including an ECME algorithm for the location-scale Student-t distribution, Newton-Raphson maximization for Gamma, Beta, Weibull, and Negative Binomial distributions, and the von Mises distribution for circular data.

What carries the argument

Correct MLE emission M-steps in the Baum-Welch algorithm, realized via ECME for Student-t and Newton-Raphson for Gamma, Beta, Weibull, and Negative Binomial.

If this is right

  • HMM fits using non-Gaussian emissions achieve higher likelihoods than those obtained with method-of-moments approximations.
  • Model selection and Viterbi decoding benefit from the improved emission parameter accuracy.
  • A zero-dependency C++20 implementation becomes available for embedding in production systems.
  • Circular data can be modeled directly through the von Mises emission option.
  • Python users obtain the same estimators through the pylibhmm bindings.
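As an illustration of the circular-data point above, here is a hedged C++ sketch of a weighted von Mises M-step. It uses the standard MLE conditions: the mean direction is atan2 of the weighted sine and cosine sums, and the concentration kappa solves I1(kappa)/I0(kappa) = R, where R is the weighted mean resultant length. The names `bessel_ratio`, `fit_von_mises_mle`, and `VonMisesParams` are hypothetical and not taken from libhmm; the Bessel ratio is computed from the power series, which is only suitable for moderate kappa, and the starting value is a commonly used closed-form approximation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A(kappa) = I1(kappa) / I0(kappa) from the power series of the modified
// Bessel functions. Assumption: kappa is moderate (terms must not overflow).
inline double bessel_ratio(double kappa) {
    const double h = 0.5 * kappa;
    double t0 = 1.0;  // h^(2m) / (m!)^2
    double t1 = h;    // h^(2m+1) / (m! (m+1)!)
    double i0 = t0, i1 = t1;
    for (int m = 1; m < 2000; ++m) {
        t0 *= (h / m) * (h / m);
        t1 *= (h / m) * (h / (m + 1.0));
        i0 += t0;
        i1 += t1;
        if (t0 < 1e-17 * i0) break;  // remaining terms are negligible
    }
    return i1 / i0;
}

struct VonMisesParams { double mu; double kappa; };  // hypothetical type

// Weighted von Mises MLE; w[t] plays the role of a state posterior weight.
inline VonMisesParams fit_von_mises_mle(const std::vector<double>& theta,
                                        const std::vector<double>& w) {
    assert(theta.size() == w.size() && !theta.empty());
    double sw = 0.0, c = 0.0, s = 0.0;
    for (std::size_t t = 0; t < theta.size(); ++t) {
        sw += w[t];
        c += w[t] * std::cos(theta[t]);
        s += w[t] * std::sin(theta[t]);
    }
    assert(sw > 0.0);
    const double r = std::hypot(c, s) / sw;  // mean resultant length
    assert(r > 0.0 && r < 1.0);              // degenerate cases not handled
    const double mu = std::atan2(s, c);      // MLE of the mean direction

    // Newton-Raphson on A(kappa) - r = 0, with A'(kappa) = 1 - A/kappa - A^2.
    double kappa = r * (2.0 - r * r) / (1.0 - r * r);  // approximate start
    for (int it = 0; it < 100; ++it) {
        const double a = bessel_ratio(kappa);
        const double da = 1.0 - a / kappa - a * a;
        double next = kappa - (a - r) / da;
        if (!(next > 0.0)) next = 0.5 * kappa;  // keep kappa positive
        const bool done = std::fabs(next - kappa) <= 1e-12 * kappa;
        kappa = next;
        if (done) break;
    }
    return {mu, kappa};
}
```

The wrap-around case shows why a circular estimator matters: for the two angles 3.0 and -3.0 radians the arithmetic mean is 0, while the estimate above points at pi, where the observations actually cluster.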

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same pattern of replacing approximations with exact MLE steps could be applied to other latent-variable models beyond HMMs.
  • Production systems that currently tolerate MOM bias may see measurable gains in predictive performance once switched to the library.
  • Numerical stability in long sequences may improve because all recursions remain in log space.

Load-bearing premise

The described M-step implementations are the mathematically correct maximum likelihood estimators for those emission distributions.

What would settle it

Showing that the Newton-Raphson procedure for the Gamma emission distribution fails to recover the known maximum-likelihood parameters on a controlled test set of observations.
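A check of this kind can be sketched as follows. The code is a minimal, assumption-laden illustration of a weighted Gamma M-step, not libhmm's implementation: `digamma`, `trigamma`, `fit_gamma_mle`, and `GammaParams` are hypothetical names. With weights w_t and weighted mean m, the MLE satisfies scale = m / k and log k - psi(k) = log m - (weighted mean of log x); the sketch solves the second equation by Newton-Raphson from a closed-form starting value usually attributed to Minka.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Digamma psi(x), x > 0: shift x upward, then use the asymptotic series.
inline double digamma(double x) {
    double r = 0.0;
    while (x < 10.0) { r -= 1.0 / x; x += 1.0; }
    const double f = 1.0 / (x * x);
    return r + std::log(x) - 0.5 / x
           - f * (1.0 / 12.0 - f * (1.0 / 120.0 - f * (1.0 / 252.0 - f / 240.0)));
}

// Trigamma psi'(x), x > 0: same shift-then-expand strategy.
inline double trigamma(double x) {
    double r = 0.0;
    while (x < 10.0) { r += 1.0 / (x * x); x += 1.0; }
    const double f = 1.0 / (x * x);
    return r + 1.0 / x + 0.5 * f
           + (f / x) * (1.0 / 6.0 - f * (1.0 / 30.0 - f * (1.0 / 42.0 - f / 30.0)));
}

struct GammaParams { double shape; double scale; };  // hypothetical type

// Weighted Gamma MLE; w[t] plays the role of a state posterior weight.
inline GammaParams fit_gamma_mle(const std::vector<double>& x,
                                 const std::vector<double>& w) {
    assert(x.size() == w.size() && !x.empty());
    double sw = 0.0, sx = 0.0, slx = 0.0;
    for (std::size_t t = 0; t < x.size(); ++t) {
        assert(x[t] > 0.0 && w[t] >= 0.0);
        sw += w[t];
        sx += w[t] * x[t];
        slx += w[t] * std::log(x[t]);
    }
    assert(sw > 0.0);
    const double mean = sx / sw;
    const double s = std::log(mean) - slx / sw;  // >= 0 by Jensen's inequality
    assert(s > 0.0);                             // all-equal data not handled

    // Newton-Raphson on g(k) = log k - psi(k) - s, which is decreasing in k.
    double k = (3.0 - s + std::sqrt((s - 3.0) * (s - 3.0) + 24.0 * s)) / (12.0 * s);
    for (int it = 0; it < 100; ++it) {
        const double g = std::log(k) - digamma(k) - s;
        const double dg = 1.0 / k - trigamma(k);
        double next = k - g / dg;
        if (!(next > 0.0)) next = 0.5 * k;  // boundary safeguard
        const bool done = std::fabs(next - k) <= 1e-12 * k;
        k = next;
        if (done) break;
    }
    return {k, mean / k};
}
```

A recovery test then amounts to confirming that the returned shape drives the score residual log k - psi(k) - s to zero, and that the weighted log-likelihood at the returned parameters is no lower than at the method-of-moments estimate (shape = mean^2 / variance, scale = variance / mean).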

Figures

Figures reproduced from arXiv: 2605.29208 by Gary Wolfman.

Figure 1. Forward-backward throughput: libhmm vs. HMMLib vs. StochHMM (Dishonest Casino, 2-state discrete HMM, Windows Ryzen 7 / AVX-512 / MSVC). HMMLib uses scaled forward-backward; libhmm uses log-space; StochHMM uses unscaled recursion. The throughput gap between HMMLib and libhmm is primarily due to the log-space vs. scaled algorithm choice.
Figure 2. ECME log-likelihood convergence on 5,838 DAX log-returns (2000–2022).
Figure 3. VonMisesDistribution vs. Normal approximation for circular wind direction data.
Original abstract

We describe libhmm, a C++20 library for Hidden Markov Model parameter estimation, sequence decoding, and model selection. libhmm addresses two gaps in existing software: the absence of a well-maintained, zero-dependency C++ HMM library suitable for embedding in production systems, and the widespread use of method-of-moments (MOM) approximations in the emission distribution M-step of the Baum-Welch algorithm. The library implements correct maximum likelihood estimators for sixteen continuous and discrete emission distributions, including an ECME algorithm for the location-scale Student-t distribution, Newton-Raphson maximization for Gamma, Beta, Weibull, and Negative Binomial distributions, and the von Mises distribution for circular data. All forward-backward and Viterbi calculations operate in full log-space. SIMD acceleration is provided for AVX-512, AVX2, SSE2, and ARM NEON via compile-time dispatch with scalar fallback. Python bindings are available via the companion package pylibhmm. We compare libhmm against established C and C++ HMM libraries and against published R reference packages on five real-data benchmarks, and discuss the architectural tradeoffs made in the design.
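For readers unfamiliar with log-space decoding, the Viterbi recursion mentioned in the abstract can be sketched as below. This is a plain scalar illustration under assumed names (`viterbi_log`, `Matrix`), not the library's interface, and it omits the SIMD paths the abstract describes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch (not a libhmm type).
using Matrix = std::vector<std::vector<double>>;

// Log-space Viterbi decoding for a discrete-emission HMM.
//   log_pi[i]   : log initial probability of state i
//   log_A[i][j] : log transition probability i -> j
//   log_B[j][o] : log probability that state j emits symbol o
// Returns the most probable state sequence. Products of probabilities
// become sums of logs, so long sequences do not underflow.
inline std::vector<std::size_t> viterbi_log(const std::vector<double>& log_pi,
                                            const Matrix& log_A,
                                            const Matrix& log_B,
                                            const std::vector<std::size_t>& obs) {
    assert(!obs.empty() && !log_pi.empty());
    const std::size_t n = log_pi.size();
    const std::size_t T = obs.size();
    std::vector<double> delta(n), next(n);
    // back[t][j]: best predecessor of state j at time t.
    std::vector<std::vector<std::size_t>> back(T, std::vector<std::size_t>(n, 0));

    for (std::size_t j = 0; j < n; ++j)
        delta[j] = log_pi[j] + log_B[j][obs[0]];

    for (std::size_t t = 1; t < T; ++t) {
        for (std::size_t j = 0; j < n; ++j) {
            std::size_t best = 0;
            double best_score = delta[0] + log_A[0][j];
            for (std::size_t i = 1; i < n; ++i) {
                const double score = delta[i] + log_A[i][j];
                if (score > best_score) { best_score = score; best = i; }
            }
            next[j] = best_score + log_B[j][obs[t]];
            back[t][j] = best;
        }
        delta.swap(next);
    }

    // Pick the best final state, then follow the back-pointers.
    std::size_t state = 0;
    for (std::size_t j = 1; j < n; ++j)
        if (delta[j] > delta[state]) state = j;
    std::vector<std::size_t> path(T);
    path[T - 1] = state;
    for (std::size_t t = T - 1; t > 0; --t) {
        state = back[t][state];
        path[t - 1] = state;
    }
    return path;
}
```

The back-pointer table costs O(T·N) memory; that is a choice made for clarity in this sketch and says nothing about how libhmm stores its trellis.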

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper presents libhmm, a C++20 library for Hidden Markov Model parameter estimation, sequence decoding, and model selection. It claims to address gaps in existing software by providing a zero-dependency C++ implementation with correct maximum likelihood estimators (rather than method-of-moments approximations) for sixteen continuous and discrete emission distributions, using ECME for the location-scale Student-t, Newton-Raphson for Gamma/Beta/Weibull/Negative Binomial, and support for the von Mises distribution; all calculations are in log-space with SIMD acceleration (AVX-512/AVX2/SSE2/NEON) and Python bindings via pylibhmm. The library is compared to existing C/C++ and R packages on five real-data benchmarks.

Significance. If the MLE implementations are correct, the work would be significant as a production-oriented, modern C++ HMM library with accurate emission M-steps, full log-space numerics, and compile-time SIMD dispatch. These features address real needs in embedded and high-performance settings where existing libraries rely on MOM approximations or lack maintenance.

major comments (1)
  1. [Abstract] The central claim that the sixteen M-step routines (ECME for Student-t; Newton-Raphson for Gamma, Beta, Weibull, Negative Binomial) compute true MLEs is load-bearing for the paper's contribution, yet the manuscript supplies no analytic derivations of the score equations, convergence guarantees, boundary-case handling, or recovery tests against known MLEs; without such verification the numerical procedures could return non-MLE stationary points in some regimes.
minor comments (1)
  1. [Abstract] The five real-data benchmarks are mentioned without any description of dataset selection, preprocessing, or controls for post-hoc selection; adding these would strengthen the comparison claims.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the thoughtful review and for recognizing the potential significance of the work if the MLE claims hold. We address the single major comment below.

Point-by-point responses
  1. Referee: [Abstract] The central claim that the sixteen M-step routines (ECME for Student-t; Newton-Raphson for Gamma, Beta, Weibull, Negative Binomial) compute true MLEs is load-bearing for the paper's contribution, yet the manuscript supplies no analytic derivations of the score equations, convergence guarantees, boundary-case handling, or recovery tests against known MLEs; without such verification the numerical procedures could return non-MLE stationary points in some regimes.

    Authors: We agree that the current manuscript does not contain the analytic derivations, convergence analysis, boundary handling, or recovery tests needed to substantiate the MLE claim. In the revised manuscript we will add a dedicated section (or appendix) that (i) states the score equations for each of the sixteen emission M-steps, (ii) outlines the convergence properties of the ECME algorithm for the location-scale Student-t and the Newton-Raphson iterations for Gamma, Beta, Weibull, and Negative Binomial, (iii) documents the boundary-case logic (e.g., shape-parameter safeguards and initialization), and (iv) reports numerical recovery experiments comparing libhmm MLEs against reference solutions obtained from R's optim and fitdistrplus on synthetic data drawn from the same distributions. These additions will directly address the concern that the numerical procedures might converge to non-MLE stationary points.

    Revision: yes

Circularity Check

0 steps flagged

No circularity: implementation report with no derivation or fitted predictions

Full rationale

The paper describes a C++ library implementing HMM algorithms and M-step estimators for emission distributions using standard numerical methods (Newton-Raphson, ECME). No mathematical derivations, predictions from fitted parameters, self-citations as load-bearing premises, or renamings of known results are present. The central claims concern code correctness and performance benchmarks against external libraries, which are independent of any internal reduction to the paper's own inputs. This is a self-contained software report with no derivation chain to inspect for circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software library description paper; the abstract introduces no free parameters, mathematical axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5731 in / 1114 out tokens · 18488 ms · 2026-06-29T00:09:46.876210+00:00 · methodology

