Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models

Alexandre M\"osching; Axel Munk; Housen Li

arxiv: 2305.18578 · v2 · pith:ZHLAC6NHnew · submitted 2023-05-29 · 📊 stat.ME · cs.LG· stat.ML

Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models

Alexandre M\"osching , Housen Li , Axel Munk This is my paper

Pith reviewed 2026-05-24 08:52 UTC · model grok-4.3

classification 📊 stat.ME cs.LGstat.ML

keywords hidden Markov modelsdecodingViterbiternary segmentationdivide and conquercomputational complexityadmissible pathslikelihood maximization

0 comments

The pith

QATS is a polylog-time decoding method for HMMs that uses adaptive ternary segmentation to maximize local likelihoods while staying admissible.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Quick Adaptive Ternary Segmentation (QATS) as an efficient decoding procedure for hidden Markov models. It uses a divide-and-conquer strategy to achieve polylogarithmic complexity in the sequence length and cubic complexity in the state space size. This is much faster than linear methods like Viterbi for long sequences. QATS works by approximately maximizing local likelihood scores over paths with at most three segments using adaptive search, resulting in an admissible state sequence estimate. The authors provide simulations showing speedups and precision analysis.

Core claim

QATS is a divide-and-conquer procedure with computational complexity polylogarithmic in the length of the sequence, and cubic in the size of the state space. The estimated sequence of states sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible. The maximization is performed only approximately using an adaptive search procedure. It also suggests an effective way of data storage as specific cumulative sums.

What carries the argument

Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure that finds admissible state sequences by local maximization over paths with at most three segments via approximate adaptive search.

If this is right

Enables decoding of HMMs with much longer sequences than linear-time methods allow.
Allows efficient storage of data through specific cumulative sums.
Maintains admissibility of the estimated state sequence.
Provides speedups in simulations compared to Viterbi and PMAP with similar precision.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method's efficiency for few-state models could encourage HMM use in domains with massive datasets like genomics.
Parallelization of the divide-and-conquer steps might yield additional speed gains not explored in the paper.
The approximate nature suggests QATS could be tuned for different accuracy-speed tradeoffs.

Load-bearing premise

The adaptive search procedure used for approximate local maximization is sufficiently accurate that the resulting sequence remains admissible and retains the claimed precision relative to exact methods.

What would settle it

A direct comparison on a simulated long sequence where the QATS estimate is checked for admissibility against an exact decoder or where its likelihood is substantially lower.

Figures

Figures reproduced from arXiv: 2305.18578 by Alexandre M\"osching, Axel Munk, Housen Li.

**Figure 2.** Figure 2: Plots of H2 = log H2 (left) and H3 = log H3 (right) in the setting of the top and middle plots of [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗

**Figure 3.** Figure 3: Plots of H2 = log H2 (left) and H3 = log H3 (right). Local maxima (in red) are located either on the boundaries, on the diagonal (for H3 ), on true change points or pairs thereof (black lines). as local maxima of H2 or as components of local maxima of H3 . In other words, the study of the maps H2 and H3 shall theoretically unveil all change points of the true sequence x o . Theorem 3.3. Suppose that s o ≥ … view at source ↗

**Figure 4.** Figure 4: Median time ratio T V/T Q against exit probability p. values of p. When m = 10 and n = 106 + 1 for instance, QATS is about 10 times faster than Viterbi for an expected number of segments of 11. This shows that QATS is indeed substantially faster than Viterbi for low number of segments/change points. Note furthermore that the time ratio should be even larger for even longer sequences of observations [PITH_… view at source ↗

**Figure 5.** Figure 5: Median quantile time ratio T V/T Q (left), and times T Q and T V (right) against exit probability p, along with 10%- and 90%-quantile curves, for the setting σ = 1.0 and n = 106 + 1. for β ∈ {0.1, 0.5, 0.9} and w ∈ {0, 2}, where xb Q and xb V correspond to QATS and Viterbi paths, respectively. Those sample quantiles are computed from the same samples as for the computation time study [PITH_FULL_IMAGE:figu… view at source ↗

**Figure 6.** Figure 6: Difference in misclassification rates (w = 0, top) and root mean squared error (w = 2, bottom) against exit probability p for the setting σ = 0.1 (left) and σ = 1.0 (right). 23 [PITH_FULL_IMAGE:figures/full_fig_p023_6.png] view at source ↗

**Figure 7.** Figure 7: Misclassification rate (w = 0, left column) and root mean squared error (w = 2, right column) against exit probability p, for m ∈ {2, 3, 5, 10} (top to bottom), in the setting σ = 0.1 and n = 106 + 1. 24 [PITH_FULL_IMAGE:figures/full_fig_p024_7.png] view at source ↗

**Figure 8.** Figure 8: Misclassification rate (w = 0, left column) and root mean squared error (w = 2, right column) against exit probability p, for m ∈ {2, 3, 5, 10} (top to bottom), in the setting σ = 1.0 and n = 106 + 1. 25 [PITH_FULL_IMAGE:figures/full_fig_p025_8.png] view at source ↗

**Figure 9.** Figure 9: Computation time of QATS T Q against root mean squared error (w = 2), in the setting n = 106 + 1, m = 3, p = 10−4 and σ = 1.0. The bivariate normal {10, 50, 90}-% confidence ellipses (in red) are computed on the log-scale for both the error and the time, and serve as indicators for the distribution of the data. exceeds 2 · 10−3 , both methods interpret larger numbers of segments as extra noise in the data,… view at source ↗

read the original abstract

Hidden Markov models (HMMs) are characterized by an unobservable Markov chain and an observable process -- a noisy version of the hidden chain. Decoding the original signal from the noisy observations is one of the main goals in nearly all HMM based data analyses. Existing decoding algorithms such as Viterbi and the pointwise maximum a posteriori (PMAP) algorithm have computational complexity at best linear in the length of the observed sequence, and sub-quadratic in the size of the state space of the hidden chain. We present Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure with computational complexity polylogarithmic in the length of the sequence, and cubic in the size of the state space, hence particularly suited for large scale HMMs with relatively few states. It also suggests an effective way of data storage as specific cumulative sums. In essence, the estimated sequence of states sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible. The maximization is performed only approximately using an adaptive search procedure. Our simulations demonstrate the speedups offered by QATS in comparison to Viterbi and PMAP, along with a precision analysis. An implementation of QATS is in the R-package QATS on GitHub.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

QATS gives a new ternary divide-and-conquer decoder for HMMs that looks faster on paper, but the admissibility and polylog claims hang on an unshown approximation step.

read the letter

The paper's main contribution is QATS, a divide-and-conquer procedure that restricts local paths to at most three segments, stores data via cumulative sums, and uses adaptive search to pick the next segment. This setup is distinct from the linear-time Viterbi and PMAP methods referenced in the abstract, and the R package on GitHub is a concrete plus for anyone who wants to try it on long sequences with small state spaces. The simulations are said to show speed gains and some precision checks, which is useful evidence even if the details are thin in the abstract. The central claim is that the output stays admissible while sequentially maximizing local scores, yet the maximization itself is only approximate. No error bound, convergence rate, or sketch appears in the abstract to show why the approximation preserves the required properties or the stated complexity. That gap is load-bearing: if the adaptive step can miss the true local max by more than a controlled amount, both admissibility and the polylog argument lose their footing. The work is aimed at statisticians and bioinformaticians who routinely decode long chains with few states and would value any practical speedup. A reader who already knows the Viterbi literature can extract the algorithmic idea quickly. It is worth sending to referees so they can check the missing derivations and the approximation analysis; the construction is clear enough to merit that step.

Referee Report

3 major / 2 minor

Summary. The paper introduces Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer decoding procedure for hidden Markov models. It claims computational complexity that is polylogarithmic in sequence length and cubic in state-space size, achieved via an approximate adaptive search that produces state sequences sequentially maximizing local likelihoods among paths with at most three segments while remaining admissible. The method also proposes cumulative-sum storage; simulations compare speed and precision against Viterbi and PMAP, and an R implementation is provided.

Significance. If the complexity bound and admissibility property under approximation can be established, the algorithm would offer a substantial practical advance for large-scale HMM decoding with small state spaces, together with an efficient storage representation. The reported simulations already indicate clear runtime gains, but the absence of the supporting derivations limits assessment of whether the central theoretical claims hold.

major comments (3)

[Abstract] Abstract: the statement that QATS 'sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible' is made even though 'the maximization is performed only approximately using an adaptive search procedure.' No error bound, convergence rate, or proof sketch is supplied showing that the approximation preserves both the local-maximization property and admissibility; this is load-bearing for the central claim.
[Abstract] Abstract: the claimed polylogarithmic complexity in sequence length is asserted without any derivation, recurrence relation, or proof sketch. Because the divide-and-conquer structure and the cost of the local three-segment search are the only ingredients that could produce this bound, the absence of the argument leaves the complexity claim unsupported.
[Simulations] Simulations section (implied by abstract): the precision analysis is described only at a high level; no explicit statement is given of how admissibility or closeness to exact dynamic programming is quantified, nor are error-bar details or replication counts supplied. This weakens the empirical support for the claim that the approximate procedure retains the stated precision.

minor comments (2)

[Abstract] The abstract mentions 'specific cumulative sums' for data storage but does not indicate the exact recurrence or memory layout used; a short explicit formula would clarify the implementation.
The R-package name and GitHub link are given, but no statement appears on whether the released code reproduces the reported simulation figures or includes the exact parameter settings used.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful review and constructive feedback. We address each major comment below and indicate planned revisions to strengthen the presentation of the theoretical claims and empirical details.

read point-by-point responses

Referee: [Abstract] Abstract: the statement that QATS 'sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible' is made even though 'the maximization is performed only approximately using an adaptive search procedure.' No error bound, convergence rate, or proof sketch is supplied showing that the approximation preserves both the local-maximization property and admissibility; this is load-bearing for the central claim.

Authors: We agree the abstract phrasing risks overstating the exactness of the local maximization. The adaptive ternary search is constructed to locate the admissible three-segment path with highest local likelihood, but we acknowledge that no formal argument is supplied showing the approximation preserves the claimed properties. We will revise the abstract for precision and add a short discussion of admissibility preservation in the methods. revision: yes
Referee: [Abstract] Abstract: the claimed polylogarithmic complexity in sequence length is asserted without any derivation, recurrence relation, or proof sketch. Because the divide-and-conquer structure and the cost of the local three-segment search are the only ingredients that could produce this bound, the absence of the argument leaves the complexity claim unsupported.

Authors: The claimed bound follows from the recursive divide-and-conquer yielding logarithmic depth with cubic (in state-space size) work per level. We recognize that an explicit recurrence or derivation is missing from the manuscript. We will insert a dedicated paragraph or short appendix deriving the complexity in the revision. revision: yes
Referee: [Simulations] Simulations section (implied by abstract): the precision analysis is described only at a high level; no explicit statement is given of how admissibility or closeness to exact dynamic programming is quantified, nor are error-bar details or replication counts supplied. This weakens the empirical support for the claim that the approximate procedure retains the stated precision.

Authors: We concur that the simulations section would be strengthened by explicit definitions of the precision metrics (e.g., path agreement with Viterbi), replication counts, and variability measures. We will expand the section with these details and any necessary additional figures or tables. revision: yes

Circularity Check

0 steps flagged

No circularity detected; algorithmic claims independent of inputs

full rationale

The paper presents QATS as a divide-and-conquer algorithmic construction whose polylogarithmic complexity and admissibility properties are asserted to follow from the ternary segmentation procedure and its local maximization step. No equations or definitions reduce a claimed output to a fitted parameter or self-referential input by construction. No self-citations appear as load-bearing premises for uniqueness or ansatz choices. The explicit note that maximization is approximate does not create a definitional loop; it is an implementation detail whose accuracy is addressed via simulations rather than by redefining the target quantity. The derivation chain is self-contained as a standard algorithmic proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review reveals no explicit free parameters, mathematical axioms, or invented entities; the method is presented as a self-contained algorithmic construction whose internal thresholds for adaptive search are not quantified here.

pith-pipeline@v0.9.0 · 5762 in / 1189 out tokens · 26141 ms · 2026-05-24T08:52:08.050013+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean (Jcost uniqueness), Foundation/AlexanderDuality.lean (D=3 forcing) reality_from_one_distinction; washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

the estimated sequence of states sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible. The maximization is performed only approximately using an adaptive search procedure.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Robust inference of cooperative behaviour of multiple ion channels in voltage-clamp recordings
stat.ME 2024-03 unverdicted novelty 6.0

IDC combines idealisation, discretisation and a minimum-distance estimator on coupled Markov models to infer cooperativity from ensemble voltage-clamp currents, with an asymptotic consistency guarantee, and finds gram...

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1]

and Tzamos, C

Backurs, A. and Tzamos, C. (2017). Improving V iterbi is hard: Better runtimes imply faster clique algorithms. In Proceedings of the 34th International Conference on Machine Learning (D. Precup and Y. W. Teh, eds.), vol. 70 of Proceedings of Machine Learning Research. PMLR. ://proceedings.mlr.press/v70/backurs17a.html

work page 2017
[2]

Bai, J. (1997). Estimating multiple breaks one at a time. Econ. Theory 13 315--352. ://doi.org/10.1017/S0266466600005831

work page doi:10.1017/s0266466600005831 1997
[3]

Ball, F. G. and Rice, J. A. (1992). Stochastic models for ion channels: Introduction and bibliography. Math. Biosci. 112 189--206. ://doi.org/10.1016/0025-5564(92)90023-p

work page doi:10.1016/0025-5564(92)90023-p 1992
[4]

Baum, L. E. (1972). An inequality and associated maximization technique in statistical estimation for probabilistic functions of M arkov processes. In Inequalities, III ( P roc. T hird S ympos., U niv. C alifornia, L os A ngeles, C alif., 1969; dedicated to the memory of T heordore S . M otzkin)

work page 1972
[5]

Baum, L. E. and Petrie, T. (1966). Statistical inference for probabilistic functions of finite state M arkov chains. Ann. Math. Statist. 37 1554--1563. ://doi.org/10.1214/aoms/1177699147

work page doi:10.1214/aoms/1177699147 1966
[6]

Baum, L. E. , Petrie, T. , Soules, G. and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of M arkov chains. Ann. Math. Statist. 41 164--171. ://doi.org/10.1214/aoms/1177697196

work page doi:10.1214/aoms/1177697196 1970
[7]

Blelloch, G. E. (1989). Scans as primitive parallel operations. IEEE Transactions on Computers 38 1526--1538. ://doi.org/10.1109/12.42122

work page doi:10.1109/12.42122 1989
[8]

, Langrock, R

Bulla, J. , Langrock, R. and Maruotti, A. (2019). Guest editor's introduction to the special issue on `` H idden M arkov models: theory and applications''. Metron 77 63--66. ://doi.org/10.1007/s40300-019-00157-2

work page doi:10.1007/s40300-019-00157-2 2019
[9]

, Farina, G

Cairo, M. , Farina, G. and Rizzi, R. (2016). Decoding hidden M arkov models faster than V iterbi via online matrix-vector (max, +)-multiplication. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16). Association for the Advancement of Artificial Intelligence, The AAAI Press, Palo Alto, California. ://dl.acm.org/doi/abs/10....

work page doi:10.5555/3016100.3016106 2016
[10]

, Moulines, E

Capp\' e , O. , Moulines, E. and Ryd\' e n, T. (2005). Inference in Hidden M arkov Models . Springer Series in Statistics, Springer, New York. ://doi.org/10.1007/0-387-28982-8

work page doi:10.1007/0-387-28982-8 2005
[11]

Carvalho, L. E. and Lawrence, C. E. (2008). Centroid estimation in discrete high-dimensional spaces with applications in biology. Proc. Natl. Acad. Sci. U.S.A. 105 3209--3214. ://doi.org/10.1073/pnas.0712329105

work page doi:10.1073/pnas.0712329105 2008
[12]

, Eddy, S

Durbin, R. , Eddy, S. R. , Krogh, A. and Mitchison, G. (1998). Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press. ://doi.org/10.1017/CBO9780511790492

work page doi:10.1017/cbo9780511790492 1998
[13]

Eddelbuettel, D. (2013). Seamless R and C++ Integration with Rcpp . Springer, New York. ISBN 978-1-4614-6867-7. ://www.doi.org/10.1007/978-1-4614-6868-4

work page doi:10.1007/978-1-4614-6868-4 2013
[14]

Four transformations on the Catalan triangle

Eddelbuettel, D. and Balamuta, J. J. (2018). Extending R with C++ : A brief introduction to Rcpp . The American Statistician 72 28--36. ://www.doi.org/10.1080/00031305.2017.1375990

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1080/00031305.2017.1375990 2018
[15]

and Fran c ois, R

Eddelbuettel, D. and Fran c ois, R. (2011). Rcpp : Seamless R and C++ integration. Journal of Statistical Software 40 1--18. ://www.doi.org/10.18637/jss.v040.i08

work page doi:10.18637/jss.v040.i08 2011
[16]

and Sanderson, C

Eddelbuettel, D. and Sanderson, C. (2014). RcppArmadillo : Accelerating R with high-performance C++ linear algebra. Computational Statistics and Data Analysis 71 1054--1063. ://dx.doi.org/10.1016/j.csda.2013.02.005

work page doi:10.1016/j.csda.2013.02.005 2014
[17]

and Merhav, N

Ephraim, Y. and Merhav, N. (2002). Hidden M arkov processes. In Special issue on S hannon theory: perspective, trends, and applications (H. J. Landau, J. E. Mazo, S. Shamai and J. Ziv, eds.), vol. 48, no. 6. Institute of Electrical and Electronics Engineers, Inc. (IEEE), Piscataway, NJ, 1518--1569. ://doi.org/10.1109/TIT.2002.1003838

work page doi:10.1109/tit.2002.1003838 2002
[18]

and Radicioni, D

Esposito, R. and Radicioni, D. P. (2009). CarpeDiem : O ptimizing the V iterbi algorithm and applications to supervised sequential learning. J. Mach. Learn. Res. 10 1851--1880. ://dl.acm.org/doi/10.5555/1577069.1755847

work page doi:10.5555/1577069.1755847 2009
[19]

, Martelli, P

Fariselli, P. , Martelli, P. L. and Casadio, R. (2005). A new decoding algorithm for hidden M arkov models improves the prediction of the topology of all-beta membrane proteins. BMC Bioinformatics 6. ://doi.org/10.1186/1471-2105-6-S4-S12

work page doi:10.1186/1471-2105-6-s4-s12 2005
[20]

Forney, G. D., Jr. (1973). The V iterbi algorithm. Proc. IEEE 61 268--278. ://doi.org/10.1109/PROC.1973.9030

work page doi:10.1109/proc.1973.9030 1973
[21]

Fr\" u hwirth-Schnatter, S. (2006). Finite M ixture and M arkov S witching M odels . Springer Series in Statistics, Springer, New York

work page 2006
[22]

and Young, S

Gales, M. and Young, S. (2008). The application of hidden markov models in speech recognition. Foundations and Trends in Signal Processing 1 195--304. ://dx.doi.org/10.1561/2000000004

work page doi:10.1561/2000000004 2008
[23]

, Hochberg, M

Gotoh, Y. , Hochberg, M. M. and Silverman, H. F. (1998). Efficient training algorithms for HMMs using incremental estimation. IEEE Trans. Speech Audio Processing 6 539--548. ://doi.org/10.1109/89.725320

work page doi:10.1109/89.725320 1998
[24]

Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57 357--384. ://doi.org/10.2307/1912559

work page doi:10.2307/1912559 1989
[25]

Hassan, S. S. , S\" a rkk\" a , S. and Garc\' i a-Fern\' a ndez, A. F. (2021). Temporal parallelization of inference in hidden M arkov models. IEEE Trans. Signal Process. 69 4875--4887. ://doi.org/10.1109/TSP.2021.3103338

work page doi:10.1109/tsp.2021.3103338 2021
[26]

, Fujiwara, Y

Kaji, N. , Fujiwara, Y. , Yoshinaga, N. and Kitsuregawa, M. (2010). Efficient staggered decoding for sequence labeling. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Uppsala, Sweden. ://aclanthology.org/P10-1050

work page 2010
[27]

Karplus, K. (2009). SAM-T08, HMM-based protein structure prediction . Nucleic Acids Research 37 W492--W497. ://doi.org/10.1093/nar/gkp403

work page doi:10.1093/nar/gkp403 2009
[28]

Kiefer, J. (1953). Sequential minimax search for a maximum. Proc. Amer. Math. Soc. 4 502--506. ://doi.org/10.2307/2032161

work page doi:10.2307/2032161 1953
[29]

Kov\'acs, S. , Li, H. , Haubner, L. , Munk, A. and B\"uhlmann, P. (2020). Optimistic search strategy: Change point detection for large-scale data via adaptive logarithmic queries. Preprint, arXiv:2010.10194. ://arxiv.org/abs/2010.10194

work page arXiv 2020
[30]

Ladner, R. E. and Fischer, M. J. (1980). Parallel prefix computation. J. Assoc. Comput. Mach. 27 831--838. ://doi.org/10.1145/322217.322232

work page doi:10.1145/322217.322232 1980
[31]

and Koloydenko, A

Lember, J. and Koloydenko, A. A. (2014). Bridging V iterbi and posterior decoding: A generalized risk approach to hidden path inference based on hidden M arkov models. J. Mach. Learn. Res. 15 1--58. ://dl.acm.org/doi/10.5555/2627435.2627436

work page doi:10.5555/2627435.2627436 2014
[32]

and Kline, J

Levin, B. and Kline, J. (1985). The cusum test of homogeneity with an application in spontaneous abortion epidemiology. Statistics in Medicine 4 469--488. ://doi.org/10.1002/sim.4780040408

work page doi:10.1002/sim.4780040408 1985
[33]

, Mozes, S

Lifshits, Y. , Mozes, S. , Weimann, O. and Ziv-Ukelson, M. (2009). Speeding up HMM decoding and training by exploiting sequence repetitions. Algorithmica 54 379--399. ://doi.org/10.1007/s00453-007-9128-0

work page doi:10.1007/s00453-007-9128-0 2009
[34]

, Garhwal, S

Mor, B. , Garhwal, S. and Kumar, A. (2021). A systematic review of hidden M arkov models and their applications. Arch. Comput. Methods Eng. 28 1429--1448. ://doi.org/10.1007/s11831-020-09422-4

work page doi:10.1007/s11831-020-09422-4 2021
[35]

Niu, Y. S. , Hao, N. and Zhang, H. (2016). Multiple change-point detection: A selective overview. Statist. Sci. 31 611--623. ://doi.org/10.1214/16-STS587

work page doi:10.1214/16-sts587 2016
[36]

Olshen, A. B. , Venkatraman, E. S. , Lucito, R. and Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5 557--572. ://doi.org/10.1093/biostatistics/kxh008

work page doi:10.1093/biostatistics/kxh008 2004
[37]

Page, E. S. (1955). A test for a change in a parameter occurring at an unknown point. Biometrika 42 523--527. ://doi.org/10.1093/biomet/42.3-4.523

work page doi:10.1093/biomet/42.3-4.523 1955
[38]

, Bartsch, A

Pein, F. , Bartsch, A. , Steinem, C. and Munk, A. (2021). Heterogeneous idealization of ion channel recordings – open channel noise. IEEE Transactions on NanoBioscience 20 57--78

work page 2021
[39]

R: A Language and Environment for Statistical Computing

R Core Team (2022). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ://www.R-project.org/

work page 2022
[40]

Rabiner, L. R. (1989). A tutorial on hidden M arkov models and selected applications in speech recognition. Proceedings of the IEEE 77 257--286. ://doi.org/10.1109/5.18626

work page doi:10.1109/5.18626 1989
[41]

and Curtin, R

Sanderson, C. and Curtin, R. (2016). Armadillo: a template-based C ++ library for linear algebra. Journal of Open Source Software 1 26. ://doi.org/10.21105/joss.00026

work page doi:10.21105/joss.00026 2016
[42]

and Curtin, R

Sanderson, C. and Curtin, R. (2018). A user-friendly hybrid sparse matrix class in C ++. In Mathematical Software -- ICMS 2018 (J. H. Davenport, M. Kauers, G. Labahn and J. Urban, eds.). Springer International Publishing, Cham. ://doi.org/10.1007/978-3-319-96418-8_50

work page doi:10.1007/978-3-319-96418-8_50 2018
[43]

Scott, A. J. and Knott, M. (1974). A cluster analysis method for grouping means in the analysis of variance. Biometrics 507--512. ://doi.org/10.2307/2529204

work page doi:10.2307/2529204 1974
[44]

Titsias, M. K. , Holmes, C. C. and Yau, C. (2016). Statistical inference in hidden M arkov models using k -segment constraints. J. Amer. Statist. Assoc. 111 200--215. ://doi.org/10.1080/01621459.2014.998762

work page doi:10.1080/01621459.2014.998762 2016
[45]

, Oudre, L

Truong, C. , Oudre, L. and Vayatis, N. (2020). Selective review of offline change point detection methods. Signal Process. 167 107299. ://doi.org/10.1016/j.sigpro.2019.107299

work page doi:10.1016/j.sigpro.2019.107299 2020
[46]

Viterbi, A. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inform. Theory 13 260--269. ://doi.org/10.1109/TIT.1967.1054010

work page doi:10.1109/tit.1967.1054010 1967

[1] [1]

and Tzamos, C

Backurs, A. and Tzamos, C. (2017). Improving V iterbi is hard: Better runtimes imply faster clique algorithms. In Proceedings of the 34th International Conference on Machine Learning (D. Precup and Y. W. Teh, eds.), vol. 70 of Proceedings of Machine Learning Research. PMLR. ://proceedings.mlr.press/v70/backurs17a.html

work page 2017

[2] [2]

Bai, J. (1997). Estimating multiple breaks one at a time. Econ. Theory 13 315--352. ://doi.org/10.1017/S0266466600005831

work page doi:10.1017/s0266466600005831 1997

[3] [3]

Ball, F. G. and Rice, J. A. (1992). Stochastic models for ion channels: Introduction and bibliography. Math. Biosci. 112 189--206. ://doi.org/10.1016/0025-5564(92)90023-p

work page doi:10.1016/0025-5564(92)90023-p 1992

[4] [4]

Baum, L. E. (1972). An inequality and associated maximization technique in statistical estimation for probabilistic functions of M arkov processes. In Inequalities, III ( P roc. T hird S ympos., U niv. C alifornia, L os A ngeles, C alif., 1969; dedicated to the memory of T heordore S . M otzkin)

work page 1972

[5] [5]

Baum, L. E. and Petrie, T. (1966). Statistical inference for probabilistic functions of finite state M arkov chains. Ann. Math. Statist. 37 1554--1563. ://doi.org/10.1214/aoms/1177699147

work page doi:10.1214/aoms/1177699147 1966

[6] [6]

Baum, L. E. , Petrie, T. , Soules, G. and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of M arkov chains. Ann. Math. Statist. 41 164--171. ://doi.org/10.1214/aoms/1177697196

work page doi:10.1214/aoms/1177697196 1970

[7] [7]

Blelloch, G. E. (1989). Scans as primitive parallel operations. IEEE Transactions on Computers 38 1526--1538. ://doi.org/10.1109/12.42122

work page doi:10.1109/12.42122 1989

[8] [8]

, Langrock, R

Bulla, J. , Langrock, R. and Maruotti, A. (2019). Guest editor's introduction to the special issue on `` H idden M arkov models: theory and applications''. Metron 77 63--66. ://doi.org/10.1007/s40300-019-00157-2

work page doi:10.1007/s40300-019-00157-2 2019

[9] [9]

, Farina, G

Cairo, M. , Farina, G. and Rizzi, R. (2016). Decoding hidden M arkov models faster than V iterbi via online matrix-vector (max, +)-multiplication. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16). Association for the Advancement of Artificial Intelligence, The AAAI Press, Palo Alto, California. ://dl.acm.org/doi/abs/10....

work page doi:10.5555/3016100.3016106 2016

[10] [10]

, Moulines, E

Capp\' e , O. , Moulines, E. and Ryd\' e n, T. (2005). Inference in Hidden M arkov Models . Springer Series in Statistics, Springer, New York. ://doi.org/10.1007/0-387-28982-8

work page doi:10.1007/0-387-28982-8 2005

[11] [11]

Carvalho, L. E. and Lawrence, C. E. (2008). Centroid estimation in discrete high-dimensional spaces with applications in biology. Proc. Natl. Acad. Sci. U.S.A. 105 3209--3214. ://doi.org/10.1073/pnas.0712329105

work page doi:10.1073/pnas.0712329105 2008

[12] [12]

, Eddy, S

Durbin, R. , Eddy, S. R. , Krogh, A. and Mitchison, G. (1998). Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press. ://doi.org/10.1017/CBO9780511790492

work page doi:10.1017/cbo9780511790492 1998

[13] [13]

Eddelbuettel, D. (2013). Seamless R and C++ Integration with Rcpp . Springer, New York. ISBN 978-1-4614-6867-7. ://www.doi.org/10.1007/978-1-4614-6868-4

work page doi:10.1007/978-1-4614-6868-4 2013

[14] [14]

Four transformations on the Catalan triangle

Eddelbuettel, D. and Balamuta, J. J. (2018). Extending R with C++ : A brief introduction to Rcpp . The American Statistician 72 28--36. ://www.doi.org/10.1080/00031305.2017.1375990

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1080/00031305.2017.1375990 2018

[15] [15]

and Fran c ois, R

Eddelbuettel, D. and Fran c ois, R. (2011). Rcpp : Seamless R and C++ integration. Journal of Statistical Software 40 1--18. ://www.doi.org/10.18637/jss.v040.i08

work page doi:10.18637/jss.v040.i08 2011

[16] [16]

and Sanderson, C

Eddelbuettel, D. and Sanderson, C. (2014). RcppArmadillo : Accelerating R with high-performance C++ linear algebra. Computational Statistics and Data Analysis 71 1054--1063. ://dx.doi.org/10.1016/j.csda.2013.02.005

work page doi:10.1016/j.csda.2013.02.005 2014

[17] [17]

and Merhav, N

Ephraim, Y. and Merhav, N. (2002). Hidden M arkov processes. In Special issue on S hannon theory: perspective, trends, and applications (H. J. Landau, J. E. Mazo, S. Shamai and J. Ziv, eds.), vol. 48, no. 6. Institute of Electrical and Electronics Engineers, Inc. (IEEE), Piscataway, NJ, 1518--1569. ://doi.org/10.1109/TIT.2002.1003838

work page doi:10.1109/tit.2002.1003838 2002

[18] [18]

and Radicioni, D

Esposito, R. and Radicioni, D. P. (2009). CarpeDiem : O ptimizing the V iterbi algorithm and applications to supervised sequential learning. J. Mach. Learn. Res. 10 1851--1880. ://dl.acm.org/doi/10.5555/1577069.1755847

work page doi:10.5555/1577069.1755847 2009

[19] [19]

, Martelli, P

Fariselli, P. , Martelli, P. L. and Casadio, R. (2005). A new decoding algorithm for hidden M arkov models improves the prediction of the topology of all-beta membrane proteins. BMC Bioinformatics 6. ://doi.org/10.1186/1471-2105-6-S4-S12

work page doi:10.1186/1471-2105-6-s4-s12 2005

[20] [20]

Forney, G. D., Jr. (1973). The V iterbi algorithm. Proc. IEEE 61 268--278. ://doi.org/10.1109/PROC.1973.9030

work page doi:10.1109/proc.1973.9030 1973

[21] [21]

Fr\" u hwirth-Schnatter, S. (2006). Finite M ixture and M arkov S witching M odels . Springer Series in Statistics, Springer, New York

work page 2006

[22] [22]

and Young, S

Gales, M. and Young, S. (2008). The application of hidden markov models in speech recognition. Foundations and Trends in Signal Processing 1 195--304. ://dx.doi.org/10.1561/2000000004

work page doi:10.1561/2000000004 2008

[23] [23]

, Hochberg, M

Gotoh, Y. , Hochberg, M. M. and Silverman, H. F. (1998). Efficient training algorithms for HMMs using incremental estimation. IEEE Trans. Speech Audio Processing 6 539--548. ://doi.org/10.1109/89.725320

work page doi:10.1109/89.725320 1998

[24] [24]

Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57 357--384. ://doi.org/10.2307/1912559

work page doi:10.2307/1912559 1989

[25] [25]

Hassan, S. S. , S\" a rkk\" a , S. and Garc\' i a-Fern\' a ndez, A. F. (2021). Temporal parallelization of inference in hidden M arkov models. IEEE Trans. Signal Process. 69 4875--4887. ://doi.org/10.1109/TSP.2021.3103338

work page doi:10.1109/tsp.2021.3103338 2021

[26] [26]

, Fujiwara, Y

Kaji, N. , Fujiwara, Y. , Yoshinaga, N. and Kitsuregawa, M. (2010). Efficient staggered decoding for sequence labeling. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Uppsala, Sweden. ://aclanthology.org/P10-1050

work page 2010

[27] [27]

Karplus, K. (2009). SAM-T08, HMM-based protein structure prediction . Nucleic Acids Research 37 W492--W497. ://doi.org/10.1093/nar/gkp403

work page doi:10.1093/nar/gkp403 2009

[28] [28]

Kiefer, J. (1953). Sequential minimax search for a maximum. Proc. Amer. Math. Soc. 4 502--506. ://doi.org/10.2307/2032161

work page doi:10.2307/2032161 1953

[29] [29]

Kov\'acs, S. , Li, H. , Haubner, L. , Munk, A. and B\"uhlmann, P. (2020). Optimistic search strategy: Change point detection for large-scale data via adaptive logarithmic queries. Preprint, arXiv:2010.10194. ://arxiv.org/abs/2010.10194

work page arXiv 2020

[30] [30]

Ladner, R. E. and Fischer, M. J. (1980). Parallel prefix computation. J. Assoc. Comput. Mach. 27 831--838. ://doi.org/10.1145/322217.322232

work page doi:10.1145/322217.322232 1980

[31] [31]

and Koloydenko, A

Lember, J. and Koloydenko, A. A. (2014). Bridging V iterbi and posterior decoding: A generalized risk approach to hidden path inference based on hidden M arkov models. J. Mach. Learn. Res. 15 1--58. ://dl.acm.org/doi/10.5555/2627435.2627436

work page doi:10.5555/2627435.2627436 2014

[32] [32]

and Kline, J

Levin, B. and Kline, J. (1985). The cusum test of homogeneity with an application in spontaneous abortion epidemiology. Statistics in Medicine 4 469--488. ://doi.org/10.1002/sim.4780040408

work page doi:10.1002/sim.4780040408 1985

[33] [33]

, Mozes, S

Lifshits, Y. , Mozes, S. , Weimann, O. and Ziv-Ukelson, M. (2009). Speeding up HMM decoding and training by exploiting sequence repetitions. Algorithmica 54 379--399. ://doi.org/10.1007/s00453-007-9128-0

work page doi:10.1007/s00453-007-9128-0 2009

[34] [34]

, Garhwal, S

Mor, B. , Garhwal, S. and Kumar, A. (2021). A systematic review of hidden M arkov models and their applications. Arch. Comput. Methods Eng. 28 1429--1448. ://doi.org/10.1007/s11831-020-09422-4

work page doi:10.1007/s11831-020-09422-4 2021

[35] [35]

Niu, Y. S. , Hao, N. and Zhang, H. (2016). Multiple change-point detection: A selective overview. Statist. Sci. 31 611--623. ://doi.org/10.1214/16-STS587

work page doi:10.1214/16-sts587 2016

[36] [36]

Olshen, A. B. , Venkatraman, E. S. , Lucito, R. and Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5 557--572. ://doi.org/10.1093/biostatistics/kxh008

work page doi:10.1093/biostatistics/kxh008 2004

[37] [37]

Page, E. S. (1955). A test for a change in a parameter occurring at an unknown point. Biometrika 42 523--527. ://doi.org/10.1093/biomet/42.3-4.523

work page doi:10.1093/biomet/42.3-4.523 1955

[38] [38]

, Bartsch, A

Pein, F. , Bartsch, A. , Steinem, C. and Munk, A. (2021). Heterogeneous idealization of ion channel recordings – open channel noise. IEEE Transactions on NanoBioscience 20 57--78

work page 2021

[39] [39]

R: A Language and Environment for Statistical Computing

R Core Team (2022). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ://www.R-project.org/

work page 2022

[40] [40]

Rabiner, L. R. (1989). A tutorial on hidden M arkov models and selected applications in speech recognition. Proceedings of the IEEE 77 257--286. ://doi.org/10.1109/5.18626

work page doi:10.1109/5.18626 1989

[41] [41]

and Curtin, R

Sanderson, C. and Curtin, R. (2016). Armadillo: a template-based C ++ library for linear algebra. Journal of Open Source Software 1 26. ://doi.org/10.21105/joss.00026

work page doi:10.21105/joss.00026 2016

[42] [42]

and Curtin, R

Sanderson, C. and Curtin, R. (2018). A user-friendly hybrid sparse matrix class in C ++. In Mathematical Software -- ICMS 2018 (J. H. Davenport, M. Kauers, G. Labahn and J. Urban, eds.). Springer International Publishing, Cham. ://doi.org/10.1007/978-3-319-96418-8_50

work page doi:10.1007/978-3-319-96418-8_50 2018

[43] [43]

Scott, A. J. and Knott, M. (1974). A cluster analysis method for grouping means in the analysis of variance. Biometrics 507--512. ://doi.org/10.2307/2529204

work page doi:10.2307/2529204 1974

[44] [44]

Titsias, M. K. , Holmes, C. C. and Yau, C. (2016). Statistical inference in hidden M arkov models using k -segment constraints. J. Amer. Statist. Assoc. 111 200--215. ://doi.org/10.1080/01621459.2014.998762

work page doi:10.1080/01621459.2014.998762 2016

[45] [45]

, Oudre, L

Truong, C. , Oudre, L. and Vayatis, N. (2020). Selective review of offline change point detection methods. Signal Process. 167 107299. ://doi.org/10.1016/j.sigpro.2019.107299

work page doi:10.1016/j.sigpro.2019.107299 2020

[46] [46]

Viterbi, A. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inform. Theory 13 260--269. ://doi.org/10.1109/TIT.1967.1054010

work page doi:10.1109/tit.1967.1054010 1967