Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models
Pith reviewed 2026-05-24 08:52 UTC · model grok-4.3
The pith
QATS is a polylog-time decoding method for HMMs that uses adaptive ternary segmentation to maximize local likelihoods while staying admissible.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
QATS is a divide-and-conquer procedure with computational complexity polylogarithmic in the length of the sequence, and cubic in the size of the state space. The estimated sequence of states sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible. The maximization is performed only approximately using an adaptive search procedure. It also suggests an effective way of data storage as specific cumulative sums.
What carries the argument
Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure that finds admissible state sequences by local maximization over paths with at most three segments via approximate adaptive search.
If this is right
- Enables decoding of HMMs with much longer sequences than linear-time methods allow.
- Allows efficient storage of data through specific cumulative sums.
- Maintains admissibility of the estimated state sequence.
- Provides speedups in simulations compared to Viterbi and PMAP with similar precision.
Where Pith is reading between the lines
- The method's efficiency for few-state models could encourage HMM use in domains with massive datasets like genomics.
- Parallelization of the divide-and-conquer steps might yield additional speed gains not explored in the paper.
- The approximate nature suggests QATS could be tuned for different accuracy-speed tradeoffs.
Load-bearing premise
The adaptive search procedure used for approximate local maximization is sufficiently accurate that the resulting sequence remains admissible and retains the claimed precision relative to exact methods.
What would settle it
A direct comparison on a simulated long sequence where the QATS estimate is checked for admissibility against an exact decoder or where its likelihood is substantially lower.
Figures
read the original abstract
Hidden Markov models (HMMs) are characterized by an unobservable Markov chain and an observable process -- a noisy version of the hidden chain. Decoding the original signal from the noisy observations is one of the main goals in nearly all HMM based data analyses. Existing decoding algorithms such as Viterbi and the pointwise maximum a posteriori (PMAP) algorithm have computational complexity at best linear in the length of the observed sequence, and sub-quadratic in the size of the state space of the hidden chain. We present Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure with computational complexity polylogarithmic in the length of the sequence, and cubic in the size of the state space, hence particularly suited for large scale HMMs with relatively few states. It also suggests an effective way of data storage as specific cumulative sums. In essence, the estimated sequence of states sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible. The maximization is performed only approximately using an adaptive search procedure. Our simulations demonstrate the speedups offered by QATS in comparison to Viterbi and PMAP, along with a precision analysis. An implementation of QATS is in the R-package QATS on GitHub.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer decoding procedure for hidden Markov models. It claims computational complexity that is polylogarithmic in sequence length and cubic in state-space size, achieved via an approximate adaptive search that produces state sequences sequentially maximizing local likelihoods among paths with at most three segments while remaining admissible. The method also proposes cumulative-sum storage; simulations compare speed and precision against Viterbi and PMAP, and an R implementation is provided.
Significance. If the complexity bound and admissibility property under approximation can be established, the algorithm would offer a substantial practical advance for large-scale HMM decoding with small state spaces, together with an efficient storage representation. The reported simulations already indicate clear runtime gains, but the absence of the supporting derivations limits assessment of whether the central theoretical claims hold.
major comments (3)
- [Abstract] Abstract: the statement that QATS 'sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible' is made even though 'the maximization is performed only approximately using an adaptive search procedure.' No error bound, convergence rate, or proof sketch is supplied showing that the approximation preserves both the local-maximization property and admissibility; this is load-bearing for the central claim.
- [Abstract] Abstract: the claimed polylogarithmic complexity in sequence length is asserted without any derivation, recurrence relation, or proof sketch. Because the divide-and-conquer structure and the cost of the local three-segment search are the only ingredients that could produce this bound, the absence of the argument leaves the complexity claim unsupported.
- [Simulations] Simulations section (implied by abstract): the precision analysis is described only at a high level; no explicit statement is given of how admissibility or closeness to exact dynamic programming is quantified, nor are error-bar details or replication counts supplied. This weakens the empirical support for the claim that the approximate procedure retains the stated precision.
minor comments (2)
- [Abstract] The abstract mentions 'specific cumulative sums' for data storage but does not indicate the exact recurrence or memory layout used; a short explicit formula would clarify the implementation.
- The R-package name and GitHub link are given, but no statement appears on whether the released code reproduces the reported simulation figures or includes the exact parameter settings used.
Simulated Author's Rebuttal
We thank the referee for the careful review and constructive feedback. We address each major comment below and indicate planned revisions to strengthen the presentation of the theoretical claims and empirical details.
read point-by-point responses
-
Referee: [Abstract] Abstract: the statement that QATS 'sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible' is made even though 'the maximization is performed only approximately using an adaptive search procedure.' No error bound, convergence rate, or proof sketch is supplied showing that the approximation preserves both the local-maximization property and admissibility; this is load-bearing for the central claim.
Authors: We agree the abstract phrasing risks overstating the exactness of the local maximization. The adaptive ternary search is constructed to locate the admissible three-segment path with highest local likelihood, but we acknowledge that no formal argument is supplied showing the approximation preserves the claimed properties. We will revise the abstract for precision and add a short discussion of admissibility preservation in the methods. revision: yes
-
Referee: [Abstract] Abstract: the claimed polylogarithmic complexity in sequence length is asserted without any derivation, recurrence relation, or proof sketch. Because the divide-and-conquer structure and the cost of the local three-segment search are the only ingredients that could produce this bound, the absence of the argument leaves the complexity claim unsupported.
Authors: The claimed bound follows from the recursive divide-and-conquer yielding logarithmic depth with cubic (in state-space size) work per level. We recognize that an explicit recurrence or derivation is missing from the manuscript. We will insert a dedicated paragraph or short appendix deriving the complexity in the revision. revision: yes
-
Referee: [Simulations] Simulations section (implied by abstract): the precision analysis is described only at a high level; no explicit statement is given of how admissibility or closeness to exact dynamic programming is quantified, nor are error-bar details or replication counts supplied. This weakens the empirical support for the claim that the approximate procedure retains the stated precision.
Authors: We concur that the simulations section would be strengthened by explicit definitions of the precision metrics (e.g., path agreement with Viterbi), replication counts, and variability measures. We will expand the section with these details and any necessary additional figures or tables. revision: yes
Circularity Check
No circularity detected; algorithmic claims independent of inputs
full rationale
The paper presents QATS as a divide-and-conquer algorithmic construction whose polylogarithmic complexity and admissibility properties are asserted to follow from the ternary segmentation procedure and its local maximization step. No equations or definitions reduce a claimed output to a fitted parameter or self-referential input by construction. No self-citations appear as load-bearing premises for uniqueness or ansatz choices. The explicit note that maximization is approximate does not create a definitional loop; it is an implementation detail whose accuracy is addressed via simulations rather than by redefining the target quantity. The derivation chain is self-contained as a standard algorithmic proposal.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean (Jcost uniqueness), Foundation/AlexanderDuality.lean (D=3 forcing)reality_from_one_distinction; washburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
the estimated sequence of states sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible. The maximization is performed only approximately using an adaptive search procedure.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Robust inference of cooperative behaviour of multiple ion channels in voltage-clamp recordings
IDC combines idealisation, discretisation and a minimum-distance estimator on coupled Markov models to infer cooperativity from ensemble voltage-clamp currents, with an asymptotic consistency guarantee, and finds gram...
Reference graph
Works this paper leans on
-
[1]
Backurs, A. and Tzamos, C. (2017). Improving V iterbi is hard: Better runtimes imply faster clique algorithms. In Proceedings of the 34th International Conference on Machine Learning (D. Precup and Y. W. Teh, eds.), vol. 70 of Proceedings of Machine Learning Research. PMLR. ://proceedings.mlr.press/v70/backurs17a.html
work page 2017
-
[2]
Bai, J. (1997). Estimating multiple breaks one at a time. Econ. Theory 13 315--352. ://doi.org/10.1017/S0266466600005831
-
[3]
Ball, F. G. and Rice, J. A. (1992). Stochastic models for ion channels: Introduction and bibliography. Math. Biosci. 112 189--206. ://doi.org/10.1016/0025-5564(92)90023-p
-
[4]
Baum, L. E. (1972). An inequality and associated maximization technique in statistical estimation for probabilistic functions of M arkov processes. In Inequalities, III ( P roc. T hird S ympos., U niv. C alifornia, L os A ngeles, C alif., 1969; dedicated to the memory of T heordore S . M otzkin)
work page 1972
-
[5]
Baum, L. E. and Petrie, T. (1966). Statistical inference for probabilistic functions of finite state M arkov chains. Ann. Math. Statist. 37 1554--1563. ://doi.org/10.1214/aoms/1177699147
-
[6]
Baum, L. E. , Petrie, T. , Soules, G. and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of M arkov chains. Ann. Math. Statist. 41 164--171. ://doi.org/10.1214/aoms/1177697196
-
[7]
Blelloch, G. E. (1989). Scans as primitive parallel operations. IEEE Transactions on Computers 38 1526--1538. ://doi.org/10.1109/12.42122
-
[8]
Bulla, J. , Langrock, R. and Maruotti, A. (2019). Guest editor's introduction to the special issue on `` H idden M arkov models: theory and applications''. Metron 77 63--66. ://doi.org/10.1007/s40300-019-00157-2
-
[9]
Cairo, M. , Farina, G. and Rizzi, R. (2016). Decoding hidden M arkov models faster than V iterbi via online matrix-vector (max, +)-multiplication. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16). Association for the Advancement of Artificial Intelligence, The AAAI Press, Palo Alto, California. ://dl.acm.org/doi/abs/10....
-
[10]
Capp\' e , O. , Moulines, E. and Ryd\' e n, T. (2005). Inference in Hidden M arkov Models . Springer Series in Statistics, Springer, New York. ://doi.org/10.1007/0-387-28982-8
-
[11]
Carvalho, L. E. and Lawrence, C. E. (2008). Centroid estimation in discrete high-dimensional spaces with applications in biology. Proc. Natl. Acad. Sci. U.S.A. 105 3209--3214. ://doi.org/10.1073/pnas.0712329105
-
[12]
Durbin, R. , Eddy, S. R. , Krogh, A. and Mitchison, G. (1998). Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press. ://doi.org/10.1017/CBO9780511790492
-
[13]
Eddelbuettel, D. (2013). Seamless R and C++ Integration with Rcpp . Springer, New York. ISBN 978-1-4614-6867-7. ://www.doi.org/10.1007/978-1-4614-6868-4
-
[14]
Four transformations on the Catalan triangle
Eddelbuettel, D. and Balamuta, J. J. (2018). Extending R with C++ : A brief introduction to Rcpp . The American Statistician 72 28--36. ://www.doi.org/10.1080/00031305.2017.1375990
work page internal anchor Pith review Pith/arXiv arXiv doi:10.1080/00031305.2017.1375990 2018
-
[15]
Eddelbuettel, D. and Fran c ois, R. (2011). Rcpp : Seamless R and C++ integration. Journal of Statistical Software 40 1--18. ://www.doi.org/10.18637/jss.v040.i08
-
[16]
Eddelbuettel, D. and Sanderson, C. (2014). RcppArmadillo : Accelerating R with high-performance C++ linear algebra. Computational Statistics and Data Analysis 71 1054--1063. ://dx.doi.org/10.1016/j.csda.2013.02.005
-
[17]
Ephraim, Y. and Merhav, N. (2002). Hidden M arkov processes. In Special issue on S hannon theory: perspective, trends, and applications (H. J. Landau, J. E. Mazo, S. Shamai and J. Ziv, eds.), vol. 48, no. 6. Institute of Electrical and Electronics Engineers, Inc. (IEEE), Piscataway, NJ, 1518--1569. ://doi.org/10.1109/TIT.2002.1003838
-
[18]
Esposito, R. and Radicioni, D. P. (2009). CarpeDiem : O ptimizing the V iterbi algorithm and applications to supervised sequential learning. J. Mach. Learn. Res. 10 1851--1880. ://dl.acm.org/doi/10.5555/1577069.1755847
-
[19]
Fariselli, P. , Martelli, P. L. and Casadio, R. (2005). A new decoding algorithm for hidden M arkov models improves the prediction of the topology of all-beta membrane proteins. BMC Bioinformatics 6. ://doi.org/10.1186/1471-2105-6-S4-S12
-
[20]
Forney, G. D., Jr. (1973). The V iterbi algorithm. Proc. IEEE 61 268--278. ://doi.org/10.1109/PROC.1973.9030
-
[21]
Fr\" u hwirth-Schnatter, S. (2006). Finite M ixture and M arkov S witching M odels . Springer Series in Statistics, Springer, New York
work page 2006
-
[22]
Gales, M. and Young, S. (2008). The application of hidden markov models in speech recognition. Foundations and Trends in Signal Processing 1 195--304. ://dx.doi.org/10.1561/2000000004
-
[23]
Gotoh, Y. , Hochberg, M. M. and Silverman, H. F. (1998). Efficient training algorithms for HMMs using incremental estimation. IEEE Trans. Speech Audio Processing 6 539--548. ://doi.org/10.1109/89.725320
-
[24]
Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57 357--384. ://doi.org/10.2307/1912559
-
[25]
Hassan, S. S. , S\" a rkk\" a , S. and Garc\' i a-Fern\' a ndez, A. F. (2021). Temporal parallelization of inference in hidden M arkov models. IEEE Trans. Signal Process. 69 4875--4887. ://doi.org/10.1109/TSP.2021.3103338
-
[26]
Kaji, N. , Fujiwara, Y. , Yoshinaga, N. and Kitsuregawa, M. (2010). Efficient staggered decoding for sequence labeling. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Uppsala, Sweden. ://aclanthology.org/P10-1050
work page 2010
-
[27]
Karplus, K. (2009). SAM-T08, HMM-based protein structure prediction . Nucleic Acids Research 37 W492--W497. ://doi.org/10.1093/nar/gkp403
-
[28]
Kiefer, J. (1953). Sequential minimax search for a maximum. Proc. Amer. Math. Soc. 4 502--506. ://doi.org/10.2307/2032161
- [29]
-
[30]
Ladner, R. E. and Fischer, M. J. (1980). Parallel prefix computation. J. Assoc. Comput. Mach. 27 831--838. ://doi.org/10.1145/322217.322232
-
[31]
Lember, J. and Koloydenko, A. A. (2014). Bridging V iterbi and posterior decoding: A generalized risk approach to hidden path inference based on hidden M arkov models. J. Mach. Learn. Res. 15 1--58. ://dl.acm.org/doi/10.5555/2627435.2627436
-
[32]
Levin, B. and Kline, J. (1985). The cusum test of homogeneity with an application in spontaneous abortion epidemiology. Statistics in Medicine 4 469--488. ://doi.org/10.1002/sim.4780040408
-
[33]
Lifshits, Y. , Mozes, S. , Weimann, O. and Ziv-Ukelson, M. (2009). Speeding up HMM decoding and training by exploiting sequence repetitions. Algorithmica 54 379--399. ://doi.org/10.1007/s00453-007-9128-0
-
[34]
Mor, B. , Garhwal, S. and Kumar, A. (2021). A systematic review of hidden M arkov models and their applications. Arch. Comput. Methods Eng. 28 1429--1448. ://doi.org/10.1007/s11831-020-09422-4
-
[35]
Niu, Y. S. , Hao, N. and Zhang, H. (2016). Multiple change-point detection: A selective overview. Statist. Sci. 31 611--623. ://doi.org/10.1214/16-STS587
-
[36]
Olshen, A. B. , Venkatraman, E. S. , Lucito, R. and Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5 557--572. ://doi.org/10.1093/biostatistics/kxh008
-
[37]
Page, E. S. (1955). A test for a change in a parameter occurring at an unknown point. Biometrika 42 523--527. ://doi.org/10.1093/biomet/42.3-4.523
-
[38]
Pein, F. , Bartsch, A. , Steinem, C. and Munk, A. (2021). Heterogeneous idealization of ion channel recordings – open channel noise. IEEE Transactions on NanoBioscience 20 57--78
work page 2021
-
[39]
R: A Language and Environment for Statistical Computing
R Core Team (2022). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ://www.R-project.org/
work page 2022
-
[40]
Rabiner, L. R. (1989). A tutorial on hidden M arkov models and selected applications in speech recognition. Proceedings of the IEEE 77 257--286. ://doi.org/10.1109/5.18626
-
[41]
Sanderson, C. and Curtin, R. (2016). Armadillo: a template-based C ++ library for linear algebra. Journal of Open Source Software 1 26. ://doi.org/10.21105/joss.00026
-
[42]
Sanderson, C. and Curtin, R. (2018). A user-friendly hybrid sparse matrix class in C ++. In Mathematical Software -- ICMS 2018 (J. H. Davenport, M. Kauers, G. Labahn and J. Urban, eds.). Springer International Publishing, Cham. ://doi.org/10.1007/978-3-319-96418-8_50
-
[43]
Scott, A. J. and Knott, M. (1974). A cluster analysis method for grouping means in the analysis of variance. Biometrics 507--512. ://doi.org/10.2307/2529204
-
[44]
Titsias, M. K. , Holmes, C. C. and Yau, C. (2016). Statistical inference in hidden M arkov models using k -segment constraints. J. Amer. Statist. Assoc. 111 200--215. ://doi.org/10.1080/01621459.2014.998762
-
[45]
Truong, C. , Oudre, L. and Vayatis, N. (2020). Selective review of offline change point detection methods. Signal Process. 167 107299. ://doi.org/10.1016/j.sigpro.2019.107299
-
[46]
Viterbi, A. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inform. Theory 13 260--269. ://doi.org/10.1109/TIT.1967.1054010
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.