Large deviations for possibly reducible Markov chains on discrete state spaces
Pith reviewed 2026-05-19 04:57 UTC · model grok-4.3
The pith
Markov chains on discrete spaces satisfy large deviation principles even when reducible, with possibly nonconvex rate functions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that both the level-2 large deviation principle for the empirical occupation measure and the level-3 principle for the empirical process hold for any Markov chain on a discrete state space. The associated rate functions are lower-semicontinuous and coincide with the Donsker-Varadhan entropy precisely on the set of invariant measures, but may be strictly larger and nonconvex elsewhere. These statements follow from subadditive inequalities applied to the logarithms of the probabilities of cylinder sets generated by the chain.
What carries the argument
Subadditive arguments applied to the sequence of log-probabilities of the empirical occupation measures and processes.
Load-bearing premise
The subadditive ergodic theorem applies to the relevant sequences built from the empirical measures even when the chain is reducible.
What would settle it
A concrete finite-state reducible Markov chain in which the observed decay rate of the probability of a particular empirical measure falls outside the interval predicted by the claimed rate function.
Figures
read the original abstract
We study the large deviations of Markov chains under the sole assumption that the state space is discrete. In particular, we do not require any of the usual irreducibility and exponential tightness assumptions. Using subadditive arguments, we provide an elementary and self-contained proof of the level-2 and level-3 large deviation principles. Due to the possible reducibility of the Markov chain, the rate functions may be nonconvex and may differ, outside a specific set, from the Donsker-Varadhan entropy and other classical rate functions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript establishes level-2 and level-3 large deviation principles for Markov chains on discrete (possibly countably infinite) state spaces under the sole assumption that the transition kernel is a Markov kernel, without requiring irreducibility or exponential tightness. It employs subadditive arguments to obtain an elementary, self-contained proof. The resulting rate functions are permitted to be nonconvex and are shown to coincide with the Donsker-Varadhan entropy only on a specific subset of the space of measures.
Significance. If the arguments close, the work meaningfully relaxes standard hypotheses in large-deviation theory for Markov processes and supplies an accessible proof technique that applies directly to reducible chains. The explicit treatment of nonconvexity and the departure from classical rate functions outside a distinguished set constitute a genuine contribution. The self-contained character of the derivation is a clear strength.
major comments (2)
- [§3] §3 (upper bound for level-2 LDP): the subadditive control on limsup (1/n) log P(μ_n ∈ F) for closed F in the weak topology is stated without a separate argument that rules out mass escape to infinity on countable spaces; for reducible chains containing transient states, this leaves open whether the bound holds uniformly or requires an auxiliary tightness estimate that the paper claims to avoid.
- [Theorem 4.3] Theorem 4.3 (rate-function identification): the assertion that the constructed rate function equals the Donsker-Varadhan functional only on a specific set is load-bearing for the novelty claim, yet the proof sketch does not exhibit an explicit example of a reducible chain where the two functionals differ on a positive-measure set, making verification of the identification step difficult.
minor comments (2)
- [§2.1] The notation for the empirical occupation measure in the reducible case could be made more explicit by distinguishing the absorbing and transient components.
- [§3.1] A short remark on the applicability of the subadditive ergodic theorem to the sequence of log-probabilities for non-irreducible kernels would improve readability.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below with clarifications and indicate revisions where they will strengthen the exposition without altering the core arguments.
read point-by-point responses
-
Referee: [§3] §3 (upper bound for level-2 LDP): the subadditive control on limsup (1/n) log P(μ_n ∈ F) for closed F in the weak topology is stated without a separate argument that rules out mass escape to infinity on countable spaces; for reducible chains containing transient states, this leaves open whether the bound holds uniformly or requires an auxiliary tightness estimate that the paper claims to avoid.
Authors: We appreciate the referee highlighting this potential gap in the presentation. The subadditive upper bound in §3 is obtained directly from the Markov property applied to cylinder sets and the fact that the empirical measure is always a probability measure on the discrete space; any putative mass escape would violate the normalization of μ_n, which is enforced by construction. Consequently, the limsup bound holds for every closed F in the weak topology without a separate tightness step. Nevertheless, to make this reasoning fully explicit for readers concerned with transient states, we will insert a short clarifying paragraph in the revised §3. This constitutes a partial revision. revision: partial
-
Referee: [Theorem 4.3] Theorem 4.3 (rate-function identification): the assertion that the constructed rate function equals the Donsker-Varadhan functional only on a specific set is load-bearing for the novelty claim, yet the proof sketch does not exhibit an explicit example of a reducible chain where the two functionals differ on a positive-measure set, making verification of the identification step difficult.
Authors: We agree that an explicit example would facilitate verification of the identification result. Theorem 4.3 establishes equality between the subadditive rate function and the Donsker-Varadhan functional precisely on the set of measures whose support lies in recurrent classes (or, more generally, on measures invariant under the transition kernel restricted to communicating classes). Outside this set the functionals differ because the subadditive construction captures the cost of visiting transient states. In the revision we will add a concrete example: a three-state chain with two absorbing states and one transient state. For this chain the rate functions differ on any measure that places positive mass on the transient state, which has positive Lebesgue measure in the simplex. This addition will be placed immediately after the statement of Theorem 4.3. revision: yes
Circularity Check
No circularity; self-contained subadditive proof
full rationale
The paper explicitly presents its level-2/3 LDP proofs as elementary and self-contained, relying on subadditive arguments applied directly to the empirical measures of a Markov chain on a discrete space without invoking irreducibility or exponential tightness. No load-bearing self-citations, fitted parameters renamed as predictions, or reductions of the rate function to its own inputs appear in the derivation chain. The rate functions are constructed via the subadditive limit and shown to satisfy the LDP properties, with the nonconvexity and deviation from Donsker-Varadhan entropy arising as a consequence rather than an input. This is the standard case of an independent proof on a countable space.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Subadditive ergodic theorem applies to the sequence of empirical measures generated by the Markov chain
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean (J-cost uniqueness, Aczél classification)washburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We study the large deviations of Markov chains under the sole assumption that the state space is discrete... Using subadditive arguments, we provide an elementary and self-contained proof of the level-2 and level-3 large deviation principles. Due to the possible reducibility... the rate functions may be nonconvex...
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Large deviations for non-irreducible Markov chains on Euclidean spaces
The weak large deviations principle holds for empirical measures of non-irreducible Markov chains on R^d with arbitrary initial measure, proved via subadditivity, with the rate function not necessarily convex.
Reference graph
Works this paper leans on
-
[1]
Azencott, R. (1980). Grandes d´ eviations et applications. In Eighth Saint Flour Probability Summer School—1978 (Saint Flour, 1978) , Volume 774 of Lecture Notes in Math. , pp. 1–176. Springer, Berlin
work page 1980
-
[2]
Bahadur, R. R. and S. L. Zabell (1979). Large deviations of the sample mean in general vector spaces. Ann. Probab. 7 (4), 587–621
work page 1979
-
[3]
Baxter, J. R., N. C. Jain, and S. R. S. Varadhan (1991). Some familiar examples for which the large deviation principle does not hold. Comm. Pure Appl. Math. 44 (8-9), 911–923
work page 1991
-
[4]
Bourbaki, N. (1971). ´El´ ements de math´ ematique. Topologie g´ en´ erale. Chapitres 1 ` a 4. Hermann, Paris
work page 1971
-
[5]
Bryc, W. and A. Dembo (1996). Large deviations and strong mixing. Ann. Inst. H. Poincar´ e Probab. Statist. 32 (4), 549–569
work page 1996
-
[6]
Cuneo, N., V. Jakˇ si´ c, C.-A. Pillet, and A. Shirikyan (2019). Large deviations and fluctuation theorem for selectively decoupled measures on shift spaces. Rev. Math. Phys. 31 (10), 1950036, 54
work page 2019
-
[7]
Dawson, D. A. and J. G¨ artner (1987). Large deviations from the McKean-Vlasov limit for weakly interacting diffusions. Stochastics 20 (4), 247–308
work page 1987
-
[8]
de Acosta, A. (1990). Large deviations for empirical measures of Markov chains. J. Theoret. Probab. 3(3), 395–431
work page 1990
-
[9]
de Acosta, A. and P. Ney (1998). Large deviation lower bounds for arbitrary additive functionals of a Markov chain. Ann. Probab. 26 (4), 1660–1682
work page 1998
-
[10]
de Acosta, A. D. (2022). Large deviations for Markov chains , Volume 229 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge
work page 2022
-
[11]
Dembo, A. and O. Zeitouni (2010). Large deviations techniques and applications , Volume 38 of Stochastic Modelling and Applied Probability . Springer-Verlag, Berlin. Corrected reprint of the second (1998) edition
work page 2010
-
[12]
den Hollander, F. (2000). Large deviations, Volume 14 of Fields Institute Monographs . American Mathematical Society, Providence, RI
work page 2000
-
[13]
Deuschel, J.-D. and D. W. Stroock (1989). Large deviations, Volume 137 of Pure and Applied Mathematics. Academic Press, Inc., Boston, MA
work page 1989
-
[14]
Dinwoodie, I. H. (1993). Identifying a large deviation rate function. Ann. Probab. 21 (1), 216–231
work page 1993
-
[15]
Dinwoodie, I. H. and P. Ney (1995). Occupation measures for Markov chains. J. Theoret. Probab. 8(3), 679–691
work page 1995
-
[16]
Donsker, M. D. and S. R. S. Varadhan (1975a). Asymptotic evaluation of certain Markov process expectations for large time. I. Comm. Pure Appl. Math. 28 , 1–47
-
[17]
Donsker, M. D. and S. R. S. Varadhan (1975b). Asymptotic evaluation of certain Markov process expectations for large time. II. Comm. Pure Appl. Math. 28 , 279–301
-
[18]
Donsker, M. D. and S. R. S. Varadhan (1976). Asymptotic evaluation of certain Markov process expectations for large time. III. Comm. Pure Appl. Math. 29 (4), 389–461
work page 1976
-
[19]
Donsker, M. D. and S. R. S. Varadhan (1983). Asymptotic evaluation of certain Markov process expectations for large time. IV. Comm. Pure Appl. Math. 36 (2), 183–212. 59
work page 1983
-
[20]
Ellis, R. S. (2006). Entropy, large deviations, and statistical mechanics . Classics in Mathematics. Springer-Verlag, Berlin. Reprint of the 1985 original
work page 2006
-
[21]
Fayolle, G. and A. de La Fortelle (2002). Entropy and the principle of large deviations for discrete-time Markov chains. Problemy Peredachi Informatsii 38 (4), 121–135
work page 2002
-
[22]
Jiang, Y. W. and L. M. Wu (2005). Large deviations for empirical measures of not necessarily irreducible countable Markov chains with arbitrary initial measures. Acta Math. Sin. (Engl. Ser.) 21 (6), 1377–1390
work page 2005
-
[23]
Lanford, O. E. (1973). Entropy and equilibrium states in classical statistical mechanics. pp. 1–113
work page 1973
- [24]
-
[25]
Parthasarathy, K. R. (2005). Probability measures on metric spaces / K. R. Parthasarathy . Providence (R.I.): AMS Chelsea Publishing, American Mathematical Society
work page 2005
-
[26]
Pfister, C.-E. (2002). Thermodynamical aspects of classical lattice systems. In In and out of equilibrium (Mambucaba, 2000), Volume 51 of Progr. Probab., pp. 393–472. Birkh¨ auser Boston, Boston, MA
work page 2002
-
[27]
Pinsker, M. S. (1964). Information and information stability of random variables and processes. Holden-Day, Inc., San Francisco, Calif.-London-Amsterdam
work page 1964
-
[28]
Rassoul-Agha, F. and T. Sepp¨ al¨ ainen (2015).A course on large deviations with an introduction to Gibbs measures, Volume 162 of Graduate Studies in Mathematics . American Mathematical Society, Providence, RI
work page 2015
-
[29]
Ruelle, D. (1965). Correlation functionals. J. Mathematical Phys. 6 , 201–220
work page 1965
-
[30]
Wu, L. (2000). Uniformly integrable operators and large deviations for Markov processes. J. Funct. Anal. 172 (2), 301–376
work page 2000
-
[31]
(2002).Convex analysis in general vector spaces
Z˘ alinescu, C. (2002).Convex analysis in general vector spaces . World Scientific Publishing Co., Inc., River Edge, NJ. 60
work page 2002
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.