Recognition: no theorem link
Exact Regular-Constrained Variable-Order Markov Generation via Sparse Context-State Belief Propagation
Pith reviewed 2026-05-11 02:54 UTC · model grok-4.3
The pith
Variable-order Markov models can be conditioned on regular constraints exactly, via belief propagation on the sparse product of observed context states and the constraint automaton.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Replacing the first-order Markov state with the observed context state from the trained variable-order model, then taking the product with the regular-constraint automaton, produces the exact variable-order distribution conditioned on the constraints. Inference on this sparse product is linear in the horizon for fixed graphs and polynomial in the number of reachable edges in general. The same interface also supports reversible data augmentation via inverse count lookup and cleanly separates exact inference from stochastic backoff policies applied at generation time.
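To make the construction concrete, here is a minimal sketch of a single product transition, assuming contexts are stored as suffix strings and the automaton as a transition table; the function and variable names are illustrative, not the paper's API:

```python
def longest_observed_suffix(history, contexts):
    """Back off to the longest suffix of `history` present in the trained context set."""
    for k in range(len(history) + 1):
        suffix = history[k:]
        if suffix in contexts:
            return suffix
    raise ValueError("the empty context must be in the trained set")

def product_step(ctx, q, sym, contexts, dfa):
    """One edge of the sparse product: advance the constraint DFA and the context state.

    Raises KeyError if the automaton forbids `sym` in state `q`.
    """
    q_next = dfa[(q, sym)]                            # regular-constraint transition
    ctx_next = longest_observed_suffix(ctx + sym, contexts)
    return ctx_next, q_next
```

Belief propagation then runs over the graph whose nodes are the reachable (context, automaton-state) pairs, so variable-order histories that a first-order layer would merge stay distinct.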
What carries the argument
The sparse product of observed context states with the regular constraint automaton, which keeps variable-order histories distinct while enforcing the automaton constraints.
If this is right
- For any fixed trained context graph and automaton, exact conditional probabilities are obtained without expanding to all K-tuples.
- Inference cost is linear in sequence horizon when the product graph is fixed.
- Reversible data augmentation becomes possible by inverse lookup of counts on the same finite-source interface.
- Exact belief-propagation inference is cleanly separated from generation-time backoff policies such as singleton avoidance.
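The augmentation bullet can be illustrated with a small sketch: instead of storing a transposed copy of the corpus, a count query is inverse-transposed and answered from the original counts. The class and field names here are assumptions for illustration, not the paper's interface:

```python
from collections import Counter

class TransposedView:
    """Counts of a pitch-transposed corpus, served by inverse lookup on the original."""

    def __init__(self, base_counts, shift):
        self.base = base_counts   # Counter over (context_tuple, symbol) from the raw corpus
        self.shift = shift        # transposition interval, e.g. in semitones

    def count(self, context, symbol):
        # Un-transpose the query instead of materializing a transformed corpus.
        inv_context = tuple(p - self.shift for p in context)
        return self.base[(inv_context, symbol - self.shift)]

# A corpus where C4-D4 (60, 62) is followed by E4 (64) three times,
# viewed as if every piece had been transposed up a whole tone:
base = Counter({((60, 62), 64): 3})
view = TransposedView(base, shift=2)
assert view.count((62, 64), 66) == 3   # a query in the transposed key hits the raw counts
```

Because the view is a pure lookup, it is trivially reversible: dropping the view recovers the untransformed counts exactly.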
Where Pith is reading between the lines
- The same product construction could be tested on non-Markovian generators whose context sets are still finite.
- Small-scale verification against enumeration would give immediate empirical checks before scaling to music or text corpora.
- The separation of exact inference from backoff policies suggests that other stochastic policies could be substituted while keeping the conditional distribution exact.
Load-bearing premise
The sparse product of observed context states with the constraint automaton resolves the mismatch between merged first-order histories and distinct variable-order contexts to give exact conditional probabilities.
What would settle it
Compare the probabilities produced by the method on a small alphabet and short horizon against brute-force enumeration of all sequences that satisfy the same automaton; any mismatch in the conditional distribution falsifies the claim.
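That check is easy to mechanize on a toy model. The sketch below computes the ground-truth constrained distribution by brute-force enumeration, against which the product-graph BP conditionals would be compared; the backoff rule and acceptance predicate are illustrative choices:

```python
from itertools import product

def vom_prob(seq, contexts, probs):
    """Probability of `seq` under a variable-order model with longest-suffix backoff."""
    p, hist = 1.0, ""
    for sym in seq:
        # Condition on the longest observed suffix of the history.
        ctx = next(hist[k:] for k in range(len(hist) + 1) if hist[k:] in contexts)
        p *= probs[ctx].get(sym, 0.0)
        hist += sym
    return p

def constrained_distribution(n, alphabet, accepts, contexts, probs):
    """Brute-force ground truth: the VOM measure restricted to accepting length-n sequences."""
    mass = {seq: vom_prob(seq, contexts, probs)
            for seq in ("".join(t) for t in product(alphabet, repeat=n))
            if accepts(seq)}
    z = sum(mass.values())
    return {seq: m / z for seq, m in mass.items()}
```

On a two-symbol alphabet with a constraint such as "must end in b", any disagreement between these normalized masses and the BP conditional marginals falsifies the exactness claim.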
Original abstract
Variable-order Markov models generate sequences over a finite alphabet by conditioning each symbol on the longest available suffix of the generated history. Regular constraints, by contrast, describe finite-horizon control requirements by an automaton: fixed positions, forced endings, metrical patterns, and forbidden copied fragments are all special cases. Existing exact methods already handle regular constraints with belief propagation for first-order Markov chains. The contribution here is the variable-order extension: identifying the state space on which the existing BP-regular machinery must be run when the generator is a variable-order/backoff model. A first-order constraint layer can enforce useful support conditions, but it computes future mass after merging histories that a variable-order generator deliberately keeps distinct. We formalize this mismatch and give the sparse construction obtained by replacing the first-order Markov state with the observed context state, then taking the standard product with the regular constraint automaton. For a fixed trained context graph and automaton, inference is linear in the sequence horizon; in general it is polynomial in the number of reachable product edges. This gives the correct variable-order distribution conditioned on regular constraints without expanding to all K-tuples. The same finite-source interface supports reversible data augmentation by inverse count lookup, matching materialized transposition augmentation without storing transformed corpora. We also separate exact BP inference from generation-time backoff policies, such as singleton avoidance, whose stochastic semantics must be made explicit if exactness is claimed.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that variable-order Markov (VOM) models can be exactly conditioned on regular constraints by replacing the first-order Markov state with the observed context state from a trained context graph, forming a sparse product with the constraint automaton, and running standard belief propagation on the resulting graph. This computes the correct VOM conditional distribution in time linear in the horizon for a fixed model (polynomial in the number of reachable product edges in general), without expanding to all K-tuples, while keeping exact inference separate from backoff policies.
Significance. If the construction is correct, the result extends exact constrained generation techniques from first-order Markov chains to variable-order models, which is significant for applications requiring both long-range dependencies (via VOM) and finite-horizon control (via automata), such as metrical text or music generation. The explicit formalization of the history-merging mismatch and the finite-source interface for reversible augmentation are practical strengths; the work also usefully clarifies that generation-time policies like singleton avoidance must be stated separately from the exact BP step.
major comments (2)
- [Abstract / construction] Abstract and construction section: the central claim of exactness requires that the product marginals equal the original VOM marginals restricted to accepting paths. The abstract asserts this follows from the sparse context-state product but supplies no explicit small-case derivation, invariant, or probability-assignment rule showing that (i) context transitions select the longest suffix, (ii) emissions are taken verbatim from the context node, and (iii) the automaton states do not merge VOM-distinct contexts. This verification is load-bearing for the exactness guarantee.
- [Mismatch formalization] Mismatch formalization: the paper correctly identifies that a first-order constraint layer merges histories a VOM keeps distinct, but the sparse replacement must be shown to preserve the conditional probabilities exactly; without an equation or lemma establishing the equivalence of the product measure to the constrained VOM measure, the claim that ordinary BP on the product yields the desired distribution remains unverified.
minor comments (2)
- [Abstract] The statement that inference is 'linear in the sequence horizon' for fixed models should be accompanied by a precise complexity statement in terms of the number of reachable product edges and the cost of BP message passing.
- [Augmentation interface] The reversible data-augmentation interface via inverse count lookup is mentioned but would benefit from a short algorithmic sketch or pseudocode to clarify how it matches materialized transposition without storing transformed corpora.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and detailed review of our manuscript. The comments correctly identify areas where additional formal verification would strengthen the presentation of the exactness claim. We address each major comment below and plan to incorporate revisions to clarify these points.
Point-by-point responses
- Referee: [Abstract / construction] Abstract and construction section: the central claim of exactness requires that the product marginals equal the original VOM marginals restricted to accepting paths. The abstract asserts this follows from the sparse context-state product but supplies no explicit small-case derivation, invariant, or probability-assignment rule showing that (i) context transitions select the longest suffix, (ii) emissions are taken verbatim from the context node, and (iii) the automaton states do not merge VOM-distinct contexts. This verification is load-bearing for the exactness guarantee.
  Authors: We agree that an explicit small-case derivation or invariant would make the exactness claim more transparent. In the revised manuscript, we will add a dedicated subsection or lemma in the construction section that provides a small example derivation, detailing the probability assignment rules for context transitions (longest-suffix selection), verbatim emissions from context nodes, and confirmation that automaton states preserve VOM-distinct contexts. This will verify that the product marginals equal the original VOM marginals on accepting paths.
  revision: yes
- Referee: [Mismatch formalization] Mismatch formalization: the paper correctly identifies that a first-order constraint layer merges histories a VOM keeps distinct, but the sparse replacement must be shown to preserve the conditional probabilities exactly; without an equation or lemma establishing the equivalence of the product measure to the constrained VOM measure, the claim that ordinary BP on the product yields the desired distribution remains unverified.
  Authors: We acknowledge that while the mismatch is formalized in the paper, an explicit equation or lemma equating the product measure to the constrained VOM measure would provide stronger verification. We will introduce such a lemma in the revised version, showing that the sparse context-state product preserves the conditional probabilities exactly, thereby confirming that standard belief propagation computes the correct distribution.
  revision: yes
Circularity Check
No significant circularity; derivation relies on external inputs and standard constructions
Full rationale
The paper treats the trained context graph as a fixed external input and defines its sparse product construction explicitly by replacing the first-order state with the observed context state before applying the standard automaton product and belief propagation. This is a direct algorithmic definition rather than a reduction to fitted parameters or self-referential claims. No load-bearing self-citations, uniqueness theorems from prior author work, or ansatzes smuggled via citation appear in the provided text. The claim of yielding the 'correct' conditional distribution follows from the formalization of the history-merging mismatch and the replacement rule; the construction is self-contained over independent components (trained graph plus automaton) rather than circular with respect to its inputs. The separation of exact BP from backoff policies further keeps the core inference independent.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Belief propagation computes exact marginals on the product graph when the underlying structure permits it
- domain assumption The variable-order model is fully captured by a fixed context graph provided as input
invented entities (1)
- Sparse context-state product graph (no independent evidence)
Reference graph
Works this paper leans on
- [1] Cyril Allauzen, Michael Riley, Johan Schalkwyk, Wojciech Skut, and Mehryar Mohri. OpenFst: A General and Efficient Weighted Finite-State Transducer Library. In Proceedings of the 12th International Conference on Implementation and Application of Automata, volume 4783 of Lecture Notes in Computer Science, pages 11–23. Springer, 2007.
- [2] Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, 2008.
- [3] Peter Bühlmann and Abraham J. Wyner. Variable Length Markov Chains. The Annals of Statistics, 27(2):480–513, 1999.
- [4] Chris Hokamp and Qun Liu. Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pages 1535–1546, 2017.
- [5]
- [6] Mehryar Mohri, Fernando Pereira, and Michael Riley. Weighted Finite-State Transducers in Speech Recognition. Computer Speech & Language, 16(1):69–88, 2002.
- [7] François Pachet. The Continuator: Musical interaction with style. Journal of New Music Research, 32(3):333–341, 2003.
- [8] François Pachet. fpachet/continuator: A constrainable Continuator. https://github.com/fpachet/continuator, 2026. GitHub repository.
- [9] François Pachet. vo-regular-bp: Variable-order regular belief propagation. https://github.com/fpachet/vo-regular-bp, 2026. GitHub repository, version v0.1.0.
- [10] François Pachet and Pierre Roy. Markov constraints: steerable generation of Markov sequences. Constraints, 16(2):148–172, 2011.
- [11] François Pachet, Pierre Roy, and Gabriele Barbieri. Finite-Length Markov Processes with Constraints. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, pages 635–642, 2011.
- [12] Alexandre Papadopoulos, François Pachet, Pierre Roy, and Jason Sakellariou. Exact Sampling for Regular and Markov Constraints with Belief Propagation. In Principles and Practice of Constraint Programming, volume 9255 of Lecture Notes in Computer Science, pages 341–350. Springer, 2015.
- [13] Alexandre Papadopoulos, Pierre Roy, and François Pachet. Avoiding Plagiarism in Markov Sequence Generation. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014.
- [14] Marcus Pearce and Geraint Wiggins. Improved Methods for Statistical Modelling of Monophonic Music. Journal of New Music Research, 33(4):367–385, 2004.
- [15] Gilles Pesant. A Regular Language Membership Constraint for Finite Sequences of Variables. In Principles and Practice of Constraint Programming, volume 3258 of Lecture Notes in Computer Science, pages 482–495. Springer, 2004.
- [16] Matt Post and David Vilar. Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, pages 1314–1324, 2018.
- [17] Jorma Rissanen. A Universal Data Compression System. IEEE Transactions on Information Theory, 29(5):656–664, 1983.
- [18] Dana Ron, Yoram Singer, and Naftali Tishby. The Power of Amnesia: Learning Probabilistic Automata with Variable Memory Length. Machine Learning, 25(2):117–149, 1996.
- [19] Victoria Rowe, Angeliki Triantafyllaki, and François Pachet. Children's Creative Music-Making with Reflexive Interactive Technology: Adventures in Improvising and Composing. Routledge, London and New York, 2017.
- [20] Pierre Roy and François Pachet. Enforcing Meter in Finite-Length Markov Sequences. In Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013.
- [21] Jason Sakellariou, Francesca Tria, Vittorio Loreto, and François Pachet. Maximum Entropy Models Capture Melodic Styles. Scientific Reports, 7(1):9172, 2017.
- [22] Frans M. J. Willems, Yuri M. Shtarkov, and Tjalling J. Tjalkens. The Context-Tree Weighting Method: Basic Properties. IEEE Transactions on Information Theory, 41(3):653–664, 1995.
discussion (0)