#CFG and #DNNF admit FPRAS

Alexis de Colnet; Kuldeep S. Meel

arxiv: 2406.18224 · v3 · pith:U5BGFYHGnew · submitted 2024-06-26 · 💻 cs.DS

#CFG and #DNNF admit FPRAS

Kuldeep S. Meel , Alexis de Colnet This is my paper

Pith reviewed 2026-05-23 23:54 UTC · model grok-4.3

classification 💻 cs.DS

keywords context-free grammarDNNFcounting problemsapproximation schemerandomized algorithmpolynomial timemodel countingword counting

0 comments

The pith

Context-free grammars and DNNF circuits admit fully polynomial-time randomized approximation schemes for counting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper gives the first fully polynomial-time randomized approximation scheme for counting the number of words of exact length n generated by a context-free grammar. It gives the same kind of scheme for counting the satisfying assignments of a DNNF circuit. These counting tasks previously had only quasi-polynomial time solutions or worked only for restricted cases such as automata. A reader would care because the new schemes make approximation possible in time polynomial in the input size and the desired accuracy.

Core claim

We provide the first fully polynomial-time randomized approximation scheme for counting the number of words of length exactly n generated by a CFG and the number of assignments satisfying a DNNF circuit. Finding polynomial time algorithms for the aforementioned problems has been a longstanding open problem. Prior work could either only obtain a quasi-polynomial runtime or a polynomial-time randomized approximation scheme for restricted fragments such as non-deterministic finite automata or non-deterministic tree automata.

What carries the argument

A fully polynomial-time randomized approximation scheme that estimates the count to within a (1+epsilon) factor with high probability in time polynomial in the input size, 1/epsilon, and log(1/delta).

Load-bearing premise

The grammar or DNNF is presented in a standard encoding whose size is polynomial in the parameters that appear in the running-time bound of the claimed approximation scheme.

What would settle it

A concrete CFG or DNNF instance on which the approximation procedure fails to stay within a (1+epsilon) multiplicative error of the true count with the stated success probability while running in the claimed polynomial time bound.

Figures

Figures reproduced from arXiv: 2406.18224 by Alexis de Colnet, Kuldeep S. Meel.

**Figure 2.** Figure 2: On the left: the derivation tree T for x1x4x7x8 ∈ supp(q), with q the root node. On the right: the derivation tree T ′ for x1x3x7x8 ∈ supp(q). The circled nodes form lcsn(T, T ′ ). any βqˆ ∈ supp(ˆq), we can replace αq,i ˆ by βqˆ in αi and we still have a monomial in supp(q). This replacement can be done for more than one ˆq at a time. So each αi is used to create monomials of the form α ′ i Q qˆ∈τ βqˆ (αi… view at source ↗

read the original abstract

We provide the first fully polynomial-time randomized approximation scheme for the following two counting problems: 1. Given a Context Free Grammar $G$ over alphabet $\Sigma$, count the number of words of length exactly $n$ generated by $G$. 2. Given a circuit $\varphi$ in Decomposable Negation Normal Form (DNNF) over the set of Boolean variables $X$, compute the number of assignments to $X$ such that $\varphi$ evaluates to 1. Finding polynomial time algorithms for the aforementioned problems has been a longstanding open problem. Prior work could either only obtain a quasi-polynomial runtime (SODA 1995) or a polynomial-time randomized approximation scheme for restricted fragments, such as non-deterministic finite automata (JACM 2021) or non-deterministic tree automata (STOC 2021).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives the first FPRAS for exact-n word counting under CFGs and model counting under DNNFs, moving past the 1995 quasi-polynomial barrier.

read the letter

The central result is an FPRAS for two problems that had resisted fully polynomial randomized approximation: counting words of length exactly n generated by an arbitrary CFG, and counting satisfying assignments to an arbitrary DNNF. The abstract positions this as the first such scheme without the syntactic restrictions used in the 2021 automata papers, and it improves on the SODA 1995 quasi-polynomial bound. That is the main thing to take away. The work is technically grounded in the sense that it directly targets the open cases rather than carving out easier fragments. The statement of the problems is clean and the contrast with prior results is explicit. If the algorithm and its analysis hold, the result organizes a slice of the approximation landscape for formal-language and knowledge-compilation counting. The soft spot worth checking is the precise dependence on input size. An FPRAS must be polynomial in the total size of the grammar or circuit description plus n and 1/ε. The abstract does not spell out the encoding convention, so the full paper needs to confirm that the running-time bound uses the standard measure (number of productions, nonterminals, and symbols for the grammar; number of gates and variables for the DNNF) and does not hide an exponential factor through a non-standard representation. That is a standard point to verify rather than a fatal gap. No circular definitions or invented quantities appear in the claim itself. The paper is for people working on counting problems in automata, circuits, and knowledge compilation. A reader who follows approximation algorithms for #P-complete tasks or who needs practical counting routines for these representations will get direct value. It deserves a serious referee because the claim is sharp, the prior gap is real, and the result would be cited if the proof checks out. Recommendation: send to peer review.

Referee Report

2 major / 2 minor

Summary. The manuscript claims to provide the first fully polynomial-time randomized approximation scheme (FPRAS) for two problems: (1) given a context-free grammar G over alphabet Σ, approximate the number of words of length exactly n generated by G; (2) given a circuit φ in DNNF over Boolean variables X, approximate the number of satisfying assignments. It positions this as resolving a longstanding open problem, improving on quasi-polynomial runtimes (SODA 1995) and FPRAS results restricted to NFAs (JACM 2021) or tree automata (STOC 2021).

Significance. If the algorithmic constructions and error analyses hold with the stated guarantees, the result would be a notable advance in approximate counting, supplying the first FPRAS for #CFG and #DNNF and closing the gap from quasi-polynomial or restricted cases. The manuscript would earn credit for delivering an explicit algorithmic scheme rather than a non-constructive existence argument.

major comments (2)

[Main theorems / runtime analysis] The main theorem statements (presumably in §3 and §5) and runtime analysis must explicitly define the input-size measure for G and φ (e.g., number of productions/gates, total bit length under a standard encoding). Without this, it is impossible to verify that the claimed poly(n, 1/ε, |G|) or poly(n, 1/ε, |φ|) bound is fully polynomial rather than hiding super-polynomial dependence on a non-standard encoding of the grammar or circuit.
[Algorithm description / proof of runtime] The reduction or dynamic-programming construction used to obtain the FPRAS (likely in the technical sections deriving the sampling or estimation procedure) must be shown to run in time polynomial in the explicit size of the input grammar/circuit; any auxiliary data structures whose size is exponential in the number of non-terminals or variables would invalidate the FPRAS claim.

minor comments (2)

[Theorem statements] Clarify the precise approximation guarantee (additive vs. multiplicative, relative error) and the failure probability in the theorem statements.
[Introduction] Add a short comparison table or paragraph contrasting the new runtime with the SODA 1995 quasi-polynomial bound and the restricted automata results.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and for highlighting the need for explicit input-size definitions and runtime bounds. We address the two major comments below and will incorporate clarifications into the revised manuscript.

read point-by-point responses

Referee: [Main theorems / runtime analysis] The main theorem statements (presumably in §3 and §5) and runtime analysis must explicitly define the input-size measure for G and φ (e.g., number of productions/gates, total bit length under a standard encoding). Without this, it is impossible to verify that the claimed poly(n, 1/ε, |G|) or poly(n, 1/ε, |φ|) bound is fully polynomial rather than hiding super-polynomial dependence on a non-standard encoding of the grammar or circuit.

Authors: We agree that explicit definitions are required for verifiability. The manuscript already uses |G| to denote the total bit length of a standard encoding of the grammar (summed lengths of all productions, including nonterminal and terminal symbols) and |φ| to denote the number of gates in the DNNF. The claimed runtimes are polynomial in these quantities, n, and 1/ε. In the revision we will add a short paragraph in the Preliminaries explicitly stating these measures and confirming that all algorithms and analyses are polynomial in them. revision: yes
Referee: [Algorithm description / proof of runtime] The reduction or dynamic-programming construction used to obtain the FPRAS (likely in the technical sections deriving the sampling or estimation procedure) must be shown to run in time polynomial in the explicit size of the input grammar/circuit; any auxiliary data structures whose size is exponential in the number of non-terminals or variables would invalidate the FPRAS claim.

Authors: The dynamic-programming tables and sampling procedures are sized polynomially in |G| (or |φ|) and n; the number of nonterminals is at most |G|, and the DNNF gate count directly bounds the state space. No auxiliary structures exponential in the number of nonterminals or variables appear in the constructions. We will expand the runtime lemmas in Sections 4 and 6 to include an explicit accounting of every data structure in terms of |G| and |φ|. revision: yes

Circularity Check

0 steps flagged

No circularity; algorithmic FPRAS construction is independent

full rationale

The paper presents a new FPRAS algorithm for exact-length word counting in CFGs and model counting in DNNFs. No equations, parameters, or derivations are provided in the abstract or context that reduce by construction to fitted inputs or self-definitions. Cited prior work (SODA 1995, JACM 2021, STOC 2021) is external and does not overlap with the authors. The result is framed as an algorithmic construction rather than a renaming or self-referential reduction, satisfying the default expectation of no circularity for such papers.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, invented entities, or non-standard axioms are visible. Standard randomized-algorithm assumptions (e.g., access to unbiased random bits, polynomial-time membership oracles) are implicitly required but not enumerated.

axioms (1)

domain assumption Existence of a polynomial-time procedure that, given the grammar or DNNF, can be used inside a Monte-Carlo sampling loop whose acceptance probability yields an (1+eps)-approximation.
Typical background assumption for any FPRAS claim; location not specified because only abstract is available.

pith-pipeline@v0.9.0 · 5673 in / 1251 out tokens · 18958 ms · 2026-05-23T23:54:30.349838+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

[1]

\# NFA admits an FPRAS: efficient enumeration, counting, and uniform generation for logspace classes

Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. \# NFA admits an FPRAS: efficient enumeration, counting, and uniform generation for logspace classes. J. ACM , 68(6):48:1--48:40, 2021

work page 2021
[2]

When is approximate counting for conjunctive queries tractable? In 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages 1015--1027

Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. When is approximate counting for conjunctive queries tractable? In 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages 1015--1027. ACM , 2021

work page 2021
[3]

Compilers principles, techniques & tools

V Aho Alfred, S Lam Monica, and D Ullman Jeffrey. Compilers principles, techniques & tools . pearson Education, 2007

work page 2007
[4]

On probabilistic inference by weighted model counting

Mark Chavira and Adnan Darwiche. On probabilistic inference by weighted model counting. Artificial Intelligence , 172(6-7):772--799, 2008

work page 2008
[5]

Decomposable negation normal form

Adnan Darwiche. Decomposable negation normal form. J. ACM , 48(4):608--647, 2001

work page 2001
[6]

Sweedyk, and Stephen R

Vivek Gore, Mark Jerrum, Sampath Kannan, Z. Sweedyk, and Stephen R. Mahaney. A quasi-polynomial-time algorithm for sampling words from a context-free language. Inf. Comput. , 134(1):59--74, 1997

work page 1997
[7]

Grammar-based whitebox fuzzing

Patrice Godefroid, Adam Kiezun, and Michael Y Levin. Grammar-based whitebox fuzzing. In Proceedings of the 29th ACM SIGPLAN conference on programming language design and implementation , pages 206--215, 2008

work page 2008
[8]

Knowledge compilation meets database theory: compiling queries to decision diagrams

Abhay Jha and Dan Suciu. Knowledge compilation meets database theory: compiling queries to decision diagrams. In Proceedings of the 14th International Conference on Database Theory , pages 162--173, 2011

work page 2011
[9]

Sweedyk, and Stephen R

Sampath Kannan, Z. Sweedyk, and Stephen R. Mahaney. Counting and random generation of strings in regular languages. In Kenneth L. Clarkson, editor, Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, 22-24 January 1995. San Francisco, California, USA , pages 551--557. ACM/SIAM , 1995

work page 1995
[10]

Meel, Sourav Chakraborty, and Umang Mathur

Kuldeep S. Meel, Sourav Chakraborty, and Umang Mathur. A faster FPRAS for \# NFA . Proc. ACM Manag. Data , 2(2), may 2024

work page 2024
[11]

Estimating the size of union of sets in streaming models

Kuldeep S Meel, NV Vinodchandran, and Sourav Chakraborty. Estimating the size of union of sets in streaming models. In Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems , pages 126--137, 2021

work page 2021
[12]

Pruning conformant plans by counting models on compiled d-dnnf representations

H \'e ctor Palacios, Blai Bonet, Adnan Darwiche, and Hector Geffner. Pruning conformant plans by counting models on compiled d-dnnf representations. In ICAPS , volume 5, pages 141--150, 2005

work page 2005
[13]

Knowledge compilation meets uniform sampling

Shubham Sharma, Rahul Gupta, Subhajit Roy, and Kuldeep S Meel. Knowledge compilation meets uniform sampling. In LPAR , pages 620--636, 2018

work page 2018
[14]

Valiant, Sven Skyum, S

Leslie G. Valiant, Sven Skyum, S. Berkowitz, and Charles Rackoff. Fast parallel computation of polynomials using few processors. SIAM J. Comput. , 12(4):641--644, 1983

work page 1983
[15]

The Fuzzing Book

Andreas Zeller, Rahul Gopinath, Marcel B \"o hme, Gordon Fraser, and Christian Holler. The Fuzzing Book . CISPA Helmholtz Center for Information Security, 2024. Retrieved 2024-07-01 16:50:18+02:00

work page 2024

[1] [1]

\# NFA admits an FPRAS: efficient enumeration, counting, and uniform generation for logspace classes

Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. \# NFA admits an FPRAS: efficient enumeration, counting, and uniform generation for logspace classes. J. ACM , 68(6):48:1--48:40, 2021

work page 2021

[2] [2]

When is approximate counting for conjunctive queries tractable? In 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages 1015--1027

Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. When is approximate counting for conjunctive queries tractable? In 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages 1015--1027. ACM , 2021

work page 2021

[3] [3]

Compilers principles, techniques & tools

V Aho Alfred, S Lam Monica, and D Ullman Jeffrey. Compilers principles, techniques & tools . pearson Education, 2007

work page 2007

[4] [4]

On probabilistic inference by weighted model counting

Mark Chavira and Adnan Darwiche. On probabilistic inference by weighted model counting. Artificial Intelligence , 172(6-7):772--799, 2008

work page 2008

[5] [5]

Decomposable negation normal form

Adnan Darwiche. Decomposable negation normal form. J. ACM , 48(4):608--647, 2001

work page 2001

[6] [6]

Sweedyk, and Stephen R

Vivek Gore, Mark Jerrum, Sampath Kannan, Z. Sweedyk, and Stephen R. Mahaney. A quasi-polynomial-time algorithm for sampling words from a context-free language. Inf. Comput. , 134(1):59--74, 1997

work page 1997

[7] [7]

Grammar-based whitebox fuzzing

Patrice Godefroid, Adam Kiezun, and Michael Y Levin. Grammar-based whitebox fuzzing. In Proceedings of the 29th ACM SIGPLAN conference on programming language design and implementation , pages 206--215, 2008

work page 2008

[8] [8]

Knowledge compilation meets database theory: compiling queries to decision diagrams

Abhay Jha and Dan Suciu. Knowledge compilation meets database theory: compiling queries to decision diagrams. In Proceedings of the 14th International Conference on Database Theory , pages 162--173, 2011

work page 2011

[9] [9]

Sweedyk, and Stephen R

Sampath Kannan, Z. Sweedyk, and Stephen R. Mahaney. Counting and random generation of strings in regular languages. In Kenneth L. Clarkson, editor, Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, 22-24 January 1995. San Francisco, California, USA , pages 551--557. ACM/SIAM , 1995

work page 1995

[10] [10]

Meel, Sourav Chakraborty, and Umang Mathur

Kuldeep S. Meel, Sourav Chakraborty, and Umang Mathur. A faster FPRAS for \# NFA . Proc. ACM Manag. Data , 2(2), may 2024

work page 2024

[11] [11]

Estimating the size of union of sets in streaming models

Kuldeep S Meel, NV Vinodchandran, and Sourav Chakraborty. Estimating the size of union of sets in streaming models. In Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems , pages 126--137, 2021

work page 2021

[12] [12]

Pruning conformant plans by counting models on compiled d-dnnf representations

H \'e ctor Palacios, Blai Bonet, Adnan Darwiche, and Hector Geffner. Pruning conformant plans by counting models on compiled d-dnnf representations. In ICAPS , volume 5, pages 141--150, 2005

work page 2005

[13] [13]

Knowledge compilation meets uniform sampling

Shubham Sharma, Rahul Gupta, Subhajit Roy, and Kuldeep S Meel. Knowledge compilation meets uniform sampling. In LPAR , pages 620--636, 2018

work page 2018

[14] [14]

Valiant, Sven Skyum, S

Leslie G. Valiant, Sven Skyum, S. Berkowitz, and Charles Rackoff. Fast parallel computation of polynomials using few processors. SIAM J. Comput. , 12(4):641--644, 1983

work page 1983

[15] [15]

The Fuzzing Book

Andreas Zeller, Rahul Gopinath, Marcel B \"o hme, Gordon Fraser, and Christian Holler. The Fuzzing Book . CISPA Helmholtz Center for Information Security, 2024. Retrieved 2024-07-01 16:50:18+02:00

work page 2024