pith. sign in

arxiv: 2406.18224 · v3 · pith:U5BGFYHGnew · submitted 2024-06-26 · 💻 cs.DS

#CFG and #DNNF admit FPRAS

Pith reviewed 2026-05-23 23:54 UTC · model grok-4.3

classification 💻 cs.DS
keywords context-free grammarDNNFcounting problemsapproximation schemerandomized algorithmpolynomial timemodel countingword counting
0
0 comments X

The pith

Context-free grammars and DNNF circuits admit fully polynomial-time randomized approximation schemes for counting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper gives the first fully polynomial-time randomized approximation scheme for counting the number of words of exact length n generated by a context-free grammar. It gives the same kind of scheme for counting the satisfying assignments of a DNNF circuit. These counting tasks previously had only quasi-polynomial time solutions or worked only for restricted cases such as automata. A reader would care because the new schemes make approximation possible in time polynomial in the input size and the desired accuracy.

Core claim

We provide the first fully polynomial-time randomized approximation scheme for counting the number of words of length exactly n generated by a CFG and the number of assignments satisfying a DNNF circuit. Finding polynomial time algorithms for the aforementioned problems has been a longstanding open problem. Prior work could either only obtain a quasi-polynomial runtime or a polynomial-time randomized approximation scheme for restricted fragments such as non-deterministic finite automata or non-deterministic tree automata.

What carries the argument

A fully polynomial-time randomized approximation scheme that estimates the count to within a (1+epsilon) factor with high probability in time polynomial in the input size, 1/epsilon, and log(1/delta).

Load-bearing premise

The grammar or DNNF is presented in a standard encoding whose size is polynomial in the parameters that appear in the running-time bound of the claimed approximation scheme.

What would settle it

A concrete CFG or DNNF instance on which the approximation procedure fails to stay within a (1+epsilon) multiplicative error of the true count with the stated success probability while running in the claimed polynomial time bound.

Figures

Figures reproduced from arXiv: 2406.18224 by Alexis de Colnet, Kuldeep S. Meel.

Figure 1
Figure 1. Figure 1: On the left: an homogeneous multilinear (+ [PITH_FULL_IMAGE:figures/full_fig_p007_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: On the left: the derivation tree T for x1x4x7x8 ∈ supp(q), with q the root node. On the right: the derivation tree T ′ for x1x3x7x8 ∈ supp(q). The circled nodes form lcsn(T, T ′ ). any βqˆ ∈ supp(ˆq), we can replace αq,i ˆ by βqˆ in αi and we still have a monomial in supp(q). This replacement can be done for more than one ˆq at a time. So each αi is used to create monomials of the form α ′ i Q qˆ∈τ βqˆ (αi… view at source ↗
read the original abstract

We provide the first fully polynomial-time randomized approximation scheme for the following two counting problems: 1. Given a Context Free Grammar $G$ over alphabet $\Sigma$, count the number of words of length exactly $n$ generated by $G$. 2. Given a circuit $\varphi$ in Decomposable Negation Normal Form (DNNF) over the set of Boolean variables $X$, compute the number of assignments to $X$ such that $\varphi$ evaluates to 1. Finding polynomial time algorithms for the aforementioned problems has been a longstanding open problem. Prior work could either only obtain a quasi-polynomial runtime (SODA 1995) or a polynomial-time randomized approximation scheme for restricted fragments, such as non-deterministic finite automata (JACM 2021) or non-deterministic tree automata (STOC 2021).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims to provide the first fully polynomial-time randomized approximation scheme (FPRAS) for two problems: (1) given a context-free grammar G over alphabet Σ, approximate the number of words of length exactly n generated by G; (2) given a circuit φ in DNNF over Boolean variables X, approximate the number of satisfying assignments. It positions this as resolving a longstanding open problem, improving on quasi-polynomial runtimes (SODA 1995) and FPRAS results restricted to NFAs (JACM 2021) or tree automata (STOC 2021).

Significance. If the algorithmic constructions and error analyses hold with the stated guarantees, the result would be a notable advance in approximate counting, supplying the first FPRAS for #CFG and #DNNF and closing the gap from quasi-polynomial or restricted cases. The manuscript would earn credit for delivering an explicit algorithmic scheme rather than a non-constructive existence argument.

major comments (2)
  1. [Main theorems / runtime analysis] The main theorem statements (presumably in §3 and §5) and runtime analysis must explicitly define the input-size measure for G and φ (e.g., number of productions/gates, total bit length under a standard encoding). Without this, it is impossible to verify that the claimed poly(n, 1/ε, |G|) or poly(n, 1/ε, |φ|) bound is fully polynomial rather than hiding super-polynomial dependence on a non-standard encoding of the grammar or circuit.
  2. [Algorithm description / proof of runtime] The reduction or dynamic-programming construction used to obtain the FPRAS (likely in the technical sections deriving the sampling or estimation procedure) must be shown to run in time polynomial in the explicit size of the input grammar/circuit; any auxiliary data structures whose size is exponential in the number of non-terminals or variables would invalidate the FPRAS claim.
minor comments (2)
  1. [Theorem statements] Clarify the precise approximation guarantee (additive vs. multiplicative, relative error) and the failure probability in the theorem statements.
  2. [Introduction] Add a short comparison table or paragraph contrasting the new runtime with the SODA 1995 quasi-polynomial bound and the restricted automata results.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and for highlighting the need for explicit input-size definitions and runtime bounds. We address the two major comments below and will incorporate clarifications into the revised manuscript.

read point-by-point responses
  1. Referee: [Main theorems / runtime analysis] The main theorem statements (presumably in §3 and §5) and runtime analysis must explicitly define the input-size measure for G and φ (e.g., number of productions/gates, total bit length under a standard encoding). Without this, it is impossible to verify that the claimed poly(n, 1/ε, |G|) or poly(n, 1/ε, |φ|) bound is fully polynomial rather than hiding super-polynomial dependence on a non-standard encoding of the grammar or circuit.

    Authors: We agree that explicit definitions are required for verifiability. The manuscript already uses |G| to denote the total bit length of a standard encoding of the grammar (summed lengths of all productions, including nonterminal and terminal symbols) and |φ| to denote the number of gates in the DNNF. The claimed runtimes are polynomial in these quantities, n, and 1/ε. In the revision we will add a short paragraph in the Preliminaries explicitly stating these measures and confirming that all algorithms and analyses are polynomial in them. revision: yes

  2. Referee: [Algorithm description / proof of runtime] The reduction or dynamic-programming construction used to obtain the FPRAS (likely in the technical sections deriving the sampling or estimation procedure) must be shown to run in time polynomial in the explicit size of the input grammar/circuit; any auxiliary data structures whose size is exponential in the number of non-terminals or variables would invalidate the FPRAS claim.

    Authors: The dynamic-programming tables and sampling procedures are sized polynomially in |G| (or |φ|) and n; the number of nonterminals is at most |G|, and the DNNF gate count directly bounds the state space. No auxiliary structures exponential in the number of nonterminals or variables appear in the constructions. We will expand the runtime lemmas in Sections 4 and 6 to include an explicit accounting of every data structure in terms of |G| and |φ|. revision: yes

Circularity Check

0 steps flagged

No circularity; algorithmic FPRAS construction is independent

full rationale

The paper presents a new FPRAS algorithm for exact-length word counting in CFGs and model counting in DNNFs. No equations, parameters, or derivations are provided in the abstract or context that reduce by construction to fitted inputs or self-definitions. Cited prior work (SODA 1995, JACM 2021, STOC 2021) is external and does not overlap with the authors. The result is framed as an algorithmic construction rather than a renaming or self-referential reduction, satisfying the default expectation of no circularity for such papers.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, invented entities, or non-standard axioms are visible. Standard randomized-algorithm assumptions (e.g., access to unbiased random bits, polynomial-time membership oracles) are implicitly required but not enumerated.

axioms (1)
  • domain assumption Existence of a polynomial-time procedure that, given the grammar or DNNF, can be used inside a Monte-Carlo sampling loop whose acceptance probability yields an (1+eps)-approximation.
    Typical background assumption for any FPRAS claim; location not specified because only abstract is available.

pith-pipeline@v0.9.0 · 5673 in / 1251 out tokens · 18958 ms · 2026-05-23T23:54:30.349838+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

  1. [1]

    \# NFA admits an FPRAS: efficient enumeration, counting, and uniform generation for logspace classes

    Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. \# NFA admits an FPRAS: efficient enumeration, counting, and uniform generation for logspace classes. J. ACM , 68(6):48:1--48:40, 2021

  2. [2]

    When is approximate counting for conjunctive queries tractable? In 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages 1015--1027

    Marcelo Arenas, Luis Alberto Croquevielle, Rajesh Jayaram, and Cristian Riveros. When is approximate counting for conjunctive queries tractable? In 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages 1015--1027. ACM , 2021

  3. [3]

    Compilers principles, techniques & tools

    V Aho Alfred, S Lam Monica, and D Ullman Jeffrey. Compilers principles, techniques & tools . pearson Education, 2007

  4. [4]

    On probabilistic inference by weighted model counting

    Mark Chavira and Adnan Darwiche. On probabilistic inference by weighted model counting. Artificial Intelligence , 172(6-7):772--799, 2008

  5. [5]

    Decomposable negation normal form

    Adnan Darwiche. Decomposable negation normal form. J. ACM , 48(4):608--647, 2001

  6. [6]

    Sweedyk, and Stephen R

    Vivek Gore, Mark Jerrum, Sampath Kannan, Z. Sweedyk, and Stephen R. Mahaney. A quasi-polynomial-time algorithm for sampling words from a context-free language. Inf. Comput. , 134(1):59--74, 1997

  7. [7]

    Grammar-based whitebox fuzzing

    Patrice Godefroid, Adam Kiezun, and Michael Y Levin. Grammar-based whitebox fuzzing. In Proceedings of the 29th ACM SIGPLAN conference on programming language design and implementation , pages 206--215, 2008

  8. [8]

    Knowledge compilation meets database theory: compiling queries to decision diagrams

    Abhay Jha and Dan Suciu. Knowledge compilation meets database theory: compiling queries to decision diagrams. In Proceedings of the 14th International Conference on Database Theory , pages 162--173, 2011

  9. [9]

    Sweedyk, and Stephen R

    Sampath Kannan, Z. Sweedyk, and Stephen R. Mahaney. Counting and random generation of strings in regular languages. In Kenneth L. Clarkson, editor, Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, 22-24 January 1995. San Francisco, California, USA , pages 551--557. ACM/SIAM , 1995

  10. [10]

    Meel, Sourav Chakraborty, and Umang Mathur

    Kuldeep S. Meel, Sourav Chakraborty, and Umang Mathur. A faster FPRAS for \# NFA . Proc. ACM Manag. Data , 2(2), may 2024

  11. [11]

    Estimating the size of union of sets in streaming models

    Kuldeep S Meel, NV Vinodchandran, and Sourav Chakraborty. Estimating the size of union of sets in streaming models. In Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems , pages 126--137, 2021

  12. [12]

    Pruning conformant plans by counting models on compiled d-dnnf representations

    H \'e ctor Palacios, Blai Bonet, Adnan Darwiche, and Hector Geffner. Pruning conformant plans by counting models on compiled d-dnnf representations. In ICAPS , volume 5, pages 141--150, 2005

  13. [13]

    Knowledge compilation meets uniform sampling

    Shubham Sharma, Rahul Gupta, Subhajit Roy, and Kuldeep S Meel. Knowledge compilation meets uniform sampling. In LPAR , pages 620--636, 2018

  14. [14]

    Valiant, Sven Skyum, S

    Leslie G. Valiant, Sven Skyum, S. Berkowitz, and Charles Rackoff. Fast parallel computation of polynomials using few processors. SIAM J. Comput. , 12(4):641--644, 1983

  15. [15]

    The Fuzzing Book

    Andreas Zeller, Rahul Gopinath, Marcel B \"o hme, Gordon Fraser, and Christian Holler. The Fuzzing Book . CISPA Helmholtz Center for Information Security, 2024. Retrieved 2024-07-01 16:50:18+02:00