pith. sign in

arxiv: 2411.10229 · v4 · submitted 2024-11-15 · 💻 cs.LO · cs.DB

Optimally Rewriting Formulas and Database Queries: A Confluence of Term Rewriting, Structural Decomposition, and Complexity

Pith reviewed 2026-05-23 17:46 UTC · model grok-4.3

classification 💻 cs.LO cs.DB
keywords first-order logicwidth minimizationterm rewritingdatabase queriesquery optimizationstructural decompositionpositive formulas
0
0 comments X

The pith

An algorithm computes the minimum-width positive first-order sentence reachable from a given sentence by quantifier movement and similar rewriting rules.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses width minimization for first-order sentences used in database query evaluation, where width is the largest number of free variables appearing in any subformula. An undecidability result prevents computing the absolute minimum width, so the work restricts attention to the sentences reachable from a given positive sentence by a fixed set of equivalence-preserving syntactic rules such as quantifier movement. It supplies a concrete algorithm that returns the minimum-width sentence inside this reachable set. The result supplies a complete algorithmic characterization of width minimization under the chosen rules and links term-rewriting techniques to structural decomposition methods used in query evaluation.

Core claim

For any positive first-order sentence φ the algorithm returns a logically equivalent sentence of minimum width that can be obtained from φ by repeated application of the studied rewriting rules, including those that move quantifiers; this yields the first complete algorithmic understanding of width minimization inside the closure of these rules.

What carries the argument

The closure of the rewriting rules (quantifier movement and similar syntactic transformations) that generate all reachable equivalent sentences, together with term-rewriting techniques that interface with structural decomposition measures.

If this is right

  • Width minimization becomes decidable and computable once attention is limited to the closure of the studied rules.
  • The minimum achievable width under these rules can be used as a tight bound for the complexity of evaluating the query on a database.
  • Term-rewriting algorithms can be combined with existing structural decomposition methods to produce optimal rewritings for positive queries.
  • The same interface between term rewriting and width measures extends to other syntactic transformations that preserve equivalence.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The technique could be lifted to fragments with negation if a suitable set of additional rules is identified.
  • Practical query optimizers could adopt the algorithm as a preprocessing step before choosing a join order or decomposition.
  • The reduction to term-rewriting problems suggests that similar completeness results may exist for other width-like measures such as treewidth of the query hypergraph.

Load-bearing premise

The chosen set of rewriting rules is assumed to generate exactly the class of equivalent sentences whose widths are worth comparing.

What would settle it

A positive first-order sentence for which the algorithm returns a sentence whose width is strictly larger than that of another sentence obtainable from the input by further applications of the same rules.

Figures

Figures reproduced from arXiv: 2411.10229 by Hubie Chen, Stefan Mengel.

Figure 1
Figure 1. Figure 1: Summary of studied rules. The removal rule →M allows removal of a quantification when no variables are bound to the quantification: when Q ∈ {∀, ∃}, we have QxF →M F when F contains no free occurrences of the variable x, that is, when x /∈ free(F). (Note that we assume that formulas under discussion are evaluated over structures with non-empty universe; this assumption is needed for this removal rule to pr… view at source ↗
read the original abstract

A central computational task in database theory, finite model theory, and computer science at large is the evaluation of a first-order sentence on a finite structure. In the context of this task, the \emph{width} of a sentence, defined as the maximum number of free variables over all subformulas, has been established as a crucial measure, where minimizing width of a sentence (while retaining logical equivalence) is considered highly desirable. An undecidability result rules out the possibility of an algorithm that, given a first-order sentence, returns a logically equivalent sentence of minimum width; this result motivates the study of width minimization via syntactic rewriting rules, which is this article's focus. For a number of common rewriting rules (which are known to preserve logical equivalence), including rules that allow for the movement of quantifiers, we present an algorithm that, given a positive first-order sentence $\phi$, outputs the minimum-width sentence obtainable from $\phi$ via application of these rules. We thus obtain a complete algorithmic understanding of width minimization up to the studied rules; this result is the first one -- of which we are aware -- that establishes this type of understanding in such a general setting. Our result builds on the theory of term rewriting and establishes an interface among this theory, query evaluation, and structural decomposition theory.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper claims to present an algorithm that, given a positive first-order sentence φ, outputs the minimum-width sentence obtainable from φ via application of a collection of equivalence-preserving syntactic rewriting rules (including quantifier movement). The approach leverages term rewriting theory to ensure confluence and termination within the studied rule set, while interfacing with structural decomposition methods; the result is scoped to minimization relative to the closure of these rules rather than global logical minimality, and is presented as the first complete algorithmic understanding in this general setting.

Significance. If the algorithm and its correctness proof hold, the result supplies a complete solution to width minimization under the given rewrite system. This is significant for database query optimization, where sentence width governs evaluation complexity, and the integration of term rewriting with finite model theory and structural decomposition provides a useful bridge between these areas. The explicit scoping to the studied rules avoids overclaiming and strengthens the contribution.

minor comments (2)
  1. [Abstract] The abstract states that the algorithm 'outputs the minimum-width sentence' but does not indicate the time complexity of the procedure; adding this information would clarify the practical utility.
  2. [Abstract] The connection to structural decomposition theory is mentioned but not elaborated in the high-level description; a brief pointer to the relevant section or theorem would help readers trace how decomposition is used.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful reading, positive assessment of significance, and recommendation of minor revision. No major comments appear in the report.

Circularity Check

0 steps flagged

No significant circularity; result is scoped algorithmic existence claim

full rationale

The paper establishes an algorithm that computes the minimum-width positive FO sentence reachable from input φ under a fixed, explicitly enumerated set of equivalence-preserving syntactic rewriting rules (quantifier movement and similar). This is a relative optimality result inside the rewrite closure, not a claim of global logical minimality or a reduction of any output quantity to a fitted parameter or self-referential definition. No equations, fitted inputs, or load-bearing self-citations appear in the abstract or described derivation; the undecidability motivation is external and the completeness statement is with respect to the studied rules only. The derivation is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The result rests on the assumption that the listed rewriting rules preserve logical equivalence and that the positive fragment is the relevant domain; no free parameters or invented entities are introduced.

axioms (2)
  • domain assumption The studied rewriting rules preserve logical equivalence of first-order sentences.
    Invoked when the algorithm is claimed to produce logically equivalent output.
  • standard math Width is defined as the maximum number of free variables over all subformulas.
    Standard definition used throughout database theory and finite model theory.

pith-pipeline@v0.9.0 · 5767 in / 1280 out tokens · 21042 ms · 2026-05-23T17:46:03.278626+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages

  1. [1]

    Foundations of Databases

    Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases . Addison-Wesley, 1995

  2. [2]

    Tree-width for first order formulae

    Isolde Adler and Mark Weyer. Tree-width for first order formulae. Log. Methods Comput. Sci. , 8(1), 2012

  3. [3]

    Term rewriting and all that

    Franz Baader and Tobias Nipkow. Term rewriting and all that . Cambridge University Press, 1998

  4. [4]

    Compiling existential positive queries to bounded-variable frag- ments

    Christoph Berkholz and Hubie Chen. Compiling existential positive queries to bounded-variable frag- ments. In Dan Suciu, Sebastian Skritek, and Christoph Koch, editors, Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2019, Amsterdam, The Netherlands, June 30 - July 5, 2019 , pages 353–364. ACM, 2019

  5. [5]

    Bodlaender

    Hans L. Bodlaender. A linear-time algorithm for finding tree-decompositions of small treewidth. SIAM J. Comput. , 25(6):1305–1317, 1996

  6. [6]

    Bodlaender, P˚ al Grøn˚ as Drange, Markus S

    Hans L. Bodlaender, P˚ al Grøn˚ as Drange, Markus S. Dregi, Fedor V. Fomin, Daniel Lokshtanov, and Michal Pilipczuk. An o(cˆk n) 5-approximation algorithm for treewidth. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA , pages 499–508. IEEE Computer Society, 2013. OPTIMALLY REWRITING FORMU...

  7. [7]

    The complexity of width minimization for existential positive queries

    Simone Bova and Hubie Chen. The complexity of width minimization for existential positive queries. In Nicole Schweikardt, Vassilis Christophides, and Vincent Leroy, editors, Proc. 17th International Conference on Database Theory (ICDT), Athens, Greece, March 24-28, 2014 , pages 235–244. OpenPro- ceedings.org, 2014

  8. [8]

    How many variables are needed to express an existential positive query? Theory Comput

    Simone Bova and Hubie Chen. How many variables are needed to express an existential positive query? Theory Comput. Syst. , 63(7):1573–1594, 2019

  9. [9]

    Chandra and Philip M

    Ashok K. Chandra and Philip M. Merlin. Optimal implementation of conjunctive queries in relational data bases. In John E. Hopcroft, Emily P. Friedman, and Michael A. Harrison, editors, Proceedings of the 9th Annual ACM Symposium on Theory of Computing, May 4-6, 1977, Boulder, Colorado, USA , pages 77–90. ACM, 1977

  10. [10]

    On the complexity of existential positive queries

    Hubie Chen. On the complexity of existential positive queries. ACM Trans. Comput. Log. , 15(1):9:1–9:20, 2014

  11. [11]

    The tractability frontier of graph-like first-order query sets

    Hubie Chen. The tractability frontier of graph-like first-order query sets. J. ACM , 64(4):26:1–26:29, 2017

  12. [12]

    Decomposing quantified conjunctive (or disjunctive) formulas.SIAM J

    Hubie Chen and V´ ıctor Dalmau. Decomposing quantified conjunctive (or disjunctive) formulas.SIAM J. Comput., 45(6):2066–2086, 2016

  13. [13]

    Fomin, Lukasz Kowalik, Daniel Lokshtanov, D´ aniel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh

    Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, D´ aniel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 2015

  14. [14]

    Kolaitis, and Moshe Y

    V´ ıctor Dalmau, Phokion G. Kolaitis, and Moshe Y. Vardi. Constraint satisfaction, bounded treewidth, and finite-variable logics. In Pascal Van Hentenryck, editor, Principles and Practice of Constraint Programming - CP 2002, 8th International Conference, CP 2002, Ithaca, NY, USA, September 9-13, 2002, Proceedings, volume 2470 of Lecture Notes in Computer ...

  15. [15]

    Hypertree decompositions: A survey

    Georg Gottlob, Nicola Leone, and Francesco Scarcello. Hypertree decompositions: A survey. In Jir´ ı Sgall, Ales Pultr, and Petr Kolman, editors, Mathematical Foundations of Computer Science 2001, 26th International Symposium, MFCS 2001 Marianske Lazne, Czech Republic, August 27-31, 2001, Proceedings, volume 2136 of Lecture Notes in Computer Science , page...

  16. [16]

    Robbers, marshals, and guards: game theoretic and logical characterizations of hypertree width.Journal of Computer and System Sciences , 66(4):775–808, 2003

    Georg Gottlob, Nicola Leone, and Francesco Scarcello. Robbers, marshals, and guards: game theoretic and logical characterizations of hypertree width.Journal of Computer and System Sciences , 66(4):775–808, 2003

  17. [17]

    The complexity of homomorphism and constraint satisfaction problems seen from the other side

    Martin Grohe. The complexity of homomorphism and constraint satisfaction problems seen from the other side. J. ACM , 54(1):1:1–1:24, 2007

  18. [18]

    When is the evaluation of conjunctive queries tractable? In Jeffrey Scott Vitter, Paul G

    Martin Grohe, Thomas Schwentick, and Luc Segoufin. When is the evaluation of conjunctive queries tractable? In Jeffrey Scott Vitter, Paul G. Spirakis, and Mihalis Yannakakis, editors, Proceedings on 33rd Annual ACM Symposium on Theory of Computing, July 6-8, 2001, Heraklion, Crete, Greece , pages 657–666. ACM, 2001

  19. [19]

    Kolaitis and Moshe Y

    Phokion G. Kolaitis and Moshe Y. Vardi. Conjunctive-query containment and constraint satisfaction. J. Comput. Syst. Sci. , 61(2):302–332, 2000

  20. [20]

    On theories with a combinatorial definition of” equivalence”

    Maxwell Herman Alexander Newman. On theories with a combinatorial definition of” equivalence”. Annals of mathematics , pages 223–243, 1942

  21. [21]

    Papadimitriou

    Christos H. Papadimitriou. Computational complexity. Addison-Wesley, 1994

  22. [22]

    Faster decision of first-order graph properties

    Ryan Williams. Faster decision of first-order graph properties. In Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS ’14 , pages 80:1–80:6. ACM, 2014. This work is licensed under the Creative Commons Attribution License. T o...