pith. machine review for the scientific record. sign in

arxiv: 2603.13624 · v2 · submitted 2026-03-13 · 💻 cs.DB · cs.CC

Recognition: 2 theorem links

· Lean Theorem

Jaguar: A Primal Algorithm for Conjunctive Query Evaluation in Submodular-Width Time

Authors on Pith no claims yet

Pith reviewed 2026-05-15 10:53 UTC · model grok-4.3

classification 💻 cs.DB cs.CC
keywords conjunctive query evaluationsubmodular widthprimal algorithmadaptive joinspolymatroiddegree constraintsquery complexity
0
0 comments X

The pith

Jaguar evaluates every Boolean conjunctive query in time O(N to the power of subw(Q) plus any positive epsilon) by adaptively choosing joins.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Jaguar, a new algorithm for evaluating conjunctive queries that achieves a running time only slightly above the submodular width of the query. It maintains a feasible polymatroid solution in the primal space and uses it to guide an adaptive sequence of join operations whose total cost stays bounded by subw(Q) plus epsilon. This yields O(N^{subw(Q)+ε}) time for any ε greater than zero, and the same bound holds when degree constraints are added to the width measure. A sympathetic reader would care because the algorithm reaches near the known complexity limit for many queries while using a simpler primal description than the earlier PANDA method.

Core claim

Jaguar evaluates every Boolean conjunctive query Q in time O(N^{subw(Q)+ε}) for every ε>0. It does so by operating directly in the primal space of the submodular-width maximization problem: it maintains a polymatroid that is feasible for the input query and adaptively computes a sequence of joins whose total cost is bounded by the value of this polymatroid. The same bound holds when submodular width is extended with degree constraints.

What carries the argument

An adaptive sequence of joins whose computation cost is bounded by maintaining a feasible primal polymatroid solution to the submodular width linear program.

Load-bearing premise

That there always exists an adaptive sequence of joins whose total computation cost stays within subw(Q) plus epsilon, relying on the polymatroid satisfying Shannon inequalities.

What would settle it

A specific Boolean conjunctive query Q together with a sequence of input databases of growing size N for which Jaguar's measured running time exceeds N^{subw(Q)+0.01}.

Figures

Figures reproduced from arXiv: 2603.13624 by Hubie Chen, Mahmoud Abo Khamis.

Figure 1
Figure 1. Figure 1: Query 𝑄□ from Eq. (14) along with the two free-connex tree decompositions. to 1. Instead, we take advantage of the fact that the number of heavy 𝑊 -values is very small, namely 1 in this particular database instance D□. This allows us to lower the value of 𝑔(𝑊 ) down to log𝑁 1 = 0. Note that this change does not really fix the submodularity 𝑔(𝑍𝑊 𝑋) = ∞ > 𝑔(𝑍𝑊 ) + 𝑔(𝑊 𝑋) − 𝑔(𝑊 ) = 2. Nevertheless, we can no… view at source ↗
read the original abstract

The submodular width is a complexity measure of conjunctive queries (CQs), which assigns a nonnegative real number, subw(Q), to each CQ Q. An existing algorithm, called PAND, performs CQ evaluation in polynomial time where the exponent is essentially subw(Q). Formally, for every Boolean CQ Q, PANDA evaluates Q in time $O(N^{\mathsf{subw}(Q)} \cdot \mathsf{polylog}(N))$, where N denotes the input size; moreover, there is complexity-theoretic evidence that, for a number of Boolean CQs, no exponent strictly below subw(Q) can be achieved by combinatorial algorithms. On a high level, the submodular width of a CQ Q can be described as the maximum over all polymatroids, which are set functions on the variables of Q that satisfy Shannon inequalities. The PANDA algorithm in a sense works in the dual space of this maximization problem, makes use of information theory, and transforms a CQ into a set of disjunctive datalog programs which are individually solved. In this article, we introduce a new algorithm for CQ evaluation which achieves, for each Boolean CQ Q and for all epsilon > 0, a running time of $O(N^{\mathsf{subw}(Q)+\epsilon})$. This new algorithm's description and analysis are, in our view, significantly simpler than those of PANDA. We refer to it as a "primal" algorithm as it operates in the primal space of the described maximization problem, by maintaining a feasible primal solution, namely, a polymatroid. Indeed, this algorithm deals directly with the input CQ and adaptively computes a sequence of joins, in a guided fashion, so that the cost of these join computations is bounded. Additionally, this algorithm can achieve the stated runtime for the generalization of the submodular width incorporating degree constraints. We dub our algorithm Jaguar, as it is a join-adaptive guided algorithm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript introduces Jaguar, a primal algorithm for evaluating Boolean conjunctive queries (CQs) that runs in time O(N^{subw(Q)+ε}) for every Boolean CQ Q and any ε>0. Jaguar maintains an explicit feasible polymatroid (satisfying Shannon inequalities) and uses it to guide an adaptive sequence of joins whose individual costs are charged to the current polymatroid value; after each join the polymatroid is updated while preserving feasibility. The total cost therefore sums to the claimed bound. The algorithm also extends to the version of submodular width that incorporates degree constraints. The presentation is positioned as simpler than the dual PANDA algorithm.

Significance. If the central bounding argument holds, the result supplies a simpler, primal-space route to near-optimal submodular-width time for CQ evaluation. The explicit polymatroid-maintenance construction and the ε-additive relaxation are technically clean; the extension to degree constraints broadens applicability. These features constitute a genuine contribution to the algorithmic theory of conjunctive-query evaluation.

minor comments (2)
  1. The abstract and introduction repeatedly use the phrase 'significantly simpler' without a concrete side-by-side comparison (e.g., number of cases or auxiliary lemmas) that would let a reader verify the claim.
  2. Notation for the polymatroid update step after each join should be introduced with an explicit equation or pseudocode fragment in the main body rather than deferred to an appendix.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive review and the recommendation to accept. We appreciate the recognition that Jaguar offers a simpler primal-space construction for achieving near-optimal submodular-width time bounds, along with the clean handling of the ε-relaxation and the extension to degree constraints.

Circularity Check

0 steps flagged

No significant circularity: explicit primal construction bounds runtime directly from submodular-width definition

full rationale

The derivation maintains an explicit feasible polymatroid (satisfying Shannon inequalities) and adaptively selects joins whose individual costs are charged to the current polymatroid value. An explicit update rule preserves feasibility after each join. The additive ε in the exponent permits the telescoping sum of costs to be bounded by subw(Q)+ε without any fitted parameter, self-referential quantity, or load-bearing self-citation. The central runtime claim therefore follows from the definition of submodular width together with the supplied charging argument and termination analysis; no step reduces the claimed bound to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the standard definition of submodular width as the maximum over polymatroids obeying Shannon inequalities; no new free parameters or invented entities are introduced.

axioms (1)
  • domain assumption Submodular width of a CQ is the maximum value of a polymatroid on the query variables that satisfies the Shannon inequalities.
    This is the established definition from prior literature on CQ complexity that the algorithm uses to bound its running time.

pith-pipeline@v0.9.0 · 5666 in / 1254 out tokens · 44697 ms · 2026-05-15T10:53:38.201174+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Cuts and Gauges for Submodular Width

    cs.DS 2026-04 unverdicted novelty 7.0

    Submodular width is approximated within 3/2 by a branchwidth parameter from edge separations and admissible submodular costs, and satisfies subw(H) = Omega(ghw(H) / log ghw(H)) under natural conditions.

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages · cited by 1 Pith paper

  1. [1]

    1995.Foundations of Databases

    Serge Abiteboul, Richard Hull, and Victor Vianu. 1995.Foundations of Databases. Addison-Wesley. http://webdam. inria.fr/Alice/

  2. [2]

    Curtin, Benjamin Moseley, Hung Q

    Mahmoud Abo Khamis, Ryan R. Curtin, Benjamin Moseley, Hung Q. Ngo, Xuanlong Nguyen, Dan Olteanu, and Maximilian Schleich. 2020. Functional Aggregate Queries with Additive Inequalities.ACM Trans. Database Syst.45, 4, Article 17 (dec 2020), 41 pages. https://doi.org/10.1145/3426865

  3. [3]

    Mahmoud Abo Khamis, Xiao Hu, and Dan Suciu. 2025. Fast Matrix Multiplication meets the Submodular Width.Proc. ACM Manag. Data3, 2 (2025), 98:1–98:26

  4. [4]

    Ngo, and Atri Rudra

    Mahmoud Abo Khamis, Hung Q. Ngo, and Atri Rudra. 2016. FAQ: Questions Asked Frequently. InProceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016, San Francisco, CA, USA, June 26 - July 01, 2016, Tova Milo and Wang-Chiew Tan (Eds.). ACM, 13–28. https://doi.org/10.1145/2902251.2902280

  5. [5]

    Ngo, and Dan Suciu

    Mahmoud Abo Khamis, Hung Q. Ngo, and Dan Suciu. 2017. What Do Shannon-type Inequalities, Submodular Width, and Disjunctive Datalog Have to Do with One Another?. InProceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, Emanuel Sallinger, Jan Van den Bussche, and Floris Geer...

  6. [6]

    Ngo, and Dan Suciu

    Mahmoud Abo Khamis, Hung Q. Ngo, and Dan Suciu. 2025. PANDA: Query Evaluation in Submodular Width. TheoretiCSVolume 4, Article 12 (Apr 2025). https://doi.org/10.46298/theoretics.25.12

  7. [7]

    Isolde Adler, Georg Gottlob, and Martin Grohe. 2007. Hypertree width and related hypergraph invariants.Eur. J. Comb.28, 8 (2007), 2167–2181

  8. [8]

    Noga Alon, Raphael Yuster, and Uri Zwick. 1997. Finding and Counting Given Length Cycles.Algorithmica17, 3 (1997), 209–223. https://doi.org/10.1007/BF02523189

  9. [9]

    Guillaume Bagan, Arnaud Durand, and Etienne Grandjean. 2007. On Acyclic Conjunctive Queries and Constant Delay Enumeration. InComputer Science Logic, 21st International Workshop, CSL 2007, 16th Annual Conference of the EACSL, Lausanne, Switzerland, September 11-15, 2007, Proceedings (Lecture Notes in Computer Science, Vol. 4646), Jacques Duparc and Thomas...

  10. [10]

    Christoph Berkholz and Nicole Schweikardt. 2019. Constant Delay Enumeration with FPT-Preprocessing for Con- junctive Queries of Bounded Submodular Width. In44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 138), Peter Rossmanith, Pinar Heggernes, and Jo...

  11. [11]

    Karl Bringmann and Egor Gorbachev. 2025. A Fine-Grained Classification of Subquadratic Patterns for Subgraph Listing and Friends. InProceedings of the 57th Annual ACM Symposium on Theory of Computing(Prague, Czechia)(STOC ’25). Association for Computing Machinery, New York, NY, USA, 2145–2156. https://doi.org/10.1145/3717823.3718141

  12. [12]

    Nofar Carmeli and Markus Kröll. 2021. On the Enumeration Complexity of Unions of Conjunctive Queries. 46, 2, Article 5 (May 2021), 41 pages. https://doi.org/10.1145/3450263

  13. [13]

    Hubie Chen and Víctor Dalmau. 2005. Beyond Hypertree Width: Decomposition Methods Without Decompositions. InEleventh International Conference on Principles and Practice of Constraint Programming. 167–181

  14. [14]

    Fan, Paraschos Koutris, and Hangdong Zhao

    Austen Z. Fan, Paraschos Koutris, and Hangdong Zhao. 2023. The Fine-Grained Complexity of Boolean Conjunctive Queries and Sum-Product Problems. In50th International Colloquium on Automata, Languages, and Programming (ICALP 2023) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 261), Kousha Etessami, Uriel Feige, and Gabriele Puppis (Eds.)....

  15. [15]

    Fan, Paraschos Koutris, and Hangdong Zhao

    Austen Z. Fan, Paraschos Koutris, and Hangdong Zhao. 2024. Tight Bounds of Circuits for Sum-Product Queries.Proc. ACM Manag. Data2, 2, Article 87 (May 2024), 20 pages. https://doi.org/10.1145/3651588

  16. [16]

    Georg Gottlob, Gianluigi Greco, Nicola Leone, and Francesco Scarcello. 2016. Hypertree Decompositions: Questions and Answers. InProceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016, San Francisco, CA, USA, June 26 - July 01, 2016, Tova Milo and Wang-Chiew Tan (Eds.). ACM, 57–74. https://doi.org/10.1145/290...

  17. [17]

    Georg Gottlob, Matthias Lanzinger, Reinhard Pichler, and Igor Razgon. 2021. Complexity Analysis of Generalized and Fractional Hypertree Decompositions.J. ACM68, 5 (2021), 38:1–38:50

  18. [18]

    Georg Gottlob, Nicola Leone, and Francesco Scarcello. 2002. Hypertree decompositions and tractable queries.J. Comput. Syst. Sci.64, 3 (2002), 579–627. Jaguar: A Primal Algorithm for Conjunctive Query Evaluation in Submodular-Width Time 21

  19. [19]

    Georg Gottlob, Nicola Leone, and Francesco Scarcello. 2003. Robbers, marshals, and guards: game theoretic and logical characterizations of hypertree width.J. Comput. Syst. Sci.66, 4 (2003), 775–808

  20. [20]

    Martin Grohe and Dániel Marx. 2006. Constraint solving via fractional edge covers. InProceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2006, Miami, Florida, USA, January 22-26, 2006. ACM Press, 289–298. http://dl.acm.org/citation.cfm?id=1109557.1109590

  21. [21]

    Martin Grohe and Dániel Marx. 2014. Constraint Solving via Fractional Edge Covers.ACM Trans. Algorithms11, 1 (2014), 4:1–4:20

  22. [22]

    Kolaitis, Andrej E

    Phokion G. Kolaitis, Andrej E. Romashchenko, Milan Studený, Dan Suciu, and Tobias A. Boege. 2023. Algorithmic Aspects of Information Theory (Dagstuhl Seminar 22301).Dagstuhl Reports12, 7 (2023), 180–204. https://doi.org/10. 4230/DagRep.12.7.180

  23. [23]

    Dániel Marx. 2010. Approximating fractional hypertree width.ACM Trans. Algorithms6, 2 (2010), 29:1–29:17

  24. [24]

    Dániel Marx. 2013. Tractable Hypergraph Properties for Constraint Satisfaction and Conjunctive Queries.J. ACM60, 6 (2013), 42:1–42:51. https://doi.org/10.1145/2535926

  25. [25]

    Luc Segoufin. 2015. Constant Delay Enumeration for Conjunctive Queries.SIGMOD Rec.44, 1 (May 2015), 10–17. https://doi.org/10.1145/2783888.2783894

  26. [26]

    Mihalis Yannakakis. 1981. Algorithms for Acyclic Database Schemes. InVery Large Data Bases, 7th International Conference, September 9-11, 1981, Cannes, France, Proceedings. IEEE Computer Society, 82–94

  27. [27]

    Zhen Zhang and Raymond W. Yeung. 1997. A non-Shannon-type conditional inequality of information quantities. IEEE Trans. Information Theory43, 6 (1997), 1982–1986

  28. [28]

    Zhen Zhang and Raymond W Yeung. 1998. On characterization of entropy function via information inequalities.IEEE Transactions on Information Theory44, 4 (1998), 1440–1452