Direct Access for Answers to Conjunctive Queries with Aggregation
Pith reviewed 2026-05-24 09:16 UTC · model grok-4.3
The pith
Past tractability results for direct access to conjunctive query answers extend to annotated databases with aggregation when annotations are excluded from the order.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
All past results on direct access continue to hold for annotated databases assuming the annotation does not participate in the lexicographic order. For count-distinct aggregation, which has no efficient semiring representation, the corresponding tractability condition is established. The complexity changes when the aggregate and annotation values are included in the order, and special cases arise with unit annotations and idempotent addition.
What carries the argument
The construction of a loglinear-time data structure providing logarithmic-time direct access to answers, extended via commutative semiring annotations for aggregation.
Load-bearing premise
The annotation itself does not participate in the lexicographic order.
What would settle it
Finding a conjunctive query without self-joins, a lexicographic order excluding annotations, and an annotated database where no loglinear-time data structure for direct access exists, contradicting the extension of past results.
Figures
read the original abstract
We study the fine-grained complexity of conjunctive queries with grouping and aggregation. For common aggregate functions (e.g., min, max, count, sum), such a query can be phrased as an ordinary conjunctive query over a database annotated with a suitable commutative semiring. We investigate the ability to evaluate such queries by constructing in loglinear time a data structure that provides logarithmic-time direct access to the answers ordered by a given lexicographic order. This task is nontrivial since the number of answers might be larger than loglinear in the size of the input, so the data structure needs to provide a compact representation of the space of answers. In the absence of aggregation and annotation, past research established a sufficient tractability condition on queries and orders. For queries without self-joins, this condition is not just sufficient, but also necessary (under conventional lower-bound assumptions in fine-grained complexity). We show that all past results continue to hold for annotated databases, assuming that the annotation itself does not participate in the lexicographic order. Yet, past algorithms do not apply to the count-distinct aggregation, which has no efficient representation as a commutative semiring; for this aggregation, we establish the corresponding tractability condition. We then show how the complexity of the problem changes when we include the aggregate and annotation value in the order. We also study the impact of having all relations but one annotated by the multiplicative identity (one), as happens when we translate aggregate queries into semiring annotations, and having a semiring with an idempotent addition, such as the case of min, max, and count-distinct over a logarithmic-size domain.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies the fine-grained complexity of conjunctive queries with grouping and aggregation. It shows that prior results on constructing loglinear-time data structures for logarithmic-time direct access to answers in lexicographic order extend to semiring-annotated databases (covering aggregates like min, max, count, sum), provided the annotation does not participate in the order. For count-distinct (which lacks an efficient commutative semiring representation), a separate tractability condition is derived. The work further examines the complexity when aggregate/annotation values are included in the order, as well as special cases such as most relations annotated by the multiplicative identity and idempotent semirings.
Significance. If the central claims hold, the results meaningfully extend the tractability landscape for direct-access evaluation of CQs to the aggregated setting that is central to database query processing. The explicit handling of the non-participation assumption for annotations, the separate treatment of count-distinct, and the analysis of order inclusion and idempotence are useful contributions that build directly on prior fine-grained complexity results for non-aggregated CQs.
major comments (2)
- [Abstract] Abstract (paragraph on annotated databases): The extension of past results is conditioned on the assumption that 'the annotation itself does not participate in the lexicographic order.' This assumption is load-bearing for the claim that 'all past results continue to hold'; the main theorems should explicitly restate the precise query classes (e.g., self-join-free) and order conditions under which the extension applies, rather than relying on the abstract phrasing.
- [Abstract] Abstract (paragraph on count-distinct): The tractability condition for count-distinct is asserted but not stated; the corresponding section should give the exact syntactic condition on the query and order (analogous to the non-aggregated case) and prove it is both sufficient and necessary under standard assumptions.
minor comments (2)
- Clarify in the preliminaries whether the semiring operations are required to be computable in constant time or logarithmic time for the loglinear preprocessing bound to hold.
- The discussion of translating aggregate queries into semiring annotations (most relations annotated by 1) should include a brief example showing how the direct-access structure is constructed in this restricted case.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address the two major comments below and will make the requested clarifications to the abstract.
read point-by-point responses
-
Referee: [Abstract] Abstract (paragraph on annotated databases): The extension of past results is conditioned on the assumption that 'the annotation itself does not participate in the lexicographic order.' This assumption is load-bearing for the claim that 'all past results continue to hold'; the main theorems should explicitly restate the precise query classes (e.g., self-join-free) and order conditions under which the extension applies, rather than relying on the abstract phrasing.
Authors: We agree that the abstract should be more explicit. We will revise it to state that the extension applies to self-join-free conjunctive queries under the same lexicographic order conditions as the non-annotated case (with annotations excluded from the order). The body of the paper already contains the precise theorems with these conditions; the abstract revision will align the high-level claim with them. revision: yes
-
Referee: [Abstract] Abstract (paragraph on count-distinct): The tractability condition for count-distinct is asserted but not stated; the corresponding section should give the exact syntactic condition on the query and order (analogous to the non-aggregated case) and prove it is both sufficient and necessary under standard assumptions.
Authors: We will revise the abstract to explicitly state the syntactic tractability condition for count-distinct (the same as the non-aggregated case). The corresponding section already provides the exact condition together with proofs of sufficiency and necessity under standard fine-grained assumptions; the abstract change will make this visible at the summary level. revision: yes
Circularity Check
Minor self-citation of prior tractability results; extensions for annotated databases and count-distinct are independent
full rationale
The paper explicitly conditions the extension of prior results on the assumption that annotations do not participate in the lexicographic order and separately derives a new tractability condition for count-distinct aggregation (which lacks a semiring representation). These steps are stated directly in the abstract and do not reduce any new claim to a fitted parameter, self-definition, or unverified self-citation chain. The reference to 'past research' on the non-aggregated case is a standard citation of independent prior work and is not load-bearing for the novel annotated or count-distinct contributions. No equations or derivations in the provided text equate a claimed result to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Conventional lower-bound assumptions in fine-grained complexity
Reference graph
Works this paper leans on
-
[1]
On acyclic conjunctive queries and constant delay enumeration
Guillaume Bagan, Arnaud Durand, and Etienne Grandjean. On acyclic conjunctive queries and constant delay enumeration. In Jacques Duparc and Thomas A. Henzinger, editors, Computer Science Logic , pages 208--222, Berlin, Heidelberg, 2007. Springer Berlin Heidelberg
work page 2007
-
[2]
Computing the JTH solution of a first-order query
Guillaume Bagan, Arnaud Durand, Etienne Grandjean, and Fr \'e d \'e ric Olive. Computing the JTH solution of a first-order query . RAIRO - Theoretical Informatics and Applications (RAIRO: ITA) , 42:147--164, 2008. URL: https://hal.archives-ouvertes.fr/hal-00221730
work page 2008
-
[3]
Answering FO+MOD queries under updates on bounded degree databases
Christoph Berkholz, Jens Keppeler, and Nicole Schweikardt. Answering FO+MOD queries under updates on bounded degree databases. ACM Trans. Database Syst. , 43(2):7:1--7:32, 2018
work page 2018
-
[4]
Johann Brault-Baron. De la pertinence de l' \'e num \'e ration : complexit \'e en logiques propositionnelle et du premier ordre . Theses, Universit \'e de Caen , April 2013
work page 2013
-
[5]
Tight fine-grained bounds for direct access on join queries
Karl Bringmann, Nofar Carmeli, and Stefan Mengel. Tight fine-grained bounds for direct access on join queries. In PODS , pages 427--436. ACM , 2022
work page 2022
-
[6]
Tight fine-grained bounds for direct access on join queries
Karl Bringmann, Nofar Carmeli, and Stefan Mengel. Tight fine-grained bounds for direct access on join queries. CoRR , abs/2201.02401, 2022
-
[7]
Tractable orders for direct access to ranked answers of conjunctive queries
Nofar Carmeli, Nikolaos Tziavelis, Wolfgang Gatterbauer, Benny Kimelfeld, and Mirek Riedewald. Tractable orders for direct access to ranked answers of conjunctive queries. In PODS , pages 325--341. ACM , 2021
work page 2021
-
[8]
Answering (unions of) conjunctive queries using random access and random-order enumeration
Nofar Carmeli, Shai Zeevi, Christoph Berkholz, Alessio Conte, Benny Kimelfeld, and Nicole Schweikardt. Answering (unions of) conjunctive queries using random access and random-order enumeration. ACM Trans. Database Syst. , 47(3):9:1--9:49, 2022
work page 2022
-
[9]
A course in computational algebraic number theory , volume 138 of Graduate texts in mathematics
Henri Cohen. A course in computational algebraic number theory , volume 138 of Graduate texts in mathematics . Springer, 1993
work page 1993
-
[10]
Deciding equivalences among conjunctive aggregate queries
Sara Cohen, Werner Nutt, and Yehoshua Sagiv. Deciding equivalences among conjunctive aggregate queries. J. ACM , 54(2):5–es, apr 2007
work page 2007
-
[11]
On a class of o(n2) problems in computational geometry
Anka Gajentaan and Mark H Overmars. On a class of o(n2) problems in computational geometry. Computational Geometry , 5(3):165--185, 1995. URL: https://www.sciencedirect.com/science/article/pii/0925772195000222, https://doi.org/https://doi.org/10.1016/0925-7721(95)00022-2 doi:https://doi.org/10.1016/0925-7721(95)00022-2
-
[12]
Which arithmetic operations can be performed in constant time in the ram model with addition?, 2022
Étienne Grandjean and Louis Jachiet. Which arithmetic operations can be performed in constant time in the ram model with addition?, 2022. URL: https://arxiv.org/abs/2206.13851, https://doi.org/10.48550/ARXIV.2206.13851 doi:10.48550/ARXIV.2206.13851
-
[13]
Green, Grigoris Karvounarakis, and Val Tannen
Todd J. Green, Grigoris Karvounarakis, and Val Tannen. Provenance semirings. In PODS , PODS '07, page 31–40, New York, NY, USA, 2007. Association for Computing Machinery
work page 2007
-
[14]
The dynamic yannakakis algorithm: Compact and efficient query processing under updates
Muhammad Idris, Martin Ugarte, and Stijn Vansummeren. The dynamic yannakakis algorithm: Compact and efficient query processing under updates. In SIGMOD , page 1259–1274, New York, NY, USA, 2017. Association for Computing Machinery
work page 2017
-
[15]
Leslie E. Trotter Jr. Algorithmic graph theory and perfect graphs, by martin c. golumbic, academic, new york, 284 pp. price: 34.00. Networks , 13(2):304--305, 1983. https://doi.org/10.1002/net.3230130214 doi:10.1002/net.3230130214
-
[16]
Curtin, Benjamin Moseley, Hung Q
Mahmoud Abo Khamis, Ryan R. Curtin, Benjamin Moseley, Hung Q. Ngo, XuanLong Nguyen, Dan Olteanu, and Maximilian Schleich. Functional aggregate queries with additive inequalities. ACM Trans. Database Syst. , 45(4):17:1--17:41, 2020
work page 2020
-
[17]
Mahmoud Abo Khamis, Hung Q. Ngo, and Atri Rudra. FAQ: questions asked frequently. In PODS , pages 13--28. ACM , 2016
work page 2016
-
[18]
Dan Olteanu and Maximilian Schleich. Factorized databases. SIGMOD Rec. , 45(2):5--16, 2016
work page 2016
-
[19]
Size bounds for factorised representations of query results
Dan Olteanu and Jakub Z\' a vodn\' y . Size bounds for factorised representations of query results. ACM Trans. Database Syst. , 40(1), mar 2015
work page 2015
-
[20]
Towards polynomial lower bounds for dynamic problems
Mihai Patrascu. Towards polynomial lower bounds for dynamic problems. In Proceedings of the Forty-Second ACM Symposium on Theory of Computing , STOC '10, page 603–610, New York, NY, USA, 2010. Association for Computing Machinery. https://doi.org/10.1145/1806689.1806772 doi:10.1145/1806689.1806772
-
[21]
The trichotomy of HAVING queries on a probabilistic database
Christopher R \' e and Dan Suciu. The trichotomy of HAVING queries on a probabilistic database. VLDB J. , 18(5):1091--1116, 2009
work page 2009
-
[22]
Dynamic complexity under definable changes
Thomas Schwentick, Nils Vortmeier, and Thomas Zeume. Dynamic complexity under definable changes. ACM Trans. Database Syst. , 43(3):12:1--12:38, 2018
work page 2018
-
[23]
Virginia Vassilevska Williams. Hardness of easy problems: Basing hardness on popular conjectures such as the strong exponential time hypothesis (invited talk). In IPEC . Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015
work page 2015
-
[24]
Algorithms for acyclic database schemes
Mihalis Yannakakis. Algorithms for acyclic database schemes. In VLDB , VLDB '81, page 82–94. VLDB Endowment, 1981
work page 1981
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.