Query Lower Bounds for Correlation Clustering under Memory Constraints
Pith reviewed 2026-05-25 04:47 UTC · model grok-4.3
The pith
Approximating correlation clustering to additive error εn² requires Ω(n/ε²) adjacency-matrix queries.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper's central claim is that in order to output a partition of the vertices whose cost is within additive εn² of the optimum, any algorithm must make Ω(n/ε²) adjacency-matrix queries. This bound is tight. Under memory constraints, even approximating the value of the optimal clustering cost requires asymptotically more than n/ε² queries in the random query model. The paper further proves a query lower bound in the general graph model allowing adjacency-matrix, neighbor, and degree queries.
What carries the argument
The adjacency-matrix query lower bound of Ω(n/ε²) for producing an εn²-approximate correlation clustering partition.
If this is right
- No algorithm can output an εn²-approximate partition using fewer adjacency queries.
- Memory-constrained algorithms must pay more queries to approximate the cost.
- The general graph query model also imposes nontrivial query requirements.
- This opens the study of memory-query tradeoffs for other graph problems.
Where Pith is reading between the lines
- These bounds may extend to other partitioning problems like community detection.
- Matching upper bounds would complete the picture on query complexity.
- Practitioners might need to consider different query models when designing systems for graph clustering.
- The results suggest that sublinear query algorithms are impossible for this approximation guarantee.
Load-bearing premise
The random query model and general graph query model correctly capture the memory constraints faced by algorithms.
What would settle it
An algorithm that outputs a partition with additive error at most εn² using o(n/ε²) adjacency-matrix queries would falsify the lower bound.
Figures
read the original abstract
This work initiates the study of memory-query tradeoffs for graph problems, with a focus on correlation clustering. Correlation clustering asks for a partition of the vertices that minimizes disagreements: non-edges inside clusters plus edges across clusters. Our first result is a tight query lower bound: to output a partition whose cost approximates the optimum up to an additive error of $\varepsilon n^2$, any algorithm requires $\Omega(n/\varepsilon^2)$ adjacency-matrix queries. Under memory constraints, we show that even for the seemingly easier task of approximating the optimal clustering cost (without producing a partition), any algorithm in the random query model must make $\gg n/\varepsilon^2$ adjacency-matrix queries. Finally, we prove the first general graph model query lower bound for correlation clustering, where algorithms are allowed adjacency-matrix, neighbor, and degree queries. The latter two bounds are not yet tight, leaving room for sharper results.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript initiates the study of memory-query tradeoffs for graph problems by focusing on correlation clustering. It proves a tight Ω(n/ε²) lower bound on adjacency-matrix queries for any algorithm outputting a partition whose cost approximates the optimum to within additive εn². It further establishes that, under memory constraints in the random query model, even approximating the optimal cost value (without outputting the partition) requires ≫ n/ε² adjacency-matrix queries, and it gives the first lower bound in the general graph query model that permits adjacency-matrix, neighbor, and degree queries (the latter two bounds are not claimed to be tight).
Significance. If the proofs hold, the work establishes the first memory-aware query lower bounds for a canonical graph partitioning problem and identifies a tight information-theoretic barrier in the standard adjacency-matrix model. The explicit separation between producing a partition and merely estimating its cost under memory limits is a useful conceptual contribution that could seed similar trade-off studies for other graph problems.
minor comments (3)
- [Abstract] The notation '≫ n/ε²' in the abstract and introduction should be replaced by a precise asymptotic statement (e.g., Ω(n/ε² · ω(1)) or a concrete super-constant factor) so that readers can immediately compare it with the tight bound.
- The manuscript would benefit from an explicit statement, early in the introduction or model section, of the precise memory bound (in bits or words) that triggers the random-query-model lower bound; without it the phrase 'under memory constraints' remains informal.
- A short table or paragraph contrasting the three models (adjacency-matrix only, random queries with memory limit, general graph queries) and the corresponding lower-bound statements would improve readability.
Simulated Author's Rebuttal
We thank the referee for their positive summary, significance assessment, and recommendation of minor revision. No specific major comments were provided in the report.
Circularity Check
No significant circularity detected
full rationale
The paper establishes query lower bounds via standard information-theoretic and adversary arguments in the random query model and general graph query model. These techniques are independent of the target approximation result and do not rely on self-citations, fitted parameters renamed as predictions, or any self-definitional reductions. The abstract and visible claims show no load-bearing steps that collapse to the inputs by construction; the Ω(n/ε²) bound follows from external query complexity methods rather than internal redefinitions.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Standard definitions and assumptions of adjacency-matrix, random, and general graph query models in query complexity
Reference graph
Works this paper leans on
-
[1]
Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC) , pages =
Combinatorial correlation clustering , author =. Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC) , pages =. 2024 , timestamp =. doi:10.1145/3618260.3649712 , _bib2doi_selected =
-
[2]
Advances in Neural Information Processing Systems (NeurIPS) , volume =
Streaming Algorithms and Lower Bounds for Estimating Correlation Clustering Cost , author =. Advances in Neural Information Processing Systems (NeurIPS) , volume =. 2023 , timestamp =
work page 2023
-
[3]
Aditya Bhaskara and Samira Daruki and Suresh Venkatasubramanian , booktitle =. Sublinear Algorithms for. 2018 , organization =. doi:10.4230/LIPIcs.ICALP.2018.16 , publisher =
-
[4]
Journal of Computer and System Sciences , volume =
Clustering with qualitative information , author =. Journal of Computer and System Sciences , volume =. 2005 , timestamp =. doi:10.1016/j.jcss.2004.10.012 , _bib2doi_selected =
-
[5]
13th Innovations in Theoretical Computer Science Conference,
Sublinear Time and Space Algorithms for Correlation Clustering via Sparse-Dense Decompositions , author =. 13th Innovations in Theoretical Computer Science Conference,. 2022 , timestamp =. doi:10.4230/LIPIcs.ITCS.2022.10 , publisher =
-
[6]
IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS) , pages =
Handling correlated rounding error via preclustering: A 1.73-approximation for correlation clustering , author =. IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS) , pages =. 2023 , organization =. doi:10.1109/FOCS57990.2023.00065 , _bib2doi_selected =
-
[7]
Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC) , pages =
Approximating maximum matching requires almost quadratic time , author =. Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC) , pages =. 2024 , timestamp =. doi:10.1145/3618260.3649785 , _bib2doi_selected =
-
[8]
Correlation clustering , author =. Machine Learning , volume =. 2004 , timestamp =. doi:10.1023/B:MACH.0000033116.57574.95 , _bib2doi_selected =
-
[9]
Proceedings of the World Wide Web Conference (WWW) , pages =
A correlation clustering framework for community detection , author =. Proceedings of the World Wide Web Conference (WWW) , pages =. 2018 , timestamp =. doi:10.1145/3178876.3186110 , _bib2doi_selected =
-
[10]
Advances in Neural Information Processing Systems , volume =
Higher-Order Correlation Clustering for Image Segmentation , author =. Advances in Neural Information Processing Systems , volume =. 2011 , timestamp =
work page 2011
-
[11]
Proceedings of the VLDB Endowment , volume =
Crowdsourcing algorithms for entity resolution , author =. Proceedings of the VLDB Endowment , volume =. 2014 , publisher =. doi:10.14778/2732977.2732982 , _bib2doi_selected =
-
[12]
Proceedings of the 17th international conference on World Wide Web , pages =
A graph-theoretic approach to webpage segmentation , author =. Proceedings of the 17th international conference on World Wide Web , pages =. 2008 , timestamp =. doi:10.1145/1367497.1367549 , _bib2doi_selected =
-
[13]
Proceedings of the Second ACM International Conference on Web Search and Data Mining , pages =
Generating labels from clicks , author =. Proceedings of the Second ACM International Conference on Web Search and Data Mining , pages =. 2009 , timestamp =. doi:10.1145/1498759.1498824 , _bib2doi_selected =
-
[14]
A proof of Alon's second eigenvalue conjecture and related problems , author =. 2008 , publisher =
work page 2008
-
[15]
The Annals of Probability , volume =
The spectral gap of dense random regular graphs , author =. The Annals of Probability , volume =. 2019 , _bib2doi_finished =
work page 2019
-
[16]
Discrete Mathematics , volume =
Explicit construction of linear sized tolerant networks , author =. Discrete Mathematics , volume =. 1988 , timestamp =. doi:10.1016/0012-365X(88)90189-6 , _bib2doi_selected =
-
[17]
Linear Algebra and its applications , volume =
Interlacing eigenvalues and graphs , author =. Linear Algebra and its applications , volume =. 1995 , _bib2doi_finished =
work page 1995
-
[18]
Streaming Lower Bounds for Approximating
Michael Kapralov and Sanjeev Khanna and Madhu Sudan , booktitle =. Streaming Lower Bounds for Approximating. 2015 , organization =. doi:10.1137/1.9781611973730.84 , publisher =
-
[19]
Proceedings of the 39th Annual ACM Symposium on Theory of Computing (STOC) , pages =
Exponential separations for one-way quantum communication complexity, with applications to cryptography , author =. Proceedings of the 39th Annual ACM Symposium on Theory of Computing (STOC) , pages =. 2007 , timestamp =. doi:10.1145/1250790.1250866 , _bib2doi_selected =
-
[20]
The influence of variables on Boolean functions , author =. 1989 , publisher =
work page 1989
- [21]
- [22]
-
[23]
Divergence measures based on the Shannon entropy , author =. 1991 , publisher =. doi:10.1109/18.61115 , url =
-
[24]
Advances in Neural Information Processing Systems (NeurIPS) , volume =
Correlation Clustering with Adaptive Similarity Queries , author =. Advances in Neural Information Processing Systems (NeurIPS) , volume =. 2019 , timestamp =
work page 2019
-
[25]
SIAM Journal on Computing , volume =
Tight bounds for testing bipartiteness in general graphs , author =. SIAM Journal on Computing , volume =. 2004 , publisher =. doi:10.1137/S0097539703436424 , _bib2doi_selected =
-
[26]
11th Innovations in Theoretical Computer Science Conference,
The Random-Query Model and the Memory-Bounded Coupon Collector , author =. 11th Innovations in Theoretical Computer Science Conference,. 2020 , timestamp =. doi:10.4230/LIPIcs.ITCS.2020.20 , publisher =
-
[27]
Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =
Time-space lower bounds for bounded-error computation in the random-query model , author =. Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =. 2024 , organization =. doi:10.1137/1.9781611977912.103 , _bib2doi_selected =
-
[28]
Dubhashi and Desh Ranjan , journal =
Devdatt P. Dubhashi and Desh Ranjan , journal =. Balls and bins:. 1998 , timestamp =. doi:10.1002/(SICI)1098-2418(199809)13:2\ pages =
-
[29]
The Annals of Statistics , pages =
Negative association of random variables with applications , author =. The Annals of Statistics , pages =. 1983 , publisher =
work page 1983
-
[30]
6th Conference on Innovations in Theoretical Computer Science (ITCS) , pages =
Sketching cuts in graphs and hypergraphs , author =. 6th Conference on Innovations in Theoretical Computer Science (ITCS) , pages =. 2015 , timestamp =. doi:10.1145/2688073.2688093 , _bib2doi_selected =
-
[31]
2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS) , pages =
The sketching complexity of graph and hypergraph counting , author =. 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS) , pages =. 2018 , organization =. doi:10.1109/FOCS.2018.00059 , _bib2doi_selected =
-
[32]
Proceedings of the 22nd annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =
The streaming complexity of cycle counting, sorting by reversals, and other problems , author =. Proceedings of the 22nd annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =. 2011 , organization =. doi:10.1137/1.9781611973082.2 , _bib2doi_selected =
-
[33]
Proceedings of the 25th annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =
Approximating matching size from random streams , author =. Proceedings of the 25th annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =. 2014 , organization =. doi:10.1137/1.9781611973402.55 , _bib2doi_selected =
-
[34]
Proceedings of the Twenty-Eighth Annual
(1+ (1))-Approximation to MAX-CUT Requires Linear Space , author =. Proceedings of the Twenty-Eighth Annual. 2017 , organization =. doi:10.1137/1.9781611974782.112 , publisher =
-
[35]
Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing , pages =
An optimal space lower bound for approximating MAX-CUT , author =. Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing , pages =. 2019 , timestamp =. doi:10.1145/3313276.3316364 , _bib2doi_selected =
-
[36]
66th IEEE Symposium on Foundations of Computer Science (FOCS) 2025 , year =
Multi-Pass Streaming Lower Bounds for Approximating Max-Cut , author =. 66th IEEE Symposium on Foundations of Computer Science (FOCS) 2025 , year =
work page 2025
-
[37]
Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages =
Graph streaming lower bounds for parameter estimation and property testing via a streaming XOR lemma , author =. Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages =. 2021 , timestamp =. doi:10.1145/3406325.3451110 , _bib2doi_selected =
-
[38]
Approximation, Randomization, and Combinatorial Optimization
Streaming Hardness of Unique Games , author =. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques,. 2019 , organization =. doi:10.4230/LIPIcs.APPROX-RANDOM.2019.5 , publisher =
-
[39]
Approximation, Randomization, and Combinatorial Optimization
Streaming Complexity of Approximating Max 2CSP and Max Acyclic Subgraph , author =. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques,. 2017 , organization =. doi:10.4230/LIPIcs.APPROX-RANDOM.2017.8 , publisher =
-
[40]
2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS) , pages =
Optimal streaming approximations for all boolean max-2csps and max-ksat , author =. 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS) , pages =. 2020 , organization =. doi:10.1109/FOCS46700.2020.00039 , _bib2doi_selected =
-
[41]
Conference on Learning Theory, 2-5 July 2022, London,
Hierarchical Clustering in Graph Streams: Single-Pass Algorithms and Space Lower Bounds , author =. Conference on Learning Theory, 2-5 July 2022, London,. 2022 , organization =
work page 2022
-
[42]
Infinite and finite sets , year =
On the factrization of the complete uniform hypergraphs , author =. Infinite and finite sets , year =
-
[43]
Journal of Combinatorial Theory, Series A , volume =
Some intersection theorems for ordered sets and graphs , author =. Journal of Combinatorial Theory, Series A , volume =. 1986 , timestamp =. doi:10.1016/0097-3165(86)90019-1 , _bib2doi_selected =
-
[44]
Proceedings of the VLDB Endowment , volume =
Scalable community detection via parallel correlation clustering , author =. Proceedings of the VLDB Endowment , volume =. 2021 , publisher =. doi:10.14778/3476249.3476282 , _bib2doi_selected =
-
[45]
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , volume =
Cluster ensembles , author =. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , volume =. 2011 , publisher =. doi:10.1002/widm.32 , _bib2doi_selected =
-
[46]
Journal of the ACM (JACM) , volume =
Aggregating inconsistent information: ranking and clustering , author =. Journal of the ACM (JACM) , volume =. 2008 , publisher =. doi:10.1145/1411509.1411513 , _bib2doi_selected =
-
[47]
Proceedings of the 47th annual ACM Symposium on Theory of Computing (STOC) , pages =
Near optimal lp rounding algorithm for correlationclustering on complete and complete k-partite graphs , author =. Proceedings of the 47th annual ACM Symposium on Theory of Computing (STOC) , pages =. 2015 , timestamp =. doi:10.1145/2746539.2746604 , _bib2doi_selected =
-
[48]
Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva
Correlation clustering with sherali-adams , author =. 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS) , pages =. 2022 , organization =. doi:10.1109/FOCS54457.2022.00068 , _bib2doi_selected =
-
[49]
Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC) , pages =
Understanding the cluster linear program for correlation clustering , author =. Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC) , pages =. 2024 , timestamp =. doi:10.1145/3618260.3649749 , _bib2doi_selected =
-
[50]
Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva
Almost 3-approximate correlation clustering in constant rounds , author =. 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS) , pages =. 2022 , organization =. doi:10.1109/FOCS54457.2022.00074 , _bib2doi_selected =
-
[51]
Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =
Single-pass streaming algorithms for correlation clustering , author =. Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =. 2023 , organization =. doi:10.1137/1.9781611977554.ch33 , _bib2doi_selected =
-
[52]
Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =
Breaking 3-factor approximation for correlation clustering in polylogarithmic rounds , author =. Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =. 2024 , organization =. doi:10.1137/1.9781611977912.143 , _bib2doi_selected =
-
[53]
2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS) , pages =
Fully dynamic maximal independent set with polylogarithmic update time , author =. 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS) , pages =. 2019 , organization =. doi:10.1109/FOCS.2019.00032 , _bib2doi_selected =
-
[54]
Single-Pass Pivot Algorithm for Correlation Clustering. Keep it simple! , author =. Advances in Neural Information Processing Systems (NeurIPS) , volume =. 2023 , timestamp =
work page 2023
-
[55]
Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =
A (3+ ) -Approximate Correlation Clustering Algorithm in Dynamic Streams , author =. Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages =. 2024 , organization =. doi:10.1137/1.9781611977912.101 , _bib2doi_selected =
-
[56]
International Conference on Machine Learning (ICML) , pages =
Correlation clustering in constant many parallel rounds , author =. International Conference on Machine Learning (ICML) , pages =. 2021 , organization =
work page 2021
-
[57]
Journal of the ACM (JACM) , volume =
Fast learning requires good memory: A time-space lower bound for parity learning , author =. Journal of the ACM (JACM) , volume =. 2018 , publisher =. doi:10.1145/3186563 , _bib2doi_selected =
-
[58]
34th Computational Complexity Conference,
Time-Space Lower Bounds for Two-Pass Learning , author =. 34th Computational Complexity Conference,. 2019 , timestamp =. doi:10.4230/LIPIcs.CCC.2019.22 , publisher =
-
[59]
Kothari and Ran Raz , booktitle =
Sumegha Garg and Pravesh K. Kothari and Ran Raz , booktitle =. Time-Space Tradeoffs for Distinguishing Distributions and Applications to Security of Goldreich's. 2020 , organization =. doi:10.4230/LIPIcs.APPROX/RANDOM.2020.21 , publisher =
-
[60]
Memory-Sample Lower Bounds for Learning Parity with Noise , author =. CoRR , volume =. 2021 , timestamp =. doi:10.48550/arXiv.2107.02320 , url =. 2107.02320 , _bib2doi_selected =
-
[61]
Proceedings of the 57th Annual ACM Symposium on Theory of Computing (STOC) , pages =
Solving the Correlation Cluster LP in Sublinear Time , author =. Proceedings of the 57th Annual ACM Symposium on Theory of Computing (STOC) , pages =. 2025 , timestamp =. doi:10.1145/3717823.3718181 , _bib2doi_selected =
-
[62]
2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS) , pages =
A time-space lower bound for a large class of learning problems , author =. 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS) , pages =. 2017 , organization =. doi:10.1109/FOCS.2017.73 , _bib2doi_selected =
-
[63]
SIAM Journal on Discrete Mathematics , volume =
Testing triangle-freeness in general graphs , author =. SIAM Journal on Discrete Mathematics , volume =. 2008 , publisher =. doi:10.1137/07067917X , _bib2doi_selected =
-
[64]
Journal of the ACM (JACM) , volume =
Property testing and its connection to learning and approximation , author =. Journal of the ACM (JACM) , volume =. 1998 , publisher =. doi:10.1145/285055.285060 , _bib2doi_selected =
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.