Nearly optimal edge estimation with independent set queries
Pith reviewed 2026-05-24 23:52 UTC · model grok-4.3
The pith
An algorithm approximates the number of edges m to within (1+ε) using min(√m, n/√m) times poly(log n, 1/ε) independent set queries.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that m can be (1+ε)-approximated with min(√m, n/√m)·poly(log n, 1/ε) independent set queries and prove that min(√m, n/√m)/polylog(n) queries are necessary in the worst case, establishing near-optimality up to poly(log n, 1/ε) factors.
What carries the argument
The independent set oracle, which on input S returns whether S forms an independent set, used inside an adaptive estimation procedure whose query count automatically adjusts to the unknown scale of m.
If this is right
- Edge estimation in the independent-set query model has query complexity characterized up to polylogarithmic factors.
- The previous upper bound of min(√m, n²/m) is improved whenever m is smaller than n.
- The algorithm works for any undirected simple graph and any ε > 0.
- No substantially better algorithm exists unless the polylog factors can be removed.
Where Pith is reading between the lines
- The same query model and bound might apply to estimating other global graph parameters such as the number of triangles when the oracle can be adapted.
- In network-monitoring settings where only independence checks are cheap, the bound gives a concrete limit on how precisely edges can be counted.
- The optimality result suggests that further improvements would require either a different oracle or relaxed approximation guarantees.
Load-bearing premise
The independent set oracle returns exact yes-or-no answers with no errors for every queried subset.
What would settle it
An explicit family of graphs on n vertices together with a query algorithm that achieves (1+ε) approximation using asymptotically fewer than min(√m, n/√m) queries on every graph in the family would falsify the lower bound.
read the original abstract
We study the problem of estimating the number of edges of an unknown, undirected graph $G=([n],E)$ with access to an independent set oracle. When queried about a subset $S\subseteq [n]$ of vertices the independent set oracle answers whether $S$ is an independent set in $G$ or not. Our first main result is an algorithm that computes a $(1+\epsilon)$-approximation of the number of edges $m$ of the graph using $\min(\sqrt{m},n / \sqrt{m})\cdot\textrm{poly}(\log n,1/\epsilon)$ independent set queries. This improves the upper bound of $\min(\sqrt{m},n^2/m)\cdot\textrm{poly}(\log n,1/\epsilon)$ by Beame et al. \cite{BHRRS18}. Our second main result shows that ${\min(\sqrt{m},n/\sqrt{m}))/\textrm{polylog}(n)}$ independent set queries are necessary, thus establishing that our algorithm is optimal up to a factor of $\textrm{poly}(\log n, 1/\epsilon)$.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies estimating the number of edges m in an unknown undirected graph G=([n],E) using an independent-set oracle that, on input S, returns whether S is independent. It gives an algorithm computing a (1+ε)-approximation to m with min(√m, n/√m)·poly(log n,1/ε) queries, improving the prior min(√m,n²/m)·poly(log n,1/ε) bound of Beame et al., together with a matching Ω(min(√m,n/√m)/polylog n) information-theoretic lower bound establishing near-optimality up to poly(log n,1/ε) factors.
Significance. If the stated bounds hold, the result is significant: it supplies an explicit algorithmic construction for the improved upper bound together with a matching lower bound (up to polylog factors) in the exact independent-set oracle model. This advances sublinear graph algorithms by establishing instance-optimal query complexity for edge estimation and improves upon prior work in a clean, falsifiable way.
minor comments (2)
- The abstract and introduction should include an explicit side-by-side comparison (perhaps as a small table) of the new query bound versus the Beame et al. bound for the full range of m, to make the improvement immediately visible.
- Notation for the polylog factors in the lower bound could be standardized (e.g., consistently writing polylog(n) rather than mixing /polylog(n) and poly(log n,1/ε)) across the abstract, introduction, and theorem statements.
Simulated Author's Rebuttal
We thank the referee for their positive review, accurate summary of our results, and recommendation to accept. We are pleased that the significance of the improved upper and lower bounds for edge estimation in the independent-set oracle model was recognized.
Circularity Check
No significant circularity
full rationale
The paper's central results consist of an explicit algorithmic construction achieving the stated query upper bound and a separate information-theoretic argument establishing the matching lower bound (up to poly-log factors). No load-bearing step reduces a claimed prediction or uniqueness result to a fitted parameter, self-citation chain, or definitional tautology; the cited prior work (Beame et al.) is external and the oracle model assumptions are stated directly without circular appeal.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The graph is undirected and simple (no self-loops or multi-edges).
- domain assumption The independent set oracle returns exact boolean answers for any subset.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.