Nearly optimal edge estimation with independent set queries

Amit Levi; Erik Waingarten; Xi Chen

arxiv: 1907.04381 · v1 · pith:4O66NNMInew · submitted 2019-07-09 · 💻 cs.DS

Nearly optimal edge estimation with independent set queries

Xi Chen , Amit Levi , Erik Waingarten This is my paper

Pith reviewed 2026-05-24 23:52 UTC · model grok-4.3

classification 💻 cs.DS

keywords independent set queriesedge estimationquery complexityapproximation algorithmsgraph algorithmslower boundssublinear algorithms

0 comments

The pith

An algorithm approximates the number of edges m to within (1+ε) using min(√m, n/√m) times poly(log n, 1/ε) independent set queries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper gives an algorithm that approximates the edge count m of an unknown graph to a (1+ε) factor by repeatedly querying whether chosen subsets of vertices form independent sets. The number of such queries is bounded by min(√m, n/√m) multiplied by a polynomial in the logarithm of n and in 1/ε. A nearly matching lower bound of min(√m, n/√m) divided by a polylogarithmic factor in n is also proved, showing that the upper bound is optimal up to those extra factors. A reader would care because the result pins down the query cost of edge estimation in this restricted access model and improves the dependence on m over earlier algorithms.

Core claim

The authors establish that m can be (1+ε)-approximated with min(√m, n/√m)·poly(log n, 1/ε) independent set queries and prove that min(√m, n/√m)/polylog(n) queries are necessary in the worst case, establishing near-optimality up to poly(log n, 1/ε) factors.

What carries the argument

The independent set oracle, which on input S returns whether S forms an independent set, used inside an adaptive estimation procedure whose query count automatically adjusts to the unknown scale of m.

If this is right

Edge estimation in the independent-set query model has query complexity characterized up to polylogarithmic factors.
The previous upper bound of min(√m, n²/m) is improved whenever m is smaller than n.
The algorithm works for any undirected simple graph and any ε > 0.
No substantially better algorithm exists unless the polylog factors can be removed.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same query model and bound might apply to estimating other global graph parameters such as the number of triangles when the oracle can be adapted.
In network-monitoring settings where only independence checks are cheap, the bound gives a concrete limit on how precisely edges can be counted.
The optimality result suggests that further improvements would require either a different oracle or relaxed approximation guarantees.

Load-bearing premise

The independent set oracle returns exact yes-or-no answers with no errors for every queried subset.

What would settle it

An explicit family of graphs on n vertices together with a query algorithm that achieves (1+ε) approximation using asymptotically fewer than min(√m, n/√m) queries on every graph in the family would falsify the lower bound.

read the original abstract

We study the problem of estimating the number of edges of an unknown, undirected graph $G=([n],E)$ with access to an independent set oracle. When queried about a subset $S\subseteq [n]$ of vertices the independent set oracle answers whether $S$ is an independent set in $G$ or not. Our first main result is an algorithm that computes a $(1+\epsilon)$-approximation of the number of edges $m$ of the graph using $\min(\sqrt{m},n / \sqrt{m})\cdot\textrm{poly}(\log n,1/\epsilon)$ independent set queries. This improves the upper bound of $\min(\sqrt{m},n^2/m)\cdot\textrm{poly}(\log n,1/\epsilon)$ by Beame et al. \cite{BHRRS18}. Our second main result shows that ${\min(\sqrt{m},n/\sqrt{m}))/\textrm{polylog}(n)}$ independent set queries are necessary, thus establishing that our algorithm is optimal up to a factor of $\textrm{poly}(\log n, 1/\epsilon)$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper tightens the independent-set query bound for (1+ε)-approximating m from min(√m, n²/m) to min(√m, n/√m) and adds a nearly matching lower bound.

read the letter

The core advance is a cleaner upper bound on independent-set queries for approximating the number of edges m. The new algorithm uses min(√m, n/√m) · poly(log n, 1/ε) queries instead of the earlier min(√m, n²/m) bound from Beame et al. They also prove that min(√m, n/√m)/polylog(n) queries are necessary in the worst case, so the result is tight up to the usual polylog and 1/ε factors. Both pieces are new relative to the cited prior work. The construction is an explicit algorithmic improvement and the lower bound is information-theoretic, which keeps the argument clean. The model assumptions are standard: the oracle answers exactly whether a queried set is independent, the graph is simple and undirected, and queries can be adaptive. No circularity or invented parameters show up in the abstract. The main soft spot is that the full proofs and algorithm details are not visible here, so the precise handling of the log factors and the case split between the √m and n/√m regimes cannot be checked directly. The improvement looks plausible on the stated claims, but a referee would need to verify the analysis does not hide extra logarithmic overhead or restrict the graph class. This is a narrow but useful refinement inside sublinear algorithms and property testing. Readers who already work on oracle models for graphs will get immediate value from the tighter bound and the matching lower bound. It is not a broad result that changes how people think about graph estimation outside this model. The paper is coherent on its own terms and deserves a serious referee rather than a desk reject; the claims are specific enough that a review can focus on whether the upper-bound construction actually delivers the stated query count.

Referee Report

0 major / 2 minor

Summary. The paper studies estimating the number of edges m in an unknown undirected graph G=([n],E) using an independent-set oracle that, on input S, returns whether S is independent. It gives an algorithm computing a (1+ε)-approximation to m with min(√m, n/√m)·poly(log n,1/ε) queries, improving the prior min(√m,n²/m)·poly(log n,1/ε) bound of Beame et al., together with a matching Ω(min(√m,n/√m)/polylog n) information-theoretic lower bound establishing near-optimality up to poly(log n,1/ε) factors.

Significance. If the stated bounds hold, the result is significant: it supplies an explicit algorithmic construction for the improved upper bound together with a matching lower bound (up to polylog factors) in the exact independent-set oracle model. This advances sublinear graph algorithms by establishing instance-optimal query complexity for edge estimation and improves upon prior work in a clean, falsifiable way.

minor comments (2)

The abstract and introduction should include an explicit side-by-side comparison (perhaps as a small table) of the new query bound versus the Beame et al. bound for the full range of m, to make the improvement immediately visible.
Notation for the polylog factors in the lower bound could be standardized (e.g., consistently writing polylog(n) rather than mixing /polylog(n) and poly(log n,1/ε)) across the abstract, introduction, and theorem statements.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive review, accurate summary of our results, and recommendation to accept. We are pleased that the significance of the improved upper and lower bounds for edge estimation in the independent-set oracle model was recognized.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper's central results consist of an explicit algorithmic construction achieving the stated query upper bound and a separate information-theoretic argument establishing the matching lower bound (up to poly-log factors). No load-bearing step reduces a claimed prediction or uniqueness result to a fitted parameter, self-citation chain, or definitional tautology; the cited prior work (Beame et al.) is external and the oracle model assumptions are stated directly without circular appeal.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The claim rests on standard domain assumptions of the query model and graph class; no free parameters or invented entities are introduced.

axioms (2)

domain assumption The graph is undirected and simple (no self-loops or multi-edges).
Implicit in the problem statement of estimating edges in G=([n],E).
domain assumption The independent set oracle returns exact boolean answers for any subset.
The query model is defined with perfect oracle responses.

pith-pipeline@v0.9.0 · 5722 in / 1258 out tokens · 27939 ms · 2026-05-24T23:52:16.155036+00:00 · methodology

Nearly optimal edge estimation with independent set queries

Core claim

What carries the argument

If this is right

Where Pith is reading between the lines

Load-bearing premise

What would settle it

discussion (0)