On the Average-Case Performance of Greedy for Maximum Coverage

Eric Balkanski; Flore Sentenac; Jason Chatzitheodorou

arxiv: 2604.24884 · v1 · submitted 2026-04-27 · 💻 cs.DS

On the Average-Case Performance of Greedy for Maximum Coverage

Eric Balkanski , Jason Chatzitheodorou , Flore Sentenac This is my paper

Pith reviewed 2026-05-08 01:51 UTC · model grok-4.3

classification 💻 cs.DS

keywords maximum coveragegreedy algorithmaverage-case analysisapproximation ratiorandom modeldifferential equation methodleft-regular model

0 comments

The pith

In a left-regular random model for maximum coverage, greedy's expected approximation ratio always exceeds its worst-case 1-1/e bound by a positive constant and reaches nearly 1 under either of two simple conditions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies the average-case behavior of the greedy algorithm for maximum coverage when set systems are drawn from a left-regular random model. It establishes that the expected ratio is strictly better than 1 minus 1/e across all parameter regimes. Under a density condition or a balance condition on the model, the expected ratio approaches 1 as the number of elements and sets grows large. The work also identifies a regime in which the ratio remains bounded away from 1, no better than 0.94 in expectation. These results provide an average-case explanation for the near-optimal performance of greedy often observed on practical coverage instances.

Core claim

In the left-regular random model, the expected approximation ratio of greedy for maximum coverage is at least 1-1/e plus a fixed positive constant for every choice of parameters. Either a minimum-density condition or a balance condition on the random sets is sufficient to make this expected ratio arbitrarily close to 1 for sufficiently large instances. A parameter regime exists where the expected ratio is at most 0.94. The proofs introduce a novel application of the differential equation method to track the greedy process and a reduction linking coverage gaps to maximum matching in Erdős-Rényi graphs.

What carries the argument

The left-regular random model together with a reduction of coverage gaps to maximum matching in an associated Erdős-Rényi graph and the differential equation method for analyzing the greedy selection process.

If this is right

Greedy is guaranteed to outperform its worst-case analysis in expectation on all instances from this random model.
For large enough graphs satisfying either the density or balance condition, greedy returns a solution whose expected value is arbitrarily close to optimal.
There remain parameter settings inside the model where the expected performance of greedy stays bounded below 1.
The differential-equation and matching-reduction techniques can be reused to analyze other greedy processes on random set systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If typical real-world coverage data is statistically close to the left-regular model, the empirical success of greedy is explained by average-case rather than worst-case behavior.
The 0.94 upper bound indicates that even random instances can be constructed to keep greedy from reaching optimality in expectation.
Testing how closely actual coverage benchmarks match the model's degree distribution would directly validate or refute the model's relevance.
The matching connection may extend to average-case analysis of greedy for related problems such as set packing.

Load-bearing premise

The left-regular random model captures the structure of the real-world coverage instances on which greedy empirically achieves ratios close to 1.

What would settle it

Generate many large instances from the left-regular model in the regime claimed to yield ratio at most 0.94 and compute the average ratio achieved by greedy; if it consistently exceeds 0.95 the upper bound claim is false.

Figures

Figures reproduced from arXiv: 2604.24884 by Eric Balkanski, Flore Sentenac, Jason Chatzitheodorou.

**Figure 1.** Figure 1: The computation is carried out for n = m = 100 and the ratio between the value of Greedy and Opt is calculated by averaging over 50 runs. Opt was implemented in Python using Gurobi to solve the integer program. The shaded region depicts the standard error for our estimation. B Generalization Of Technical Tools In this section, we show that our main techniques from the proof of Theorems 1 and 2, can be used… view at source ↗

**Figure 2.** Figure 2: The computation is carried out for n = m = 100, d = 6 and the marginal contributions of Greedy and the fixed set are calculated by averaging over 200 runs. C.2 Analysis For The Different Regions Lemma 5. Let ϵ ∈ (0, 1), d ≥ 204 ϵ 8 , n ≥ d, k ∈ h ϵn 2d , 2n ϵd i , and B ∼ LRR(n, d). Then we have E [|N(Gk)|] ≥ (1 − ϵ) · E [|N(Ok)|]. Proof. To upper bound the optimal solution, first notice that |N(Ok)| = max… view at source ↗

read the original abstract

For the classical maximum coverage problem, the greedy algorithm achieves a worst-case $1-1/e$ approximation, which is optimal unless $\text{P} = \text{NP}$. The notion of coverage appears in a wide range of optimization tasks, where empirical evaluations indicate approximation ratios close to $1$ for the greedy algorithm on real data. Random models have provided average-case justifications for the empirical performance of many well-known algorithms, but little is known about the average-case performance of greedy for maximum coverage. We analyze the expected approximation ratio of the greedy algorithm in a random model, which we call the left-regular random model. We first show that, for all parameter settings of this model, the expected approximation ratio of the greedy algorithm improves by a constant over its worst-case $1-1/e$ guarantee. We then identify two simple conditions, either of which ensures that the expected approximation ratio is close to $1$ for sufficiently large graphs. Finally, we show that there is a regime where greedy does not achieve an expected approximation better than $0.94$. To obtain these results, we develop analytical tools, including a novel application of the differential equation method and a connection to maximum matching in Erd\H{o}s-R\'enyi graphs, which may be of independent interest for other random models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces a left-regular random model and proves greedy gets a constant-factor improvement over 1-1/e in expectation for max coverage, plus near-1 ratios under two explicit conditions.

read the letter

The central result is that in this new left-regular random model, the expected approximation ratio of greedy for maximum coverage is always strictly better than the worst-case 1-1/e by some constant, and it reaches close to 1 for large graphs under either of two simple conditions on the parameters. They also identify a regime where the expected ratio is at most 0.94, which keeps the claims grounded. The analysis relies on the differential equation method for tracking the process and a reduction to maximum matching in an auxiliary Erdős-Rényi graph; both are standard tools but the specific connection here is fresh and seems to deliver the bounds cleanly. The model parameters are stated explicitly, and the proofs appear to avoid circularity or post-hoc fitting. What stands out is the honest upper bound alongside the positive results, which is rare in average-case papers. The main limitation is that the left-regular model is introduced without much evidence that it matches the structure of real coverage instances where greedy empirically performs near-optimally. Random models often need that bridge to be useful beyond theory, and here it is missing. The work stays within expectation and does not address concentration or the probability of bad draws inside the model. This is aimed at researchers in approximation algorithms and submodular optimization who want to move beyond worst-case analysis. It is technically solid enough to deserve a serious referee, mainly to verify the differential-equation derivations and the matching reduction in detail. I would send it out for review rather than desk-reject.

Referee Report

0 major / 2 minor

Summary. The paper analyzes the expected approximation ratio of the greedy algorithm for the maximum coverage problem in the left-regular random model. It proves that this ratio improves by a constant factor over the worst-case 1-1/e guarantee for all parameter settings, identifies two conditions ensuring the ratio approaches 1 for large graphs, and exhibits a regime where the ratio is at most 0.94. The proofs rely on the differential equation method and a reduction to maximum matching in Erdős-Rényi graphs.

Significance. If the results hold, this provides the first rigorous average-case justification for the near-optimal empirical performance of greedy on coverage problems. The novel application of the differential equation method and the connection to maximum matching in random graphs are strengths that may be of independent interest for analyzing other algorithms on random instances. The work delivers explicit, falsifiable guarantees within the model.

minor comments (2)

The abstract states that two simple conditions ensure near-1 ratios but does not name them; adding a brief parenthetical description would improve accessibility without lengthening the abstract unduly.
In the model definition (likely §2), the choice of left-regularity is presented without explicit comparison to other random set-system models; a short paragraph relating it to existing models would clarify its novelty and scope.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. The report accurately captures our main contributions on the expected approximation ratio of greedy for maximum coverage in the left-regular random model. No specific major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper defines the left-regular random model explicitly and derives its claims about the greedy algorithm's expected approximation ratio using the differential equation method and a reduction to maximum matching in an auxiliary Erdős-Rényi graph. These are standard, externally verifiable techniques applied to the model's parameters without any reduction of the target results to fitted inputs, self-definitions, or load-bearing self-citations. The constant-factor improvement over 1-1/e, the near-1 ratios under stated conditions, and the 0.94 upper bound all follow from the random-graph analysis rather than by construction from the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

The central claims rest on the definition of the left-regular random model and the validity of the differential equation method for tracking the greedy process; no free parameters are explicitly fitted in the abstract.

axioms (2)

standard math The differential equation method accurately models the evolution of coverage in the left-regular random model
Invoked to obtain the expected approximation ratio bounds.
domain assumption The reduction to maximum matching in Erdős-Rényi graphs preserves the approximation ratio expectation
Used as an analytical tool for one of the results.

invented entities (1)

left-regular random model no independent evidence
purpose: Random input distribution for average-case analysis of maximum coverage
New model introduced to capture structured random instances; no independent evidence provided that it matches real data.

pith-pipeline@v0.9.0 · 5536 in / 1513 out tokens · 59125 ms · 2026-05-08T01:51:03.307763+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages

[1]

Trend Hypothesis: ⏐⏐E[Y(i+ 1)−Y(i)|Y(i)]− ( 1−nd m Y(i) n )d⏐⏐≤2d2 m by Lemma 14 forr = m−Y(i)dsincem≥2d2,

work page
[2]

Lipschitz Hypothesis:F(y) = ( 1−nd my )d is nd2 m -Lipschitz since: ⏐⏐F(y 1)−F(y2) ⏐⏐ = ⏐⏐ ( 1−nd my1 )d − ( 1−nd my2 )d⏐⏐ = nd m· ⏐⏐ (( 1−nd my1 )d−1 +···+ ( 1−nd my2 )d−1) ⏐⏐· ⏐⏐y2−y1 ⏐⏐ ≤nd m· ⏐⏐1 +···+ 1 ⏐⏐· ⏐⏐y2−y1 ⏐⏐ = nd2 m · ⏐⏐y1−y2 ⏐⏐,

work page
[3]

Boundedness Hypothesis: ⏐⏐Y (i + 1)−Y(i) ⏐⏐≤1as only nodeui+1 can be accepted on iteration i+ 1,

work page
[4]

We chooseλ= √ 8 log (2n) n

Initial Condition: Y (0) = ˆy= 0so any λ≥2 n≥d2 m min{T,m nd2}+ R n = d2 m min{1, m nd2}+ 1 n works. We chooseλ= √ 8 log (2n) n . ModelY(i)byny ( i n ) and we obtain the following differential equation: y′(t) = ( 1−nd m·y(t) )d , y(0) = 0. By Lemma 16 with probability1−2e−nλ2/8T = 1−1 n we have that: ⏐⏐Y(i)−ny(i/n) ⏐⏐<3e nd2 m T·λn= 3e nd2 m · √ 8nlog(2n)...

work page
[5]

Then, for allt∈[k],Gt(E1) ={a1,...,at}if and only ifGt(E2) ={a1,...,at}. Proof. We show by induction that for allt∈[k], ifGt(E1) ={a1,...,at}, thenGt(E2) ={a1,...,at}. The other direction follows by symmetry. For the base case,t = 1, we have that by construction of E1,E 2,|NE1(ui)|=|NE2(ui)|for alli∈[n], therefore the algorithm works the same in both grap...

work page
[6]

The marginal contribution ofa was higher than that ofat+1, i.e.|NE2(a)\NE2(Gt(E2))|> |NE2(at+1)\N E2(Gt(E2))|, or

work page
[7]

26 We proceed to show that the marginal contribution ofa,at+1 on iterationt+ 1is the same regardless of which graph we runGreedyon

The marginal contribution ofa was equal to that ofat+1, but a was chosen due to the tie-breaking rule. 26 We proceed to show that the marginal contribution ofa,at+1 on iterationt+ 1is the same regardless of which graph we runGreedyon. The marginal contribution ofaonB 2 is: |NE2(a)\N E2(Gt(E2))|=|NE2(a)\N E2(Gt(E1))| =|NE2(a)\N E1(Gt(E1))| =|NE2(a)|−|NE2(a...

work page
[8]

Trend Hypothesis: ⏐⏐E[Y(i+ 1)−Y(i)|Y(i)]− ( 1−dY(i) n )d⏐⏐≤2d2 n by Lemma 34 forr = n−Y(i)dsincen≥2d2, 32

work page
[9]

Lipschitz Hypothesis:F(y) = (1−dy)d isd 2-Lipschitz since: ⏐⏐F(y 1)−F(y2) ⏐⏐ = ⏐⏐(1−dy1)d−(1−dy2)d⏐⏐ =d· ⏐⏐ ( (1−dy1)d−1+···+ (1−dy2)d−1)⏐⏐· ⏐⏐y2−y1 ⏐⏐ ≤d· ⏐⏐1 +···+ 1 ⏐⏐· ⏐⏐y2−y1 ⏐⏐ =d 2· ⏐⏐y1−y2 ⏐⏐,

work page
[10]

Boundedness Hypothesis: ⏐⏐Y(i+ 1)−Y(i) ⏐⏐≤1as only one accept can occur,

work page
[11]

We chooseλ= √ 8 log (2n) n

Initial Condition:Y (0) = ˆy= 0so any λ≥2 n = d2 n min{1, 1 d2}+ 1 n = d2 n min{1, 1 d2}+ 1 n works. We chooseλ= √ 8 log (2n) n . ModelY(i)byny ( i n ) and we obtain the following differential equation: y′(t) = (1−d·y(t))d, y(0) = 0⇒y(t) =1 d· ( 1−(1 +d(d−1)·t)−1 d−1 ) . By Lemma 16 with probability1−2e−nλ2/8 = 1−1 n we have that: ⏐⏐Y(i)−ny(i/n) ⏐⏐<3e d2 ...

work page
[12]

∑k i=1di̸∈[(ϵ/2)m,(ϵ/2)−1m]or

work page
[13]

Then greedy achieves, in expectation, a1−ϵapproximation

1 k ∑k i=1di≥204 max(1,(n/m)2) ϵ8 . Then greedy achieves, in expectation, a1−ϵapproximation. Proof.Follows by Lemmas 50 and 51. Theorem 6(Power Law Left Degrees).Let n,m∈N , X1,X 2,...,Xn be i.i.d. samples from Pareto distribution with parametersa >1and xmin = 1. Also let d1≥···≥dn be their order statistics, ϵ∈(0, 1)arbitrary and draw graphB from random m...

work page

[1] [1]

Trend Hypothesis: ⏐⏐E[Y(i+ 1)−Y(i)|Y(i)]− ( 1−nd m Y(i) n )d⏐⏐≤2d2 m by Lemma 14 forr = m−Y(i)dsincem≥2d2,

work page

[2] [2]

Lipschitz Hypothesis:F(y) = ( 1−nd my )d is nd2 m -Lipschitz since: ⏐⏐F(y 1)−F(y2) ⏐⏐ = ⏐⏐ ( 1−nd my1 )d − ( 1−nd my2 )d⏐⏐ = nd m· ⏐⏐ (( 1−nd my1 )d−1 +···+ ( 1−nd my2 )d−1) ⏐⏐· ⏐⏐y2−y1 ⏐⏐ ≤nd m· ⏐⏐1 +···+ 1 ⏐⏐· ⏐⏐y2−y1 ⏐⏐ = nd2 m · ⏐⏐y1−y2 ⏐⏐,

work page

[3] [3]

Boundedness Hypothesis: ⏐⏐Y (i + 1)−Y(i) ⏐⏐≤1as only nodeui+1 can be accepted on iteration i+ 1,

work page

[4] [4]

We chooseλ= √ 8 log (2n) n

Initial Condition: Y (0) = ˆy= 0so any λ≥2 n≥d2 m min{T,m nd2}+ R n = d2 m min{1, m nd2}+ 1 n works. We chooseλ= √ 8 log (2n) n . ModelY(i)byny ( i n ) and we obtain the following differential equation: y′(t) = ( 1−nd m·y(t) )d , y(0) = 0. By Lemma 16 with probability1−2e−nλ2/8T = 1−1 n we have that: ⏐⏐Y(i)−ny(i/n) ⏐⏐<3e nd2 m T·λn= 3e nd2 m · √ 8nlog(2n)...

work page

[5] [5]

Then, for allt∈[k],Gt(E1) ={a1,...,at}if and only ifGt(E2) ={a1,...,at}. Proof. We show by induction that for allt∈[k], ifGt(E1) ={a1,...,at}, thenGt(E2) ={a1,...,at}. The other direction follows by symmetry. For the base case,t = 1, we have that by construction of E1,E 2,|NE1(ui)|=|NE2(ui)|for alli∈[n], therefore the algorithm works the same in both grap...

work page

[6] [6]

The marginal contribution ofa was higher than that ofat+1, i.e.|NE2(a)\NE2(Gt(E2))|> |NE2(at+1)\N E2(Gt(E2))|, or

work page

[7] [7]

26 We proceed to show that the marginal contribution ofa,at+1 on iterationt+ 1is the same regardless of which graph we runGreedyon

The marginal contribution ofa was equal to that ofat+1, but a was chosen due to the tie-breaking rule. 26 We proceed to show that the marginal contribution ofa,at+1 on iterationt+ 1is the same regardless of which graph we runGreedyon. The marginal contribution ofaonB 2 is: |NE2(a)\N E2(Gt(E2))|=|NE2(a)\N E2(Gt(E1))| =|NE2(a)\N E1(Gt(E1))| =|NE2(a)|−|NE2(a...

work page

[8] [8]

Trend Hypothesis: ⏐⏐E[Y(i+ 1)−Y(i)|Y(i)]− ( 1−dY(i) n )d⏐⏐≤2d2 n by Lemma 34 forr = n−Y(i)dsincen≥2d2, 32

work page

[9] [9]

Lipschitz Hypothesis:F(y) = (1−dy)d isd 2-Lipschitz since: ⏐⏐F(y 1)−F(y2) ⏐⏐ = ⏐⏐(1−dy1)d−(1−dy2)d⏐⏐ =d· ⏐⏐ ( (1−dy1)d−1+···+ (1−dy2)d−1)⏐⏐· ⏐⏐y2−y1 ⏐⏐ ≤d· ⏐⏐1 +···+ 1 ⏐⏐· ⏐⏐y2−y1 ⏐⏐ =d 2· ⏐⏐y1−y2 ⏐⏐,

work page

[10] [10]

Boundedness Hypothesis: ⏐⏐Y(i+ 1)−Y(i) ⏐⏐≤1as only one accept can occur,

work page

[11] [11]

We chooseλ= √ 8 log (2n) n

Initial Condition:Y (0) = ˆy= 0so any λ≥2 n = d2 n min{1, 1 d2}+ 1 n = d2 n min{1, 1 d2}+ 1 n works. We chooseλ= √ 8 log (2n) n . ModelY(i)byny ( i n ) and we obtain the following differential equation: y′(t) = (1−d·y(t))d, y(0) = 0⇒y(t) =1 d· ( 1−(1 +d(d−1)·t)−1 d−1 ) . By Lemma 16 with probability1−2e−nλ2/8 = 1−1 n we have that: ⏐⏐Y(i)−ny(i/n) ⏐⏐<3e d2 ...

work page

[12] [12]

∑k i=1di̸∈[(ϵ/2)m,(ϵ/2)−1m]or

work page

[13] [13]

Then greedy achieves, in expectation, a1−ϵapproximation

1 k ∑k i=1di≥204 max(1,(n/m)2) ϵ8 . Then greedy achieves, in expectation, a1−ϵapproximation. Proof.Follows by Lemmas 50 and 51. Theorem 6(Power Law Left Degrees).Let n,m∈N , X1,X 2,...,Xn be i.i.d. samples from Pareto distribution with parametersa >1and xmin = 1. Also let d1≥···≥dn be their order statistics, ϵ∈(0, 1)arbitrary and draw graphB from random m...

work page