pith. sign in

arxiv: 1906.11237 · v1 · pith:PRL7523Knew · submitted 2019-06-26 · 💻 cs.DS · cs.DM

Making a Sieve Random: Improved Semi-Streaming Algorithm for Submodular Maximization under a Cardinality Constraint

Pith reviewed 2026-05-25 15:04 UTC · model grok-4.3

classification 💻 cs.DS cs.DM
keywords submodular maximizationsemi-streamingnon-monotone submodularcardinality constraintapproximation algorithmrandomized algorithmsieve streaming
0
0 comments X

The pith

Injecting controlled randomness into Sieve-Streaming yields a 4.282-approximation semi-streaming algorithm for non-monotone submodular maximization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a carefully calibrated amount of randomness lets the Sieve-Streaming technique extend from monotone to non-monotone submodular functions while staying within semi-streaming memory limits and polynomial time. This produces an approximation ratio of 4.282, beating the prior 5.828 ratio obtained via local search. A reader would care because many practical problems require maximizing non-monotone submodular objectives over large data arriving in one pass, where memory is scarce. The work further indicates that allowing super-polynomial time improves the ratio to 3 + ε.

Core claim

By introducing a just right amount of randomness into the Sieve-Streaming technique, it is possible to obtain a semi-streaming polynomial time algorithm that achieves a 4.282-approximation for maximizing non-negative non-monotone submodular functions subject to a cardinality constraint. The approximation ratio can be improved to 3 + ε when super-polynomial running time is allowed.

What carries the argument

The randomized adaptation of Sieve-Streaming, which maintains a small set of candidate solutions across the stream using probabilistic selection to manage non-monotonicity.

If this is right

  • The best known semi-streaming guarantee for the non-monotone case improves from 5.828 to 4.282.
  • The algorithm remains polynomial time and respects the semi-streaming memory bound.
  • Sieve-Streaming extends to non-monotone objectives without switching to local search.
  • Relaxing the runtime bound yields ratios approaching 3.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar controlled randomization might narrow gaps between monotone and non-monotone performance in other streaming combinatorial problems.
  • The technique could be tested on related constraints such as matroid or knapsack to see if the same randomness injection applies.
  • One could check whether the approach scales to multiple passes or sliding-window streams without losing the memory advantage.

Load-bearing premise

A suitable level of randomness exists that keeps the procedure semi-streaming and polynomial-time while delivering the claimed approximation despite non-monotonicity.

What would settle it

A specific input stream and non-monotone submodular function on which the algorithm returns a set whose value is strictly less than 1/4.282 of the optimum would disprove the ratio.

read the original abstract

In this paper we consider the problem of maximizing a non-negative submodular function subject to a cardinality constraint in the data stream model. Previously, the best known algorithm for this problem was a $5.828$-approximation semi-streaming algorithm based on a local search technique (Feldman et al., 2018). For the special case of this problem in which the objective function is also monotone, the state-of-the-art semi-streaming algorithm is an algorithm known as Sieve-Streaming, which is based on a different technique (Badanidiyuru, 2014). Adapting the technique of Sieve-Streaming to non-monotone objective functions has turned out to be a challenging task, which has so far prevented an improvement over the local search based $5.828$-approximation. In this work, we overcome the above challenge, and manage to adapt Sieve-Streaming to non-monotone objective functions by introducing a "just right" amount of randomness into it. Consequently, we get a semi-streaming polynomial time $4.282$-approximation algorithm for non-monotone objectives. Moreover, if one allows our algorithm to run in super-polynomial time, then its approximation ratio can be further improved to $3 + \varepsilon$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims to adapt the Sieve-Streaming technique to non-monotone submodular maximization under a cardinality constraint by introducing a carefully chosen amount of randomness, yielding a semi-streaming polynomial-time 4.282-approximation (improving on the prior 5.828 local-search bound) while preserving O(k polylog n) space; a (3+ε) ratio is claimed when super-polynomial time is allowed.

Significance. If the claims hold, the result improves the best known semi-streaming guarantee for non-monotone objectives by roughly 26% relative to Feldman et al. (2018) and demonstrates that controlled randomization can extend the sieve approach beyond monotonicity without sacrificing the streaming model constraints. The work would narrow the gap between streaming and offline ratios for this problem.

major comments (2)
  1. [Abstract / §3 (randomness mechanism)] The load-bearing claim is the existence of a single randomness level (sampling probability or randomized threshold set) such that the expected value of the returned set is at least OPT/4.282, the maintained collection remains O(k log n) in total size, and per-element processing stays polynomial for non-monotone functions. No outline of the expectation calculation, error analysis, or verification that the parameter is independent of OPT appears in the abstract; this must be supplied explicitly in the main body (likely §3 or §4) with a concrete derivation showing the ratio is achieved without circular dependence on the optimum.
  2. [Abstract / §5 (super-polynomial variant)] For the super-polynomial-time (3+ε) variant, the manuscript must clarify whether the improvement requires additional passes or an exponential dependence on 1/ε that would violate the semi-streaming space bound; the abstract presents both results as consequences of the same randomized sieve, so the separation between the two regimes needs an explicit parameter setting and runtime analysis.
minor comments (2)
  1. [Abstract] The abstract states the numerical ratios 4.282 and 3+ε without indicating how they arise (e.g., solution to an optimization over the randomness parameter); a brief derivation sketch or reference to the governing equation would improve readability.
  2. [Introduction / Related Work] Comparison to the monotone Sieve-Streaming (Badanidiyuru et al., 2014) and the local-search baseline (Feldman et al., 2018) should include explicit space and pass complexity columns in any comparison table.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will incorporate clarifications to improve the exposition of the analysis.

read point-by-point responses
  1. Referee: [Abstract / §3 (randomness mechanism)] The load-bearing claim is the existence of a single randomness level (sampling probability or randomized threshold set) such that the expected value of the returned set is at least OPT/4.282, the maintained collection remains O(k log n) in total size, and per-element processing stays polynomial for non-monotone functions. No outline of the expectation calculation, error analysis, or verification that the parameter is independent of OPT appears in the abstract; this must be supplied explicitly in the main body (likely §3 or §4) with a concrete derivation showing the ratio is achieved without circular dependence on the optimum.

    Authors: The full derivation appears in §3, where a fixed sampling probability (chosen independently of OPT via a standard doubling argument over possible threshold scales) is used to bound the expectation via linearity and the submodular diminishing-returns property, yielding the 4.282 ratio while keeping the sieve collection size O(k log n) and per-element work polynomial. We agree the abstract lacks an outline and will revise it to include a one-sentence summary of the calculation; we will also add an explicit error-analysis paragraph in §3 to emphasize independence from OPT. revision: yes

  2. Referee: [Abstract / §5 (super-polynomial variant)] For the super-polynomial-time (3+ε) variant, the manuscript must clarify whether the improvement requires additional passes or an exponential dependence on 1/ε that would violate the semi-streaming space bound; the abstract presents both results as consequences of the same randomized sieve, so the separation between the two regimes needs an explicit parameter setting and runtime analysis.

    Authors: The (3+ε) improvement is obtained from the same single-pass randomized sieve by increasing the number of sampled thresholds (yielding super-polynomial but not exponential-in-1/ε time); the space bound O(k polylog n) remains identical and independent of ε, with no extra passes. We will add an explicit parameter-setting paragraph (with runtime as a function of ε) in the revised §5 to separate the regimes clearly. revision: yes

Circularity Check

0 steps flagged

No circularity: algorithmic construction with independent analysis

full rationale

The paper constructs a randomized variant of Sieve-Streaming for non-monotone submodular maximization and proves a 4.282-approximation (or 3+ε in super-polynomial time) while preserving semi-streaming space. The abstract frames the ratio as the outcome of an explicit algorithmic design and analysis rather than a quantity defined in terms of itself or obtained by fitting. Prior work (Feldman et al. 2018) is cited only as the baseline being improved upon; the central derivation does not reduce to that citation or to any self-referential equation. No fitted parameters are renamed as predictions, no ansatz is smuggled via citation, and no uniqueness theorem is invoked. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on the standard definition of submodularity and the semi-streaming model; no free parameters, invented entities, or ad-hoc axioms appear in the abstract.

axioms (1)
  • domain assumption A set function is submodular if it satisfies the diminishing-returns property for all sets.
    Invoked implicitly as the problem definition in the abstract.

pith-pipeline@v0.9.0 · 5765 in / 1280 out tokens · 57516 ms · 2026-05-25T15:04:15.762921+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages

  1. [1]

    Submodular max-sa t

    Yossi Azar, Iftah Gamzu, and Ran Roth. Submodular max-sa t. In ESA, pages 323–334, 2011

  2. [2]

    Streaming submodular maximization: massive data summariz ation on the fly

    Ashwinkumar Badanidiyuru, Baharan Mirzasoleiman, Ami n Karbasi, and Andreas Krause. Streaming submodular maximization: massive data summariz ation on the fly. In KDD, pages 671–680, 2014. 9

  3. [3]

    Deterministic algori thms for submodular maximization problems

    Niv Buchbinder and Moran Feldman. Deterministic algori thms for submodular maximization problems. ACM Trans. Algorithms , 14(3):32:1–32:20, 2018

  4. [4]

    Constrained submodul ar maximization via a non- symmetric technique, 2019

    Niv Buchbinder and Moran Feldman. Constrained submodul ar maximization via a non- symmetric technique, 2019. To appear in Mathematics of Operations Research

  5. [5]

    Determin istic ( 1/ 2 + ε)-approximation for submodular maximization over a matroid

    Niv Buchbinder, Moran Feldman, and Mohit Garg. Determin istic ( 1/ 2 + ε)-approximation for submodular maximization over a matroid. In SODA, pages 241–254, 2019

  6. [6]

    Online su bmodular maximization: Beating 1/2 made simple, 2019

    Niv Buchbinder, Moran Feldman, and Mohit Garg. Online su bmodular maximization: Beating 1/2 made simple, 2019. To appear in IPCO

  7. [7]

    Submodular maximization with cardinality constraints

    Niv Buchbinder, Moran Feldman, Joseph Naor, and Roy Schw artz. Submodular maximization with cardinality constraints. In SODA, pages 1433–1452, 2014

  8. [8]

    Online submodular maximization with preemption, 2019

    Niv Buchbinder, Moran Feldman, and Roy Schwartz. Online submodular maximization with preemption, 2019. To appear in ACM Transactions on Algorithms

  9. [9]

    Maximizing a monotone submodular function subject to a matroid constraint

    Gruia C˘ alinescu, Chandra Chekuri, Martin P´ al, and Jan Vondr´ ak. Maximizing a monotone submodular function subject to a matroid constraint. SIAM J. Comput. , 40(6):1740–1766, 2011

  10. [10]

    Submodular maximizat ion meets streaming: matchings, matroids, and more

    Amit Chakrabarti and Sagar Kale. Submodular maximizat ion meets streaming: matchings, matroids, and more. Math. Program., 154(1-2):225–247, 2015

  11. [11]

    Hubert Chan, Zhiyi Huang, Shaofeng H.-C

    T.-H. Hubert Chan, Zhiyi Huang, Shaofeng H.-C. Jiang, N ing Kang, and Zhihao Gavin Tang. Online submodular maximization with free disposal. ACM Trans. Algorithms , 14(4):56:1– 56:29, 2018

  12. [12]

    Personal communication, 2018

    Chandra Chekuri. Personal communication, 2018

  13. [13]

    Str eaming algorithms for submodular function maximization

    Chandra Chekuri, Shalmoli Gupta, and Kent Quanrud. Str eaming algorithms for submodular function maximization. In ICALP, pages 318–330, 2015

  14. [14]

    Dep endent randomized rounding via exchange properties of combinatorial structures

    Chandra Chekuri, Jan Vondr´ ak, and Rico Zenklusen. Dep endent randomized rounding via exchange properties of combinatorial structures. In FOCS, pages 575–584, 2010

  15. [15]

    Sub modular function maximization via the multilinear relaxation and contention resolution sche mes

    Chandra Chekuri, Jan Vondr´ ak, and Rico Zenklusen. Sub modular function maximization via the multilinear relaxation and contention resolution sche mes. SIAM J. Comput. , 43(6):1831– 1879, 2014

  16. [16]

    Submodular meets spectr al: Greedy algorithms for subset selection, sparse approximation and dictionary selection

    Abhimanyu Das and David Kempe. Submodular meets spectr al: Greedy algorithms for subset selection, sparse approximation and dictionary selection . In ICML, pages 1057–1064, 2011

  17. [17]

    Revenue submodularity

    Shaddin Dughmi, Tim Roughgarden, and Mukund Sundarara jan. Revenue submodularity. Theory of Computing , 8(1):95–119, 2012

  18. [18]

    Alina Ene and Huy L. Nguyen. Constrained submodular max imization: Beyond 1/e. In FOCS, pages 248–257, 2016

  19. [19]

    Maximizing symmetric submodular funct ions

    Moran Feldman. Maximizing symmetric submodular funct ions. ACM Trans. Algorithms , 13(3):39:1–39:36, 2017. 10

  20. [20]

    Do less, get more: Streaming submodular maximization with subsampling

    Moran Feldman, Amin Karbasi, and Ehsan Kazemi. Do less, get more: Streaming submodular maximization with subsampling. In NeurIPS, pages 730–740, 2018

  21. [21]

    A unified c ontinuous greedy algorithm for submodular maximization

    Moran Feldman, Joseph Naor, and Roy Schwartz. A unified c ontinuous greedy algorithm for submodular maximization. In FOCS, pages 570–579, 2011

  22. [22]

    Fisher, George L

    Marshall L. Fisher, George L. Nemhauser, and Laurence A . Wolsey. An analysis of approxima- tions for maximizing submodular set functions—II , pages 73–87. Springer Berlin Heidelberg, Berlin, Heidelberg, 1978

  23. [23]

    Submodular maxim ization by simulated annealing

    Shayan Oveis Gharan and Jan Vondr´ ak. Submodular maxim ization by simulated annealing. In SODA, pages 1098–1116, 2011

  24. [24]

    Adaptive submodula rity: Theory and applications in active learning and stochastic optimization

    Daniel Golovin and Andreas Krause. Adaptive submodula rity: Theory and applications in active learning and stochastic optimization. J. Artif. Intell. Res. , 42:427–486, 2011

  25. [25]

    Hartline, Vahab S

    Jason D. Hartline, Vahab S. Mirrokni, and Mukund Sundar arajan. Optimal marketing strate- gies over social networks. In WWW, pages 189–198, 2008

  26. [26]

    Online su bmodular welfare maximization: Greedy is optimal

    Michael Kapralov, Ian Post, and Jan Vondr´ ak. Online su bmodular welfare maximization: Greedy is optimal. In SODA, pages 1216–1225, 2013

  27. [27]

    Kleinberg, and ´Eva Tardos

    David Kempe, Jon M. Kleinberg, and ´Eva Tardos. Maximizing the spread of influence through a social network. Theory of Computing , 11:105–147, 2015

  28. [28]

    Mirrokni, and Morteza Zadimogh addam

    Nitish Korula, Vahab S. Mirrokni, and Morteza Zadimogh addam. Online submodular welfare maximization: Greedy beats 1/2 in random order. SIAM J. Comput. , 47(3):1056–1086, 2018

  29. [29]

    Streaming non-monotone submodular maximization: Personalized video summarizati on on the fly

    Baharan Mirzasoleiman, Stefanie Jegelka, and Andreas Krause. Streaming non-monotone submodular maximization: Personalized video summarizati on on the fly. In AAAI, pages 1379–1386, 2018

  30. [30]

    Nemhauser and Laurence A

    George L. Nemhauser and Laurence A. Wolsey. Best algori thms for approximating the maxi- mum of a submodular set function. Math. Oper. Res. , 3(3):177–188, 1978

  31. [31]

    Nemhauser, Laurence A

    George L. Nemhauser, Laurence A. Wolsey, and Marshall L . Fisher. An analysis of approxi- mations for maximizing submodular set functions—I. Math. Program., 14(1):265–294, 1978

  32. [32]

    Todd Constable

    Mehraveh Salehi, Amin Karbasi, Dustin Scheinost, and R . Todd Constable. A submodular approach to create individualized parcellations of the hum an brain. In Medical Image Com- puting and Computer Assisted Intervention - MICCAI 2017 - 20th Inte rnational Conference, Quebec City, QC, Canada, September 11-13, 2017, Proceedings, P art I , pages 478–485, 2017

  33. [33]

    Schulz and Nelson A

    Andreas S. Schulz and Nelson A. Uhan. Approximating the least core value and least core of cooperative games with supermodular costs. Discrete Optimization, 10(2):163–180, 2013

  34. [34]

    Symmetry and approximability of submodu lar maximization problems

    Jan Vondr´ ak. Symmetry and approximability of submodu lar maximization problems. SIAM J. Comput. , 42(1):265–304, 2013. 11 A Details about the Error in a Previous Work As mentioned above, Chekuri et al. [ 13] described a semi-streaming algorithm for the problem of maximizing a non-negative (not necessarily monotone) subm odular function subject to a card...

  35. [35]

    49895 ≥ 0. 233 . 16 Plugging this bound and the equalities p = 0 . 24 and ε′ = 10 − 4 into the guarantee of Obser- vation C.2, we get that Algorithm 2 with the above parameter values stores at most O(k log k) elements, and uses ˜O(k) space in addition to the poly(1 /ε ′)· ˜O(k) space used by the algorithm for calculating Sτ 2 (as explained in the proof of...

  36. [36]

    000005 ] ≥ 0

    49895 − 0. 000005 ] ≥ 0. 233567·f (OP T ) . Using the low of total expectation, we now get E[f (T )] = Pr[E]·E[f (T )|E ] + Pr[ ¯E]·E [ f (T )| ¯E ] ≥ Pr[E]·E[f (T )|E ] ≥ ( 1− ε′ 20 ) ·E[f (T )|E ]≥ 0. 999995·0. 233567·f (OP T )≥ f (OP T )

  37. [37]

    This shows that the above mentioned implementation of Algo rithm 3 achieves the approximation guarantee of Theorem 1.2

    282 , where the first inequality holds by the non-negativity of f and the second inequality follows from Corollary D.4. This shows that the above mentioned implementation of Algo rithm 3 achieves the approximation guarantee of Theorem 1.2. To complete the proof of the theorem, it remains to observe th at, for the above specified values for the parameters p,...