Stochastic Function Certification with Correlations
Pith reviewed 2026-05-13 18:52 UTC · model grok-4.3
The pith
A non-adaptive O(log n) approximation certifies stochastic matroid bases under arbitrary correlations and is tight unless P equals NP.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For the stochastic Boolean function certification problem where f is the indicator of a matroid basis, a non-adaptive O(log n)-approximation algorithm works for arbitrary joint distributions p. This ratio is tight up to constants even for partition matroids unless P equals NP. Constant-factor approximations exist for uniform matroids, improving to a 2-approximation when the variables are negatively correlated for the 1-uniform case. For k-uniform matroids on graphs whose edge correlations arise from independent vertex variables, an adaptive O(log k) approximation is possible.
What carries the argument
A non-adaptive ordering of elements computed from the known joint distribution p that minimizes the expected number of probes needed to discover an active basis.
Load-bearing premise
The joint distribution p over the random variables is known in advance or can be sampled from to design the probing order.
What would settle it
A polynomial-time algorithm that achieves o(log n) approximation for certifying a basis of a partition matroid under arbitrary correlations would show the claimed tightness is false.
read the original abstract
We study the Stochastic Boolean Function Certification (SBFC) problem, where we are given $n$ Bernoulli random variables $\{X_e: e \in U\}$ on a ground set $U$ of $n$ elements with joint distribution $p$, a Boolean function $f: 2^U \to \{0, 1\}$, and an (unknown) scenario $S = \{e \in U: X_e = 1\}$ of active elements sampled from $p$. We seek to probe the elements one-at-a-time to reveal if they are active until we can certify $f(S) = 1$, while minimizing the expected number of probes. Unlike most previous results that assume independence, we study correlated distributions $p$ and give approximation algorithms for several classes of functions $f$. When $f(S)$ is the indicator function for whether $S$ is the spanning set of a given matroid, our problem reduces to finding a basis of active elements of a matroid by probing elements. We give a non-adaptive $O(\log n)$-approximation algorithm for arbitrary distributions $p$, and show that this is tight up to constants unless P $=$ NP, even for partition matroids. For uniform matroids, we give constant factor $4.642$-approximation ([BBFT20]) that can be further improved to a $2$-approximation if additionally the random variables are negatively correlated for the case of $1$-uniform matroid. We also give an adaptive $O(\log k)$-approximation algorithm for SBFC for $k$-uniform matroids for the Graph Probing problem, where we seek to probe the edges of a graph one-at-a-time until we find $k$ active edges. The underlying distribution on edges arises from (hidden) independent vertex random variables, with an edge being active if at least one of its endpoints is active. This significantly improves over the information-theoretic lower bound on $\Omega(\mathrm{poly}(n))$ ([JGM19]) for adaptive algorithms for $k$-uniform matroids with arbitrary distributions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript studies the Stochastic Boolean Function Certification (SBFC) problem with correlated joint distributions p on n Bernoulli variables. For the special case where f certifies that a set spans a given matroid, it gives a non-adaptive O(log n)-approximation algorithm for arbitrary p together with a matching hardness result (up to constants) that holds even for partition matroids. For uniform matroids it supplies a 4.642-approximation (improving to 2 under negative correlation for the 1-uniform case) and, for the graph-probing variant with vertex-induced edge distributions, an adaptive O(log k)-approximation for k-uniform matroids.
Significance. If the stated approximation ratios and hardness results hold, the paper meaningfully extends the stochastic probing literature beyond the independent case by handling arbitrary correlations while retaining tight guarantees for matroids. The non-adaptive O(log n) result and its tightness for partition matroids, the constant-factor improvements for uniform matroids, and the adaptive O(log k) bound that beats the known poly(n) information-theoretic barrier for the vertex-induced model are all substantive contributions. The work credits standard set-cover-style analysis and matroid properties without introducing free parameters or circular definitions.
minor comments (3)
- Abstract: the numerical constant 4.642 is attributed to [BBFT20] but receives no further explanation or citation expansion in the abstract; a one-sentence parenthetical on its origin would help readers.
- Section 1 (Introduction): the modeling assumption that p is known or samplable for policy design is stated but could be highlighted earlier as the natural information model for non-adaptive design.
- Theorem statements: several approximation ratios are given without an accompanying table that lists all results side-by-side with their respective matroid classes and correlation assumptions; such a table would improve readability.
Simulated Author's Rebuttal
We thank the referee for their positive summary of the manuscript, recognition of its contributions to stochastic probing under correlations, and recommendation for minor revision. No specific major comments were raised in the report.
Circularity Check
No significant circularity identified
full rationale
The paper's O(log n)-approximation for arbitrary joint distributions p on matroids, the matching hardness for partition matroids, the constant-factor results for uniform matroids, and the O(log k) adaptive result for graph probing all derive from standard matroid probing and set-cover style analysis. No equations or steps reduce by construction to fitted parameters, self-definitions, or load-bearing self-citations; the explicit assumption that p is known or samplable is the standard model and does not create circularity. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Matroids are defined by their independent sets and spanning sets
Reference graph
Works this paper leans on
- [1]
- [2]
- [3]
-
[4]
Y. Ben-Dov. Optimal testing procedures for special structures of coherent systems. Management Science , 27(12):1410--1420, 1981
work page 1981
-
[5]
S. Behnezhad, M. Derakhshan, and M.T. Hajiaghayi. Stochastic matching with few queries: (1- ) approximation. Symposium on Theory of Computing ( STOC ), 2020
work page 2020
- [6]
-
[7]
A. Bhalgat, A. Goel, and S. Khanna. Improved approximation results for stochastic knapsack problems. Symposium on Discrete Algorithms ( SODA ), 2011
work page 2011
- [8]
-
[9]
S. Bubeck. Convex optimization: Algorithms and complexity. Foundations and Trends in Machine Learning , 8(3-4):231--357, 2015
work page 2015
-
[10]
R. Butterworth. Some reliability fault-testing models. Operations Research , 20(2):335--343, 1972
work page 1972
-
[11]
G. Calinescu, C. Chekuri, M. P\' a l, and J. Vondr\' a k. Maximizing a submodular set function subject to a matroid constraint. Integer Programming and Combinatorial Optimization ( IPCO ), 2007
work page 2007
- [12]
- [13]
- [14]
-
[15]
A. Deshpande, L. Hellerstein, and D. Kletenik. Approximation algorithms for stochastic submodular set cover with applications to boolean function evaluation and min-knapsack. ACM Transactions on Algorithms , 12(3):1--28, 2016
work page 2016
-
[16]
U. Feige. A threshold of n for approximating set cover. Journal of the ACM , 45(4):634--652, 1998
work page 1998
- [17]
-
[18]
D. Gkenosis, N. Grammel, L. Hellerstein, and D. Kletenik. The stochastic score classification problem. European Symposium on Algorithms ( ESA ), 2018
work page 2018
-
[19]
D. Gkenosis, N. Grammel, L. Hellerstein, and D. Kletenik. The stochastic boolean function evaluation problem for symmetric boolean functions. Discrete Applied Mathematics , 309:269--277, 2022
work page 2022
- [20]
-
[21]
N. Grammel, L. Hellerstein, D. Kletenik, and N. Liu. Algorithms for the unit-cost stochastic score classification problem. Algorithmica , 84(10):3054--3074, 2022
work page 2022
-
[22]
Adaptive Submodularity: Theory and Applications in Active Learning and Stochastic Optimization
D. Golovin and A. Krause. Adaptive submodularity: A new approach to active learning and stochastic optimization. arXiv:1003.3967, 2017
work page Pith review arXiv 2017
-
[23]
R. Garnett, Y. Krishnamurthy, X. Xiong, J. Schneider, and R. Mann. Bayesian optimal active search and surveying. International Conference on Machine Learning ( ICML ), 2012
work page 2012
-
[24]
A. Gupta and V. Nagarajan. A stochastic probing problem with applications. Integer Programming and Combinatorial Optimization ( IPCO ), 2013
work page 2013
- [25]
-
[26]
L. Hellerstein, B. Plank, and K. Schewior. Approximating matroid basis testing on partition matroids using budget-in-expectation. Symposium on Discrete Algorithms ( SODA ), 2026
work page 2026
- [27]
-
[28]
S. Im, V. Nagarajan, and R. van der Zwaan. Minimum latency submodular cover. ACM Transactions on Algorithms , 13(1):13:1--13:28, 2016
work page 2016
- [29]
- [30]
- [31]
- [32]
-
[33]
R.M. Kainkaryam and P.J. Woolf. Pooling in high-throughput drug screening. Current Opinion in Drug Discovery & Development , 12(3):339, 2009
work page 2009
- [34]
-
[35]
B.M.E. Moret. Decision trees and diagrams. ACM Computing Surveys , 14(4):593--623, 1982
work page 1982
-
[36]
M.A. Nielsen, L. Rohwedder, and K. Schewior. Non-adaptive evaluation of k -of- n functions: Tight gap and a unit-cost PTAS . arXiv:2507.05877, 2025
-
[37]
B.M. Plank and K. Schewior. Simple algorithms for stochastic score classification with small approximation ratios. SIAM Journal on Discrete Mathematics , 38(3):2069--2088, 2024
work page 2069
-
[38]
R.E.A.C. Paley and A. Zygmund. On some series of functions (1). Mathematical Proceedings of the Cambridge Philosophical Society , 26:337--357, 1930
work page 1930
-
[39]
C. Swamy and D.B. Shmoys. Sampling-based approximation algorithms for multistage stochastic optimization. SIAM Journal on Computing , 41(4):975--1004, 2012
work page 2012
-
[40]
T. \" U nl\" u yurt. Sequential testing problem: A follow-up review. Discrete Applied Mathematics , 377:356--369, 2025
work page 2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.