Tight Lower Bounds for the Multi-Secretary Problem via Bellman Certificates

Jiawei Zhang

arxiv: 2607.02150 · v1 · pith:SXW4N6DUnew · submitted 2026-07-02 · 💻 cs.DS · cs.LG

Tight Lower Bounds for the Multi-Secretary Problem via Bellman Certificates

Jiawei Zhang This is my paper

Pith reviewed 2026-07-03 03:56 UTC · model grok-4.3

classification 💻 cs.DS cs.LG

keywords multi-secretary problemadditive regretsupport gapsBellman certificateslower boundsonline algorithmsprophet inequalities

0 comments

The pith

For mixtures of two separated uniforms at critical capacity, the multi-secretary problem requires Omega((log T)^2) additive regret.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the extra logarithmic factor in regret upper bounds for the multi-secretary problem is necessary when value distributions have gaps in their support. It constructs an explicit lower bound of order (log T)^2 for the optimal online policy versus the offline prophet benchmark, using a mixture of two separated uniform distributions at the critical capacity. This matches known O((log T)^2) upper bounds and proves them tight even in the one-resource case. The argument relies on Bellman certificates as feasible solutions to a relaxed version of the dynamic programming recursion.

Core claim

For a mixture of two separated uniform distributions at the critical capacity, the optimal regret grows at least on the order of (log T)^2. The same framework also yields a matching lower bound for gapped distributions whose gap-facing densities vanish near the support edges.

What carries the argument

Bellman certificates: feasible solutions to a relaxation of the exact Bellman recursion that convert lower bounds into explicit constructions and identify why support gaps permit larger regret.

Load-bearing premise

The model uses bounded-density distributions with support gaps and measures additive regret relative to the offline prophet reward in the standard multi-secretary setting.

What would settle it

An explicit computation of the optimal online policy's expected regret for the two-uniform mixture instance at sufficiently large T that yields o((log T)^2) would disprove the lower bound.

read the original abstract

This paper studies additive regret in the multi-secretary problem, defined as the gap between the expected offline prophet reward and the reward of the best online policy. Prior work established \(O(\log T)\) regret for bounded-density distributions with connected support and \(O((\log T)^2)\) upper bounds for bounded-density distributions with support gaps. It was unknown whether the extra logarithmic factor is necessary even in the one-resource model. We prove that it is necessary. For a mixture of two separated uniform distributions at the critical capacity, the optimal regret grows at least on the order of \((\log T)^2\). Thus the existing \(O((\log T)^2)\) upper bounds for bounded-density gapped instances, including those implied by network revenue management models with continuous rewards, are tight in this simplest specialization. The same framework also yields a matching lower bound for gapped distributions whose gap-facing densities vanish near the support edges; this companion result is given in the appendix. The proofs use Bellman certificates: feasible solutions to a relaxation of the exact Bellman recursion. This framework converts lower bounds into explicit certificate constructions and identifies why support gaps permit larger regret.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives the first explicit Ω((log T)^2) lower bound for gapped multi-secretary instances via Bellman certificates, showing the extra log factor in prior upper bounds is necessary.

read the letter

The key point here is that the paper settles an open question by proving a matching lower bound of Ω((log T)^2) regret for the multi-secretary problem under bounded-density distributions with support gaps. It does this for the concrete case of a mixture of two separated uniform distributions at critical capacity, and the same certificate method also covers a companion case where densities vanish near the edges.

The new element is the explicit construction of a feasible solution to the relaxed Bellman recursion that directly translates into the regret lower bound. This framework avoids circularity with the existing upper bounds and explains why gaps permit the larger regret while connected supports stay at O(log T). The abstract and stress-test description indicate the argument is self-contained and the certificate is built independently of the cited O((log T)^2) upper bounds.

The work is solid on its own terms for this specialization. It cleanly converts the certificate into an additive regret statement relative to the offline prophet, and the choice of the two-uniform mixture at the critical point looks deliberate to hit the threshold where the extra log appears. The appendix extension broadens the result without changing the core technique.

The main limitation is that the tight bound is shown for specific gapped instances rather than all gapped distributions at once. That is not a flaw given the claim, but it means the result is a sharp clarification for the simplest case rather than a sweeping characterization. No load-bearing gaps or fitting issues are visible in the stated argument structure.

This is aimed at people working on online algorithms and dynamic programming for resource allocation. Anyone tracking regret bounds in prophet inequalities or network revenue management will find the tightness result useful. The paper deserves a serious referee because the question it answers was explicitly open and the certificate approach looks reproducible enough to check.

Referee Report

0 major / 2 minor

Summary. The paper proves that the optimal additive regret in the multi-secretary problem is Ω((log T)^2) for a mixture of two separated uniform distributions at critical capacity. This is obtained by constructing an explicit Bellman certificate that is feasible for a relaxation of the exact Bellman recursion; the same framework yields a matching lower bound for gapped distributions whose gap-facing densities vanish near the support edges (appendix). The result shows that existing O((log T)^2) upper bounds for bounded-density gapped instances are tight even in the one-resource case.

Significance. If the certificate construction is valid, the result is significant: it closes the gap between upper and lower bounds for the gapped case, confirming that support gaps force the extra logarithmic factor relative to connected-support distributions. The Bellman-certificate framework is a clear strength—it converts the lower-bound argument into an explicit, verifiable object rather than an information-theoretic or minimax argument—and the paper directly matches the cited upper bounds for this canonical instance.

minor comments (2)

The abstract states that the certificate 'converts lower bounds into explicit certificate constructions'; a short paragraph in the introduction summarizing the key algebraic steps of the construction (without full proof) would help readers assess feasibility before reaching the technical sections.
Notation for the 'critical capacity' parameter should be introduced with an explicit formula (e.g., as a function of the two uniform supports) the first time it appears, rather than only in the instance definition.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful reading and positive assessment of the paper. The report accurately summarizes our contribution in establishing matching Ω((log T)^2) lower bounds for the multi-secretary problem under gapped distributions using explicit Bellman certificates, confirming the tightness of existing upper bounds even in the single-resource case.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper's central derivation constructs an explicit Bellman certificate as a feasible solution to a relaxation of the exact Bellman recursion for the specific two-uniform mixture instance, directly yielding the Ω((log T)^2) lower bound on additive regret relative to the offline prophet. This construction is presented as independent of the cited O((log T)^2) upper bounds and does not reduce any prediction or claim to a fitted input, self-definition, or load-bearing self-citation; the framework converts the certificate feasibility into the regret lower bound without circular reduction. The argument remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, axioms, or invented entities; assessment is limited to the stated claim.

pith-pipeline@v0.9.1-grok · 5729 in / 1024 out tokens · 22558 ms · 2026-07-03T03:56:17.474206+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

35 extracted references · 4 canonical work pages

[1]

SIAM Journal on Computing 43(2):930--972

Alaei S (2014) Bayesian combinatorial auctions: Expanding single buyer mechanisms to many buyers. SIAM Journal on Computing 43(2):930--972

2014
[2]

Stochastic Systems 9(3):231--260

Arlotto A, Gurvich I (2019) Uniformly bounded regret in the multisecretary problem. Stochastic Systems 9(3):231--260

2019
[3]

Stochastic Systems 10(2):170--191

Arlotto A, Xie X (2020) Logarithmic regret in the dynamic and stochastic knapsack problem with equal rewards. Stochastic Systems 10(2):170--191

2020
[4]

Management Science 71(6):4877--4894

Banerjee S, Freund D (2024) Good prophets know when the end is near. Management Science 71(6):4877--4894

2024
[5]

Operations Research 72(5):2168--2189

Balseiro SR, Besbes O, Pizarro D (2024) Survey of dynamic resource-constrained reward collection problems: Unified model and analysis. Operations Research 72(5):2168--2189

2024
[6]

Operations Research 73(3):1273--1288

Besbes O, Kanoria Y, Kumar A (2024) Dynamic resource allocation: Algorithmic design principles and spectrum of achievable performances. Operations Research 73(3):1273--1288

2024
[7]

Oxford University Press

Boucheron S, Lugosi G, Massart P (2013) Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press

2013
[8]

Operations Research 73(4):2188--2203

Bray RL (2024) Logarithmic regret in multisecretary and online linear programs with continuous valuations. Operations Research 73(4):2188--2203

2024
[9]

Management Science 66(7):2993--3009

Bumpensanti P, Wang H (2020) A re-solving heuristic with uniformly bounded loss for network revenue management. Management Science 66(7):2993--3009

2020
[10]

Proceedings of the 42nd ACM Symposium on Theory of Computing, 311--320

Chawla S, Hartline JD, Malec DL, Sivan B (2010) Multi-parameter mechanism design and sequential posted pricing. Proceedings of the 42nd ACM Symposium on Theory of Computing, 311--320

2010
[11]

Operations Research 72(3):1124--1138

Chen G, Li X, Ye Y (2024) An improved analysis of LP-based control for revenue management. Operations Research 72(3):1124--1138

2024
[12]

Beyond non-degeneracy: Revisiting certainty equivalent heuristic for online linear programming

Chen Y, Wang W (2025) Beyond non-degeneracy: Revisiting certainty equivalent heuristic for online linear programming. arXiv preprint arXiv:2501.01716

work page arXiv 2025
[13]

Proceedings of the 2017 ACM Conference on Economics and Computation, 169--186

Correa J, Foncea P, Hoeksma R, Oosterwijk T, Vredeveld T (2017) Posted price mechanisms for a random stream of customers. Proceedings of the 2017 ACM Conference on Economics and Computation, 169--186

2017
[14]

ACM SIGecom Exchanges 17(1):61--70

Correa J, Foncea P, Hoeksma R, Oosterwijk T, Vredeveld T (2019) Recent developments in prophet inequalities. ACM SIGecom Exchanges 17(1):61--70

2019
[15]

II, 2nd ed

Feller W (1971) An Introduction to Probability Theory and Its Applications, Vol. II, 2nd ed. Wiley

1971
[16]

Mathematics of Operations Research 48(3):1344--1363

Freund D, Zhao J (2022) Overbooking with bounded loss. Mathematics of Operations Research 48(3):1344--1363

2022
[17]

Management Science 40(8):999--1020

Gallego G, van Ryzin G (1994) Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Science 40(8):999--1020

1994
[18]

Beyond O( T) regret

Gao W, Ge D, Xue C, Sun C, Ye Y (2025) Beyond O( T) regret: Decoupling learning and decision-making in online linear programming. arXiv preprint arXiv:2501.02761

work page arXiv 2025
[19]

Operations Research 72(3):1139--1155

Gupta V (2024) Greedy algorithm for multiway matching with bounded regret. Operations Research 72(3):1139--1155

2024
[20]

Working paper, SSRN 5133857

He S, Wei Y, Xu J, Yu SH (2025) Online resource allocation without re-solving: The effectiveness of primal-dual policies. Working paper, SSRN 5133857

2025
[21]

random variables

Hill TP, Kertz RP (1982) Comparisons of stop rule and supremum expectations of i.i.d. random variables. The Annals of Probability 10(2):336--345

1982
[22]

Mathematics of Operations Research 37(2):313--345

Jasin S, Kumar S (2012) A re-solving heuristic with bounded revenue loss for network revenue management with customer choice. Mathematics of Operations Research 37(2):313--345

2012
[23]

Online resource allocation with stochastic resource consumption

Jiang J, Zhang J (2020) Online resource allocation with stochastic resource consumption. arXiv preprint arXiv:2012.07933

work page arXiv 2020
[24]

Operations Research 73(3):1703--1721

Jiang J, Ma W, Zhang J (2024) Tight guarantees for multi-unit prophet inequalities and online stochastic knapsack. Operations Research 73(3):1703--1721

2024
[25]

Operations Research 73(6):3405--3420

Jiang J, Ma W, Zhang J (2025a) Degeneracy is OK: Logarithmic regret for network revenue management with indiscrete distributions. Operations Research 73(6):3405--3420
[26]

Mathematics of Operations Research 51(2):956--987

Jiang J, Ma W, Zhang J (2025b) Tightness without counterexamples: A new approach and new results for prophet inequalities. Mathematics of Operations Research 51(2):956--987
[27]

Infrequent resolving algorithm for online linear programming

Li G, Wang Z, Zhang J (2024) Infrequent resolving algorithm for online linear programming. arXiv preprint arXiv:2408.00465

work page arXiv 2024
[28]

Operations Research 70(5):2948--2966

Li X, Ye Y (2022) Online linear programming: Dual convergence, new algorithms, and regret bounds. Operations Research 70(5):2948--2966

2022
[29]

Journal of Algorithms 29(2):277--305

Lueker GS (1998) Average-case analysis of off-line and on-line knapsack problems. Journal of Algorithms 29(2):277--305

1998
[30]

The Annals of Probability 12(4):1213--1216

Samuel-Cahn E (1984) Comparison of threshold stop rules and maximum for independent nonnegative random variables. The Annals of Probability 12(4):1213--1216

1984
[31]

Management Science 44(11):1577--1593

Talluri K, van Ryzin G (1998) An analysis of bid-price controls for network revenue management. Management Science 44(11):1577--1593

1998
[32]

Management Science 67(3):1368--1391

Vera A, Banerjee S (2021) The Bayesian prophet: A low-regret framework for online decision making. Management Science 67(3):1368--1391

2021
[33]

Operations Research 69(3):821--840

Vera A, Banerjee S, Gurvich I (2021) Online allocation and pricing: Constant regret via Bellman inequalities. Operations Research 69(3):821--840

2021
[34]

Working paper, SSRN 4357216

Wei Y, Xu J, Yu SH (2023) Constant regret primal-dual policy for multi-way dynamic matching. Working paper, SSRN 4357216

2023
[35]

Management Science, forthcoming

Xie Y, Ma W, Xin L (2025) The benefits of delay to online decision making. Management Science, forthcoming

2025

[1] [1]

SIAM Journal on Computing 43(2):930--972

Alaei S (2014) Bayesian combinatorial auctions: Expanding single buyer mechanisms to many buyers. SIAM Journal on Computing 43(2):930--972

2014

[2] [2]

Stochastic Systems 9(3):231--260

Arlotto A, Gurvich I (2019) Uniformly bounded regret in the multisecretary problem. Stochastic Systems 9(3):231--260

2019

[3] [3]

Stochastic Systems 10(2):170--191

Arlotto A, Xie X (2020) Logarithmic regret in the dynamic and stochastic knapsack problem with equal rewards. Stochastic Systems 10(2):170--191

2020

[4] [4]

Management Science 71(6):4877--4894

Banerjee S, Freund D (2024) Good prophets know when the end is near. Management Science 71(6):4877--4894

2024

[5] [5]

Operations Research 72(5):2168--2189

Balseiro SR, Besbes O, Pizarro D (2024) Survey of dynamic resource-constrained reward collection problems: Unified model and analysis. Operations Research 72(5):2168--2189

2024

[6] [6]

Operations Research 73(3):1273--1288

Besbes O, Kanoria Y, Kumar A (2024) Dynamic resource allocation: Algorithmic design principles and spectrum of achievable performances. Operations Research 73(3):1273--1288

2024

[7] [7]

Oxford University Press

Boucheron S, Lugosi G, Massart P (2013) Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press

2013

[8] [8]

Operations Research 73(4):2188--2203

Bray RL (2024) Logarithmic regret in multisecretary and online linear programs with continuous valuations. Operations Research 73(4):2188--2203

2024

[9] [9]

Management Science 66(7):2993--3009

Bumpensanti P, Wang H (2020) A re-solving heuristic with uniformly bounded loss for network revenue management. Management Science 66(7):2993--3009

2020

[10] [10]

Proceedings of the 42nd ACM Symposium on Theory of Computing, 311--320

Chawla S, Hartline JD, Malec DL, Sivan B (2010) Multi-parameter mechanism design and sequential posted pricing. Proceedings of the 42nd ACM Symposium on Theory of Computing, 311--320

2010

[11] [11]

Operations Research 72(3):1124--1138

Chen G, Li X, Ye Y (2024) An improved analysis of LP-based control for revenue management. Operations Research 72(3):1124--1138

2024

[12] [12]

Beyond non-degeneracy: Revisiting certainty equivalent heuristic for online linear programming

Chen Y, Wang W (2025) Beyond non-degeneracy: Revisiting certainty equivalent heuristic for online linear programming. arXiv preprint arXiv:2501.01716

work page arXiv 2025

[13] [13]

Proceedings of the 2017 ACM Conference on Economics and Computation, 169--186

Correa J, Foncea P, Hoeksma R, Oosterwijk T, Vredeveld T (2017) Posted price mechanisms for a random stream of customers. Proceedings of the 2017 ACM Conference on Economics and Computation, 169--186

2017

[14] [14]

ACM SIGecom Exchanges 17(1):61--70

Correa J, Foncea P, Hoeksma R, Oosterwijk T, Vredeveld T (2019) Recent developments in prophet inequalities. ACM SIGecom Exchanges 17(1):61--70

2019

[15] [15]

II, 2nd ed

Feller W (1971) An Introduction to Probability Theory and Its Applications, Vol. II, 2nd ed. Wiley

1971

[16] [16]

Mathematics of Operations Research 48(3):1344--1363

Freund D, Zhao J (2022) Overbooking with bounded loss. Mathematics of Operations Research 48(3):1344--1363

2022

[17] [17]

Management Science 40(8):999--1020

Gallego G, van Ryzin G (1994) Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Science 40(8):999--1020

1994

[18] [18]

Beyond O( T) regret

Gao W, Ge D, Xue C, Sun C, Ye Y (2025) Beyond O( T) regret: Decoupling learning and decision-making in online linear programming. arXiv preprint arXiv:2501.02761

work page arXiv 2025

[19] [19]

Operations Research 72(3):1139--1155

Gupta V (2024) Greedy algorithm for multiway matching with bounded regret. Operations Research 72(3):1139--1155

2024

[20] [20]

Working paper, SSRN 5133857

He S, Wei Y, Xu J, Yu SH (2025) Online resource allocation without re-solving: The effectiveness of primal-dual policies. Working paper, SSRN 5133857

2025

[21] [21]

random variables

Hill TP, Kertz RP (1982) Comparisons of stop rule and supremum expectations of i.i.d. random variables. The Annals of Probability 10(2):336--345

1982

[22] [22]

Mathematics of Operations Research 37(2):313--345

Jasin S, Kumar S (2012) A re-solving heuristic with bounded revenue loss for network revenue management with customer choice. Mathematics of Operations Research 37(2):313--345

2012

[23] [23]

Online resource allocation with stochastic resource consumption

Jiang J, Zhang J (2020) Online resource allocation with stochastic resource consumption. arXiv preprint arXiv:2012.07933

work page arXiv 2020

[24] [24]

Operations Research 73(3):1703--1721

Jiang J, Ma W, Zhang J (2024) Tight guarantees for multi-unit prophet inequalities and online stochastic knapsack. Operations Research 73(3):1703--1721

2024

[25] [25]

Operations Research 73(6):3405--3420

Jiang J, Ma W, Zhang J (2025a) Degeneracy is OK: Logarithmic regret for network revenue management with indiscrete distributions. Operations Research 73(6):3405--3420

[26] [26]

Mathematics of Operations Research 51(2):956--987

Jiang J, Ma W, Zhang J (2025b) Tightness without counterexamples: A new approach and new results for prophet inequalities. Mathematics of Operations Research 51(2):956--987

[27] [27]

Infrequent resolving algorithm for online linear programming

Li G, Wang Z, Zhang J (2024) Infrequent resolving algorithm for online linear programming. arXiv preprint arXiv:2408.00465

work page arXiv 2024

[28] [28]

Operations Research 70(5):2948--2966

Li X, Ye Y (2022) Online linear programming: Dual convergence, new algorithms, and regret bounds. Operations Research 70(5):2948--2966

2022

[29] [29]

Journal of Algorithms 29(2):277--305

Lueker GS (1998) Average-case analysis of off-line and on-line knapsack problems. Journal of Algorithms 29(2):277--305

1998

[30] [30]

The Annals of Probability 12(4):1213--1216

Samuel-Cahn E (1984) Comparison of threshold stop rules and maximum for independent nonnegative random variables. The Annals of Probability 12(4):1213--1216

1984

[31] [31]

Management Science 44(11):1577--1593

Talluri K, van Ryzin G (1998) An analysis of bid-price controls for network revenue management. Management Science 44(11):1577--1593

1998

[32] [32]

Management Science 67(3):1368--1391

Vera A, Banerjee S (2021) The Bayesian prophet: A low-regret framework for online decision making. Management Science 67(3):1368--1391

2021

[33] [33]

Operations Research 69(3):821--840

Vera A, Banerjee S, Gurvich I (2021) Online allocation and pricing: Constant regret via Bellman inequalities. Operations Research 69(3):821--840

2021

[34] [34]

Working paper, SSRN 4357216

Wei Y, Xu J, Yu SH (2023) Constant regret primal-dual policy for multi-way dynamic matching. Working paper, SSRN 4357216

2023

[35] [35]

Management Science, forthcoming

Xie Y, Ma W, Xin L (2025) The benefits of delay to online decision making. Management Science, forthcoming

2025