Minimizing Upper Confidence Bounds: A Data-Driven Framework for Stochastic Programming
Pith reviewed 2026-05-24 02:51 UTC · model grok-4.3
The pith
Minimizing the average percentile upper bound on expected cost yields decisions robust to epistemic uncertainty in stochastic programs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central contribution is the Average Percentile Upper Bound (APUB), a new statistical construct that serves as both a statistically rigorous upper bound for population means and an approximate risk metric for sample means. The paper rigorously proves the asymptotic correctness and consistency of APUB, establishing a reliable foundation for data-driven decision-making. Practical solution methods including a bootstrap sampling approximation and an L-shaped method are developed for APUB optimization problems, with focus on two-stage linear stochastic optimization with random recourse.
What carries the argument
The Average Percentile Upper Bound (APUB), which averages percentile-based upper bounds to serve simultaneously as a population-mean upper bound and a sample risk metric.
If this is right
- APUB minimization produces decisions that remain reliable when the true distribution is unknown.
- Bootstrap sampling supplies a practical approximation for solving APUB optimization problems.
- The L-shaped method extends directly to two-stage linear stochastic programs under the APUB objective.
- Decisions obtained from APUB optimization display greater consistency across repeated samples from the same limited data.
Where Pith is reading between the lines
- The same APUB construction might apply to nonlinear or multi-stage stochastic programs if the asymptotic guarantees carry over.
- APUB could serve as an alternative to distributionally robust optimization by remaining fully data-driven rather than imposing ambiguity sets.
- Performance comparisons across a range of sample sizes would clarify when the finite-sample risk-metric property holds.
Load-bearing premise
The percentile construction of APUB yields a useful finite-sample risk metric whose minimization produces decisions robust to epistemic uncertainty, an assumption resting on the bootstrap approximation accurately capturing the optimization problem for the sample sizes used in practice.
What would settle it
Run out-of-sample tests that draw fresh data from a known true distribution, optimize decisions by minimizing APUB versus by sample-average approximation, and check whether the APUB decisions exhibit materially lower realized cost variance or better worst-case performance on the held-out data.
Figures
read the original abstract
Stochastic programming is often challenged by epistemic uncertainty, where critical probability distributions are poorly characterized or unknown due to a lack of data. To address this, we pioneer a novel framework for stochastic programming that minimizes an upper confidence bound (UCB) on the expected random cost, acting as a robustness-seeking strategy. Our central contribution is the Average Percentile Upper Bound (APUB), a new statistical construct that serves as both a statistically rigorous upper bound for population means and an approximate risk metric for sample means. We rigorously prove the asymptotic correctness and consistency of APUB, establishing a reliable foundation for data-driven decision-making. We also develop practical solution methods, including a bootstrap sampling approximation method and an L-shaped method, to solve APUB optimization problems, with a specific focus on two-stage linear stochastic optimization with random recourse. Empirical demonstrations on a two-stage product mix problem reveal the significant benefits of our APUB optimization framework, which fortifies the process against epistemic uncertainty while reinforcing key decision-making attributes like reliability and consistency. The implementation and source code are available at https://github.com/8Wings/APUB-Optimization.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes minimizing an Average Percentile Upper Bound (APUB) as a data-driven robustness strategy for stochastic programming under epistemic uncertainty. It introduces APUB as both an asymptotically correct upper bound on population means and a risk metric for sample means, claims rigorous proofs of its asymptotic correctness and consistency, develops a bootstrap sampling approximation and an L-shaped method to solve the resulting optimization problems (with focus on two-stage linear programs with random recourse), and reports empirical benefits on a two-stage product mix instance. Code is provided on GitHub.
Significance. If the asymptotic properties are correctly established and the bootstrap approximation reliably captures the relevant tail behavior at practical sample sizes, the framework would supply a statistically grounded alternative to existing robust and distributionally robust stochastic programming methods, with the open-source implementation strengthening reproducibility.
major comments (2)
- [bootstrap sampling approximation method (described after the APUB definition)] The central practical claim—that minimizing the bootstrap-approximated APUB yields decisions robust to epistemic uncertainty—rests on the finite-sample accuracy of the percentile optimization. No analysis or diagnostic is provided showing that the bootstrap reproduces the tail quantiles of the sample-mean distribution (including dependence induced by recourse decisions) at the sample sizes used in the numerical experiments.
- [Empirical demonstrations] The empirical section reports benefits on a two-stage product mix problem but does not include a controlled comparison of out-of-sample performance or sensitivity to sample size that would test whether the observed reliability and consistency arise from the claimed properties of APUB rather than from the specific instance.
minor comments (2)
- [APUB definition] Notation for the percentile level and the averaging window in the APUB definition should be introduced with a single consistent symbol set and cross-referenced to the consistency theorem.
- [L-shaped method] The L-shaped method description would benefit from an explicit statement of how the APUB objective is linearized or approximated within the master problem.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the practical aspects of the APUB framework. We address each major comment below and outline planned revisions.
read point-by-point responses
-
Referee: [bootstrap sampling approximation method (described after the APUB definition)] The central practical claim—that minimizing the bootstrap-approximated APUB yields decisions robust to epistemic uncertainty—rests on the finite-sample accuracy of the percentile optimization. No analysis or diagnostic is provided showing that the bootstrap reproduces the tail quantiles of the sample-mean distribution (including dependence induced by recourse decisions) at the sample sizes used in the numerical experiments.
Authors: We agree that explicit finite-sample diagnostics for the bootstrap approximation of tail quantiles would strengthen the practical claims. The manuscript establishes asymptotic correctness and consistency of APUB, and the bootstrap method is presented as a computational approximation whose validity is supported by the numerical results on the product-mix instance. However, no dedicated diagnostic section comparing bootstrap percentiles to Monte Carlo ground truth (accounting for recourse-induced dependence) is included. In the revision we will add such diagnostics for the sample sizes used in the experiments. revision: yes
-
Referee: [Empirical demonstrations] The empirical section reports benefits on a two-stage product mix problem but does not include a controlled comparison of out-of-sample performance or sensitivity to sample size that would test whether the observed reliability and consistency arise from the claimed properties of APUB rather than from the specific instance.
Authors: We acknowledge that the current empirical section is limited to a single illustrative instance without systematic out-of-sample or sample-size sensitivity controls. The reported benefits are intended to demonstrate the framework rather than provide exhaustive validation. To address the concern, the revised manuscript will expand the experiments to include controlled out-of-sample evaluations across varying sample sizes and at least one additional problem class, with direct comparisons that isolate APUB's contribution. revision: yes
Circularity Check
No significant circularity; APUB defined and proved independently
full rationale
The paper introduces APUB as a novel statistical construct and states that it provides rigorous proofs of asymptotic correctness and consistency. No load-bearing derivation step is shown to reduce by the paper's own equations to a self-definition, a fitted parameter renamed as prediction, or a self-citation chain. The bootstrap approximation is presented as a practical solution method rather than a definitional input, and the framework remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Standard assumptions of two-stage linear stochastic programming with random recourse, including finite expectations and appropriate recourse structures.
invented entities (1)
-
Average Percentile Upper Bound (APUB)
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Y. Abbasi-Yadkori, D. P \'a l, and C. Szepesv \'a ri. Improved algorithms for linear stochastic bandits. Advances in neural information processing systems, 24, 2011
work page 2011
-
[2]
P. Artzner, F. Delbaen, J.-M. Eber, and D. Heath. Coherent measures of risk . Mathematical Finance, 9 0 (3): 0 203--228, 1999. ISSN 1467-9965. doi:10.1111/1467-9965.00068. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/1467-9965.00068. \_eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1111/1467-9965.00068
-
[3]
K. B. Athreya. Strong law for the bootstrap. Statistics & Probability Letters, 1 0 (3): 0 147--150, Mar. 1983. ISSN 0167-7152. doi:10.1016/0167-7152(83)90063-9. URL https://www.sciencedirect.com/science/article/pii/0167715283900639
-
[4]
P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time Analysis of the Multiarmed Bandit Problem . Machine Learning, 47 0 (2): 0 235--256, May 2002. ISSN 1573-0565. doi:10.1023/A:1013689704352. URL https://doi.org/10.1023/A:1013689704352
-
[5]
G. Bayraksan and D. K. Love. Data-driven stochastic programming using phi-divergences. In J. C. Smith, S. Ceria, and P. M. Pardalos, editors, The Operations Research Revolution, pages 1--19. INFORMS, Catonsville, MD, 2015
work page 2015
-
[6]
A. Ben-Tal, D. den Hertog, A. De Waegenaere, B. Melenberg, and G. Rennen. Robust Solutions of Optimization Problems Affected by Uncertain Probabilities . Management Science, 59 0 (2): 0 341--357, Feb. 2013. ISSN 0025-1909. doi:10.1287/mnsc.1120.1641. URL https://pubsonline.informs.org/doi/abs/10.1287/mnsc.1120.1641. Publisher: INFORMS
-
[7]
A. C. Berry. The accuracy of the gaussian approximation to the sum of independent variates. Transactions of the american mathematical society, 49 0 (1): 0 122--136, 1941
work page 1941
-
[8]
J. R. Birge and F. Louveaux. Introduction to Stochastic Programming. Springer Series in Operations Research and Financial Engineering. Springer, New York, NY, 2 edition, 2011. ISBN 978-1-4614-0237-4. doi:10.1007/978-1-4614-0237-4
-
[9]
J. Blanchet and K. Murthy. Quantifying Distributional Model Risk via Optimal Transport . Mathematics of Operations Research, 44 0 (2): 0 565--600, May 2019. ISSN 0364-765X. doi:10.1287/moor.2018.0936. URL https://pubsonline.informs.org/doi/abs/10.1287/moor.2018.0936. Publisher: INFORMS
-
[10]
G. C. Calafiore and L. E. Ghaoui. On Distributionally Robust Chance - Constrained Linear Programs . Journal of Optimization Theory and Applications, 130 0 (1): 0 1--22, July 2006. ISSN 1573-2878. doi:10.1007/s10957-006-9084-x. URL https://doi.org/10.1007/s10957-006-9084-x
-
[11]
E. Delage and Y. Ye. Distributionally Robust Optimization Under Moment Uncertainty with Application to Data - Driven Problems . Operations Research, 58 0 (3): 0 595--612, June 2010. ISSN 0030-364X. doi:10.1287/opre.1090.0741. URL https://pubsonline.informs.org/doi/abs/10.1287/opre.1090.0741. Publisher: INFORMS
-
[12]
J. Devore. Probability and statistics for engineering and the sciences, Mar. 2009. URL https://hero.epa.gov/hero/index.cfm/reference/details/reference_id/196740
work page 2009
- [13]
-
[14]
B. Efron. Nonparametric standard errors and confidence intervals. Canadian Journal of Statistics, 9 0 (2): 0 139--158, 1981. ISSN 1708-945X. doi:10.2307/3314608. URL https://onlinelibrary.wiley.com/doi/abs/10.2307/3314608. \_eprint: https://onlinelibrary.wiley.com/doi/pdf/10.2307/3314608
-
[15]
B. Efron. Better Bootstrap Confidence Intervals . Journal of the American Statistical Association, 82 0 (397): 0 171--185, 1987. ISSN 0162-1459. doi:10.2307/2289144. URL https://www.jstor.org/stable/2289144. Publisher: [American Statistical Association, Taylor & Francis, Ltd.]
-
[16]
Z. Fan, W. Wang, S. H. Ng, and Q. Hu. Minimizing UCB : a better local search strategy in local bayesian optimization. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. URL https://openreview.net/forum?id=5GCgNFZSyo
work page 2024
-
[17]
R. Gao and A. Kleywegt. Distributionally Robust Stochastic Optimization with Wasserstein Distance . Mathematics of Operations Research, 48 0 (2): 0 603--655, May 2023. ISSN 0364-765X. doi:10.1287/moor.2022.1275. URL https://pubsonline.informs.org/doi/full/10.1287/moor.2022.1275. Publisher: INFORMS
- [18]
-
[19]
A. Hazra. Using the confidence interval confidently. Journal of Thoracic Disease, 9 0 (10): 0 4125--4130, Oct. 2017. ISSN 2072-1439. doi:10.21037/jtd.2017.09.14. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5723800/
-
[20]
A. J. King. Stochastic programming problems: Examples from the literature. Numerical techniques for stochastic optimization, pages 543--567, 1988
work page 1988
-
[21]
F. Lin, X. Fang, and Z. Gao. Distributionally robust optimization: A review on theory and applications. Numerical Algebra, Control and Optimization, 12 0 (1): 0 159--212, 2022
work page 2022
- [22]
-
[23]
V. Mnih, C. Szepesvári, and J.-Y. Audibert. Empirical Bernstein stopping. In Proceedings of the 25th international conference on Machine learning , ICML '08, pages 672--679, New York, NY, USA, July 2008. Association for Computing Machinery. ISBN 978-1-60558-205-4. doi:10.1145/1390156.1390241. URL https://dl.acm.org/doi/10.1145/1390156.1390241
-
[24]
P. Mohajerin Esfahani and D. Kuhn. Data-driven distributionally robust optimization using the Wasserstein metric: performance guarantees and tractable reformulations. Mathematical Programming, 171 0 (1): 0 115--166, Sept. 2018. ISSN 1436-4646. doi:10.1007/s10107-017-1172-1. URL https://doi.org/10.1007/s10107-017-1172-1
-
[25]
Distributionally robust optimization: A review
H. Rahimian and S. Mehrotra. Distributionally robust optimization: A review. arXiv preprint arXiv:1908.05659, 2019
-
[26]
T. R. C. Read and N. Cressie. Goodness- Of - Fit Statistics for Discrete Multivariate Data Semantic Scholar , 2012. URL https://www.semanticscholar.org/paper/Goodness-Of-Fit-Statistics-for-Discrete-Data-Read-Cressie/563e27a8b8c4b1e412f306be93cea9467789b560
work page 2012
-
[27]
R. T. Rockafellar. Convex Analysis : ( PMS -28). In Convex Analysis . Princeton University Press, Princeton, NJ, Apr. 2015. ISBN 978-1-4008-7317-3. doi:10.1515/9781400873173. URL https://www.degruyter.com/document/doi/10.1515/9781400873173/html?lang=en
-
[28]
R. T. Rockafellar and S. Uryasev. Optimization of conditional value-at-risk. The Journal of Risk, 2 0 (3): 0 21--41, 2000. ISSN 14651211. doi:10.21314/JOR.2000.038. URL http://www.risk.net/journal-of-risk/technical-paper/2161159/optimization-conditional-value-risk
-
[29]
S. Sarykalin, G. Serraino, and S. Uryasev. Value-at- Risk vs. Conditional Value -at- Risk in Risk Management and Optimization . In State-of-the- Art Decision - Making Tools in the Information - Intensive Age , INFORMS TutORials in Operations Research , pages 270--294. INFORMS, Catonsville, MD, Sept. 2008. ISBN 978-1-877640-23-0. doi:10.1287/educ.1080.0052...
-
[30]
J. Shao and D. Tu. The Jackknife and Bootstrap . Springer Science & Business Media, New York, NY, Dec. 2012. ISBN 978-1-4612-0795-5. Google-Books-ID: VO3SBwAAQBAJ
work page 2012
-
[31]
A. Shapiro, D. Dentcheva, and A. Ruszczynski. Lectures on Stochastic Programming : Modeling and Theory , Third Edition . MOS - SIAM Series on Optimization . Society for Industrial and Applied Mathematics, Philadelphia, PA, July 2021. ISBN 978-1-61197-658-8. doi:10.1137/1.9781611976595. URL https://epubs.siam.org/doi/book/10.1137/1.9781611976595
-
[32]
A. Slivkins. Introduction to multi-armed bandits. Foundations and Trends® in Machine Learning, 12 0 (1-2): 0 1--286, 2019. ISSN 1935-8237. doi:10.1561/2200000068
-
[33]
A. W. v. d. Vaart. Asymptotic Statistics . Cambridge Series in Statistical and Probabilistic Mathematics . Cambridge University Press, Cambridge, 1998. ISBN 978-0-521-78450-4. doi:10.1017/CBO9780511802256. URL https://www.cambridge.org/core/books/asymptotic-statistics/A3C7DAD3F7E66A1FA60E9C8FE132EE1D
-
[34]
R. M. Van Slyke and R. Wets. L- shaped linear programs with applications to optimal control and stochastic programming . SIAM Journal on Applied Mathematics, 17 0 (4): 0 638--663, 1969. ISSN 0036-1399. URL https://www.jstor.org/stable/2099310. Publisher: Society for Industrial and Applied Mathematics
-
[35]
Y. Wang, K. Dong, X. Chen, and L. Wang. Q-learning with ucb exploration is sample efficient for infinite-horizon mdp. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=BkglSTNFDB
work page 2020
-
[36]
W. Wiesemann, D. Kuhn, and M. Sim. Distributionally Robust Convex Optimization . Operations Research, 62 0 (6): 0 1358--1376, Dec. 2014. ISSN 0030-364X. doi:10.1287/opre.2014.1314. URL https://pubsonline.informs.org/doi/10.1287/opre.2014.1314. Publisher: INFORMS
-
[37]
W. Xie. Tractable reformulations of two-stage distributionally robust linear programs over the type- Wasserstein ball. Operations Research Letters, 48, June 2020. doi:10.1016/j.orl.2020.06.003
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.