pith. sign in

arxiv: 2412.01763 · v3 · submitted 2024-12-02 · 🧮 math.OC · cs.LG· stat.ML

The Data-Driven Censored Newsvendor Problem

Pith reviewed 2026-05-23 07:49 UTC · model grok-4.3

classification 🧮 math.OC cs.LGstat.ML
keywords censored newsvendordemand censoringregret boundsdistributionally robust optimizationdata-driven inventoryfinite-sample guarantees
0
0 comments X

The pith

Demand censoring permits vanishing regret in the newsvendor problem only under a specific condition on the demand distribution beyond the largest observed order.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies how censored sales observations limit the ability to learn optimal order quantities that minimize overage and underage costs. It isolates the effect of censoring through a worst-case analysis over an ambiguity set of possible demand distributions that agree with observed data up to the maximum historical order but can behave arbitrarily beyond it. A necessary and sufficient condition on this ambiguity set determines whether any learning policy can drive regret to zero with more data. When the condition fails, the analysis supplies an exact positive lower bound on the regret that no policy can beat, even with unlimited samples. The work then constructs an adaptive algorithm whose finite-sample performance tracks these sharp characterizations across all censoring regimes.

Core claim

Under the ambiguity set of all distributions matching the true demand up to the largest historical order quantity, vanishing regret is achievable if and only if a natural condition holds; otherwise the worst-case regret remains bounded below by a positive constant even with infinitely many samples, and this bound is tight.

What carries the argument

The ambiguity set defined by the largest historical order quantity, which forces all candidate distributions to coincide with the true demand only up to that boundary and allows arbitrary continuation afterward.

If this is right

  • When the condition holds, regret of suitable policies tends to zero as the number of samples grows.
  • When the condition fails, every policy incurs at least a fixed positive regret no matter how large the sample size.
  • The adaptive algorithm achieves finite-sample guarantees that match the lower bounds up to polylogarithmic factors in every regime.
  • Performance remains robust on both synthetic and real-world demand data across varying degrees of historical censoring.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Inventory systems could first test the observed data against the condition before deciding whether to trust historical sales for future ordering.
  • The same boundary-based ambiguity construction may apply to other sequential decision problems with one-sided observability, such as dynamic pricing with lost sales.
  • The information-loss lower bound suggests that collecting uncensored demand probes at occasional high order levels could restore vanishing regret when the condition would otherwise fail.

Load-bearing premise

The ambiguity set contains every distribution that matches demand exactly up to the largest observed order while permitting any behavior beyond that point.

What would settle it

Generate demand samples from a distribution violating the condition, run any policy on increasing numbers of censored observations, and check whether the realized regret remains bounded away from zero by the predicted constant.

read the original abstract

We study a censored variant of the data-driven newsvendor problem, where the decision-maker must select an ordering quantity that minimizes expected overage and underage costs based only on offline censored sales data, rather than historical demand realizations. Our goal is to understand how the degree of historical demand censoring affects the performance of any learning algorithm for this problem. To isolate this impact, we adopt a distributionally robust optimization framework, evaluating policies according to their worst-case regret over an ambiguity set of distributions. This set is defined by the largest historical order quantity (the observable boundary of the dataset), and contains all distributions matching the true demand distribution up to this boundary, while allowing them to be arbitrary afterwards. We demonstrate a spectrum of achievability under demand censoring by deriving a natural necessary and sufficient condition under which vanishing regret is an achievable goal. In regimes in which it is not, we exactly characterize the information loss due to censoring: an insurmountable lower bound on the performance of any policy, even when the decision-maker has access to infinitely many demand samples. We then leverage these sharp characterizations to propose a natural robust algorithm that adapts to the historical level of demand censoring. We derive finite-sample guarantees for this algorithm across all possible censoring regimes and show its near-optimality with matching lower bounds (up to polylogarithmic factors). We moreover demonstrate its robust performance via extensive numerical experiments on both synthetic and real-world datasets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper studies the censored data-driven newsvendor problem in a distributionally robust framework. The ambiguity set is defined solely by the largest historical order quantity and consists of all distributions agreeing with the unknown true demand up to that boundary. The central results are a necessary-and-sufficient condition on the censoring level for vanishing regret to be achievable, an exact matching lower bound on regret (even with infinitely many samples) when the condition fails, and an adaptive robust algorithm whose finite-sample regret guarantees are near-optimal up to polylog factors across all regimes, supported by numerical experiments on synthetic and real data.

Significance. If the derivations hold, the work supplies the first sharp, distribution-free characterization of the precise information loss induced by censoring in the newsvendor setting and delivers a practical algorithm that automatically adapts to the observed censoring level. The explicit N&S condition, matching lower bounds, and finite-sample guarantees (with reproducible code or parameter-free derivations not claimed) constitute a substantive advance for robust optimization under partial observability.

major comments (2)
  1. [§4, Theorem 1] §4 (Theorem 1 and Corollary 1): the necessity direction of the N&S condition for vanishing regret is derived from a worst-case distribution that places all remaining mass above the observed boundary; the paper should explicitly verify that this construction remains feasible under the cost asymmetry parameters c_u and c_o used throughout the analysis.
  2. [§5.2, Theorem 3] §5.2, Algorithm 1 and Theorem 3: the finite-sample upper bound is stated to match the lower bound up to polylog factors, yet the proof sketch invokes a uniform convergence argument over the censored empirical process; the dependence of the covering number on the unknown boundary value b should be made fully explicit to confirm the claimed rate.
minor comments (3)
  1. [§2] Notation for the ambiguity set P_b is introduced in §2 but reused with different subscripts in §3; a single consistent definition would improve readability.
  2. [Figure 2] Figure 2 (regret vs. sample size) lacks error bars or shading for variability across the 50 replications; adding these would clarify the reported stability.
  3. [§6.2] The real-world dataset description in §6.2 omits the precise definition of the 'sales' column used to construct the censored observations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading, positive recommendation, and constructive suggestions. We address each major comment below and will incorporate the requested clarifications into the revised manuscript.

read point-by-point responses
  1. Referee: [§4, Theorem 1] §4 (Theorem 1 and Corollary 1): the necessity direction of the N&S condition for vanishing regret is derived from a worst-case distribution that places all remaining mass above the observed boundary; the paper should explicitly verify that this construction remains feasible under the cost asymmetry parameters c_u and c_o used throughout the analysis.

    Authors: The worst-case distribution is constructed to lie inside the ambiguity set, which by definition allows arbitrary mass placement beyond the observed boundary b. Because the newsvendor costs satisfy c_u > 0 and c_o > 0 (as maintained throughout the paper), the expected cost of this distribution is finite and the construction is feasible. To make the verification fully explicit as requested, we will add a short clarifying paragraph immediately after the statement of Theorem 1. revision: yes

  2. Referee: [§5.2, Theorem 3] §5.2, Algorithm 1 and Theorem 3: the finite-sample upper bound is stated to match the lower bound up to polylog factors, yet the proof sketch invokes a uniform convergence argument over the censored empirical process; the dependence of the covering number on the unknown boundary value b should be made fully explicit to confirm the claimed rate.

    Authors: We agree that an explicit derivation of the covering-number dependence on b will strengthen the presentation. The entropy integral for the censored class grows at most logarithmically with b, which is absorbed into the polylog factor already stated. In the revision we will expand the proof sketch of Theorem 3 to display this dependence explicitly, confirming that the claimed rate holds uniformly over all admissible b. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper explicitly defines its ambiguity set via the largest observed order quantity and derives the necessary-and-sufficient condition for vanishing regret, plus the matching lower bound, via direct worst-case analysis over distributions consistent with observed data up to that boundary. Finite-sample guarantees follow from standard empirical-process arguments adapted to censoring; none of the central claims reduce by construction to fitted parameters, self-referential predictions, or load-bearing self-citations. The modeling choice is stated upfront and determines the results without circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the definition of the ambiguity set and standard regret analysis; no free parameters are introduced, no new entities are postulated, and axioms are standard mathematical definitions of regret and distributional robustness.

axioms (2)
  • standard math Regret is measured as the difference between the policy's expected cost and the optimal cost under the true distribution.
    Invoked when defining vanishing regret and lower bounds.
  • domain assumption The ambiguity set contains every distribution that agrees with the observed censored data up to the maximum historical order quantity.
    This is the modeling choice that defines the worst-case analysis.

pith-pipeline@v0.9.0 · 5790 in / 1399 out tokens · 48117 ms · 2026-05-23T07:49:59.054701+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

67 extracted references · 67 canonical work pages · 1 internal anchor

  1. [1]

    , " * write output.state after.block = add.period write newline

    ENTRY address author booktitle chapter doi edition editor eid howpublished institution isbn issn journal key month note number organization pages publisher school series title type url volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.all := #1...

  2. [2]

    write newline

    " write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in "" FUNCTION format.date year ...

  3. [3]

    Naval Research Logistics (NRL) 43(6):839--861

    Agrawal N, Smith SA (1996) Estimating negative binomial demand for retail inventory management with unobservable lost sales. Naval Research Logistics (NRL) 43(6):839--861

  4. [4]

    Operations Research 70(3):1646--1664

    Agrawal S, Jia R (2022) Learning in structured mdps with convex cost functions: Improved regret bounds for inventory management. Operations Research 70(3):1646--1664

  5. [5]

    Management Science 31(9):1150--1160

    Azoury KS (1985) Bayes solution to dynamic inventory models under unknown demand distribution. Management Science 31(9):1150--1160

  6. [6]

    Operations Research 68(2):309--326

    Ban GY (2020) Confidence intervals for data-driven inventory policies with demand censoring. Operations Research 68(2):309--326

  7. [7]

    Operations Research 67(1):90--108

    Ban GY, Rudin C (2019) The big data newsvendor: Practical insights from machine learning. Operations Research 67(1):90--108

  8. [8]

    the censored newsvendor and the optimal acquisition of information

    Bensoussan A, Sethi SP (2009) A note on “the censored newsvendor and the optimal acquisition of information”. Operations Research 57(3):791--794

  9. [9]

    Stochastic Systems 12(4):319--339

    Besbes O, Chaneton JM, Moallemi CC (2022 a ) The exploration-exploitation trade-off in the newsvendor problem. Stochastic Systems 12(4):319--339

  10. [10]

    Advances in Neural Information Processing Systems 35:23979--23991

    Besbes O, Ma W, Mouchtaki O (2022 b ) Beyond iid: data-driven decision-making in heterogeneous environments. Advances in Neural Information Processing Systems 35:23979--23991

  11. [11]

    Management Science 69(10):5848--5865

    Besbes O, Mouchtaki O (2023) How big should your data really be? data-driven newsvendor: Learning one sample at a time. Management Science 69(10):5848--5865

  12. [12]

    Management Science 59(6):1407--1424

    Besbes O, Muharremoglu A (2013) On implications of demand censoring in the newsvendor problem. Management Science 59(6):1407--1424

  13. [13]

    Manufacturing & Service Operations Management 13(4):525--533

    Bisi A, Dada M, Tokdar S (2011) A censored-data multiperiod inventory problem with newsvendor demand distributions. Manufacturing & Service Operations Management 13(4):525--533

  14. [14]

    Boucheron S, Lugosi G, Massart P (2013) Concentration Inequalities: A Nonasymptotic Theory of Independence (Oxford University Press), ISBN 9780199535255, ://dx.doi.org/10.1093/acprof:oso/9780199535255.001.0001

  15. [15]

    Management Science 69(2):885--903

    Bu J, Simchi-Levi D, Wang L (2023) Offline pricing and demand learning with censored data. Management Science 69(2):885--903

  16. [16]

    Casella G, Berger R (2024) Statistical inference (CRC Press)

  17. [17]

    Management Science

    Chen B, Jiang J, Zhang J, Zhou Z (2024) Learning to order for inventory systems with lost sales and uncertain supplies. Management Science

  18. [18]

    arXiv preprint arXiv:2409.03505

    Chen Z, Ma W (2024) Survey of data-driven newsvendor: Unified analysis and spectrum of achievable regrets. arXiv preprint arXiv:2409.03505

  19. [19]

    Mathematics of Operations Research 44(2):668--692

    Cheung WC, Simchi-Levi D (2019) Sampling-based approximation schemes for capacitated stochastic inventory control models. Mathematics of Operations Research 44(2):668--692

  20. [20]

    ://dx.doi.org/10.5281/zenodo.14007206

    Davidson-Pilon C (2024) lifelines, survival analysis in python. ://dx.doi.org/10.5281/zenodo.14007206

  21. [21]

    Manufacturing & Service Operations Management 26(3):1157--1172

    Ding J, Huh WT, Rong Y (2024) Feature-based inventory control with censored demand. Manufacturing & Service Operations Management 26(3):1157--1172

  22. [22]

    Operations Research 50(3):517--527

    Ding X, Puterman ML, Bisi A (2002) The censored newsvendor and the optimal acquisition of information. Operations Research 50(3):517--527

  23. [23]

    ://web.stanford.edu/class/stats311/lecture-notes.pdf

    Duchi J (2024) Lecture notes on statistics and information theory. ://web.stanford.edu/class/stats311/lecture-notes.pdf

  24. [24]

    Available at SSRN 4828001

    Fan X, Chen B, Lennon Olsen T, Pinedo M, Qin H, Zhou Z (2024) Don’t follow RL blindly: Lower sample complexity of learning optimal inventory control policies with fixed ordering costs. Available at SSRN 4828001

  25. [25]

    Available at SSRN 4178567

    Fan X, Chen B, Zhou Z (2022) Sample complexity of policy learning for inventory control with censored demand. Available at SSRN 4178567

  26. [26]

    Management Science 59(2):305--322

    Farias VF, Jagabathula S, Shah D (2013) A nonparametric approach to modeling choice with limited data. Management Science 59(2):305--322

  27. [27]

    Manufacturing & Service Operations Management 26(5):1962--1977

    Fu M, Li X, Zhang L (2024) Distributionally robust newsvendor under stochastic dominance with a feature-based application. Manufacturing & Service Operations Management 26(5):1962--1977

  28. [28]

    Journal of the Operational Research Society 44(8):825--834

    Gallego G, Moon I (1993) The distribution free newsboy problem: review and extensions. Journal of the Operational Research Society 44(8):825--834

  29. [29]

    Management Science 47(8):1101--1112

    Godfrey GA, Powell WB (2001) An adaptive, distribution-free algorithm for the newsvendor problem with censored demands, with applications to inventory and distribution. Management Science 47(8):1101--1112

  30. [30]

    Biometrika 69(3):635--640

    Harrell FE, Davis C (1982) A new distribution-free quantile estimator. Biometrika 69(3):635--640

  31. [31]

    Operations Research 59(4):929--941

    Huh WT, Levi R, Rusmevichientong P, Orlin JB (2011) Adaptive data-driven inventory control with censored demand based on kaplan-meier estimator. Operations Research 59(4):929--941

  32. [32]

    Mathematics of Operations Research 34(1):103--123

    Huh WT, Rusmevichientong P (2009) A nonparametric asymptotic analysis of inventory planning with censored demand. Mathematics of Operations Research 34(1):103--123

  33. [33]

    Operations Research 63(1):134--150

    Jain A, Rudi N, Wang T (2015) Demand estimation and ordering under censoring: Stock-out timing is (almost) all you need. Operations Research 63(1):134--150

  34. [34]

    Communications in Statistics-Theory and Methods 11(19):2217--2238

    Kalgh W, Lachenbruch PA (1982) A generalized quantile estimator. Communications in Statistics-Theory and Methods 11(19):2217--2238

  35. [35]

    Journal of the American Statistical Association 53(282):457--481

    Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. Journal of the American Statistical Association 53(282):457--481

  36. [36]

    SIAM Journal on Optimization 12(2):479--502

    Kleywegt AJ, Shapiro A, Homem-de Mello T (2002) The sample average approximation method for stochastic discrete optimization. SIAM Journal on Optimization 12(2):479--502

  37. [37]

    Management Science 45(3):346--363

    Lariviere MA, Porteus EL (1999) Stalking information: Bayesian inventory management with unobserved lost sales. Management Science 45(3):346--363

  38. [38]

    Lattimore T, Szepesv \'a ri C (2020) Bandit algorithms (Cambridge University Press)

  39. [39]

    Journal of the Operational Research Society 72(8):1879--1897

    Lee S, Kim H, Moon I (2021) A data-driven distributionally robust newsvendor model with a wasserstein ambiguity set. Journal of the Operational Research Society 72(8):1879--1897

  40. [40]

    Operations Research 63(6):1294--1306

    Levi R, Perakis G, Uichanco J (2015) The data-driven newsvendor problem: New bounds and insights. Operations Research 63(6):1294--1306

  41. [41]

    Mathematics of Operations Research 32(4):821--839

    Levi R, Roundy RO, Shmoys DB (2007) Provably near-optimal sampling-based policies for stochastic inventory control models. Mathematics of Operations Research 32(4):821--839

  42. [42]

    Operations Research 70(4):1996--2012

    Lin M, Huh WT, Krishnan H, Uichanco J (2022) Data-driven newsvendor problem: Performance of the sample average approximation. Operations Research 70(4):1996--2012

  43. [43]

    Operations Research Letters 33(4):341--348

    Liyanage LH, Shanthikumar JG (2005) A practical inventory control policy using operational statistics. Operations Research Letters 33(4):341--348

  44. [44]

    Operations Research 56(4):1034--1038

    Lu X, Song JS, Zhu K (2008) Analysis of perishable-inventory systems with censored demand data. Operations Research 56(4):1034--1038

  45. [45]

    INFORMS Journal on Optimization 6(2):63--83

    Lugosi G, Markakis MG, Neu G (2024) On the hardness of learning from censored and nonstationary demand. INFORMS Journal on Optimization 6(2):63--83

  46. [46]

    The Annals of Probability 1269--1283

    Massart P (1990) The tight constant in the dvoretzky-kiefer-wolfowitz inequality. The Annals of Probability 1269--1283

  47. [47]

    Manufacturing & Service Operations Management 17(3):335--349

    Mersereau AJ (2015) Demand estimation from censored observations with inventory record inaccuracy. Manufacturing & Service Operations Management 17(3):335--349

  48. [48]

    Naval Research Logistics (NRL) 41(6):739--757

    Nahmias S (1994) Demand estimation in lost sales inventory systems. Naval Research Logistics (NRL) 41(6):739--757

  49. [49]

    Management Science 64(7):3146--3167

    Natarajan K, Sim M, Uichanco J (2018) Asymmetry and ambiguity in newsvendor models. Management Science 64(7):3146--3167

  50. [50]

    Operations Research 56(1):188--203

    Perakis G, Roels G (2008) Regret in the newsvendor model with partial information. Operations Research 56(1):188--203

  51. [51]

    Journal of the American Statistical Association 98(464):1001--1012

    Portnoy S (2003) Censored regression quantiles. Journal of the American Statistical Association 98(464):1001--1012

  52. [52]

    Journal of econometrics 32(1):143--155

    Powell JL (1986) Censored regression quantiles. Journal of econometrics 32(1):143--155

  53. [53]

    Operations Research 64(1):167--185

    Saghafian S, Tomlin B (2016) The newsvendor under demand ambiguity: Combining data with moment and tail information. Operations Research 64(1):167--185

  54. [54]

    ://www.kaggle.com/datasets/rohitsahoo/sales-forecasting, accessed: 2024-11-10

    Sahoo R (2023) Superstore sales dataset for sales forecasting. ://www.kaggle.com/datasets/rohitsahoo/sales-forecasting, accessed: 2024-11-10

  55. [55]

    On Reverse Pinsker Inequalities

    Sason I (2015) On reverse pinsker inequalities. arXiv preprint arXiv:1503.07118

  56. [56]

    The Annals of Mathematical Statistics 30(2):490--508

    Scarf H (1959) Bayes solutions of the statistical inventory problem. The Annals of Mathematical Statistics 30(2):490--508

  57. [57]

    Studies in the mathematical theory of inventory and production

    Scarf HE (1958) A min-max solution of an inventory problem. Studies in the mathematical theory of inventory and production

  58. [58]

    Operations Research 64(2):362--370

    Shi C, Chen W, Duenyas I (2016) Nonparametric data-driven algorithms for multiproduct inventory systems with censored demand. Operations Research 64(2):362--370

  59. [59]

    SIAM Journal on Scientific and Statistical Computing 4(4):706--711

    Tierney L (1983) A space-efficient recursive procedure for estimating a quantile of an unknown distribution. SIAM Journal on Scientific and Statistical Computing 4(4):706--711

  60. [60]

    Operations Research 60(2):313--334

    Vulcano G, Van Ryzin G, Ratliff R (2012) Estimating primary demand for substitutable products from sales transaction data. Operations Research 60(2):313--334

  61. [61]

    arXiv preprint arXiv:2404.11509

    Xie Y, Ma W, Xin L (2024) Vc theory for inventory policies. arXiv preprint arXiv:2404.11509

  62. [62]

    Manufacturing & Service Operations Management 24(1):504--523

    Xu L, Zheng Y, Jiang L (2022) A robust data-driven approach for the newsvendor problem with nonparametric information. Manufacturing & Service Operations Management 24(1):504--523

  63. [63]

    Journal of the American Statistical Association 80(392):1004--1011

    Yang SS (1985) A smooth nonparametric estimator of a quantile function. Journal of the American Statistical Association 80(392):1004--1011

  64. [64]

    Management Science 67(10):6089--6115

    Yuan H, Luo Q, Shi C (2021) Marrying stochastic gradient descent with bandits: Learning algorithms for inventory systems with fixed costs. Management Science 67(10):6089--6115

  65. [65]

    Management Science 70(4):2315--2329

    Zhang L, Yang J, Gao R (2024) Optimal robust policy for feature-based newsvendor. Management Science 70(4):2315--2329

  66. [66]

    Statistics & Probability Letters 45(1):79--84

    Zieli \'n ski R (1999) Best equivariant nonparametric estimator of a quantile. Statistics & Probability Letters 45(1):79--84

  67. [67]

    Zipkin P (2000) Foundations of inventory management (Maidenhead, England: Irwin Professional Publishing)