
arxiv: 2605.07080 · v1 · submitted 2026-05-08 · 💻 cs.AI · cs.DS

Online Allocation with Unknown Shared Supply

Pith reviewed 2026-05-11 01:26 UTC · model grok-4.3

classification 💻 cs.AI cs.DS
keywords online allocation · shared supply · approximation algorithms · inventory management · lost sales · fixed-charge costs · learning-augmented algorithms

The pith

A deterministic threshold-proportional policy achieves a 4/3-approximation for online allocation of unknown shared supply.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines the Online Shared Supply Allocation problem, in which a central allocator must distribute a fixed but unknown total supply to sites that face demands arriving one by one, paying fixed charges to ship and incurring permanent losses on unmet demand. It introduces the GPA policy, a simple deterministic rule that sets proportional thresholds to decide how much supply to send at each step, and proves that this rule stays within a 4/3 factor of the best offline plan plus an additive error that does not grow with supply size. The result is relevant to prepositioning tasks such as vaccine distribution or humanitarian logistics, where supply cannot be adjusted after the fact and stockouts are irreversible. Matching lower bounds show that both the ratio and the additive error are necessary in the worst case, even for randomized algorithms that know the total supply in advance. The authors further give a learning-augmented version of GPA that uses possibly inaccurate forecasts to improve performance while remaining no worse than the base policy if the forecasts are bad.

Core claim

We introduce the OSSA model for sequential adversarial demands against unknown shared supply with fixed-charge transportation costs and lost-sales penalties. The GPA threshold-proportional policy allocates supply by comparing each site's demand against dynamically set proportional thresholds and achieves a 4/3-approximation to the offline optimum up to an additive term independent of total supply. We prove matching lower bounds establishing that the ratio is tight and that the additive error cannot be removed even by randomized algorithms that know the supply in advance, and we give a robust learning-augmented extension that incorporates imperfect forecasts.

What carries the argument

The GPA policy: a deterministic threshold-proportional allocation rule that decides shipments by comparing realized demand to thresholds scaled proportionally across sites to balance transportation and lost-sales costs.

If this is right

  • Real-world systems facing irreversible stockouts, such as vaccine prepositioning or humanitarian aid, can allocate supply online without knowing the total in advance while staying within 4/3 of the best offline plan.
  • No algorithm, even randomized and given the total supply upfront, can guarantee a better worst-case ratio than 4/3.
  • The additive error term remains unavoidable regardless of whether the algorithm knows the supply size.
  • Imperfect forecasts from experts or models can be folded into the policy to improve typical performance without risking outcomes worse than the base GPA when the forecasts are poor.
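
The forecast trade-off in the last bullet can be written out as a pair of bounds. This is a transcription of the robustness and consistency statements as they appear in the paper's appendix for LA-GPA($\lambda$) with $\lambda \in (0, 1/3]$, not a new derivation. Here $\eta$ is the prediction error on total supply $s$ and site-wise total demands $D_i$, and $p$, $b_i$, $c_i$ are the paper's cost and site parameters, whose exact definitions are not reproduced on this page.

```latex
\begin{align*}
\eta &= |s - \hat{s}| + \sum_{i=1}^{n} |D_i - \hat{D}_i| \\
\text{Robustness:}\quad
\mathrm{cost}(\text{LA-GPA}(\lambda)) &\le \Bigl(1 + \frac{(1-\lambda)^2}{4\lambda}\Bigr)\,\mathrm{cost}(\mathrm{OPT})
  + O\Bigl(\sum_{i \in [n]} (b_i + c_i)\Bigr) \\
\text{Consistency:}\quad
\mathrm{cost}(\text{LA-GPA}(\lambda)) &\le (1+\lambda)\,\mathrm{cost}(\mathrm{OPT}) + 3\eta p
  + O\Bigl(\sum_{i \in [n]} p\,(b_i + c_i)\Bigr)
\end{align*}
```

At $\lambda = 1/3$ both multiplicative factors equal $4/3$, matching the base GPA guarantee. Smaller $\lambda$ tightens the consistency factor toward 1 when $\eta$ is small, at the price of a larger robustness factor.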

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same proportional-threshold idea could be tested on related online problems with unknown capacities, such as dynamic server allocation or energy dispatch.
  • In practice the additive error might be bounded explicitly by the number of sites or by a simple estimate of demand variability.
  • The forecast-robust extension provides a template for safe hybrid human-ML decision rules in other sequential allocation settings.

Load-bearing premise

Demands arrive sequentially and must be met immediately or lost forever, with no backlogging, no replenishment, and fixed per-shipment transportation charges.
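
To make the premise concrete, here is a minimal single-site cost ledger. The inventory update follows the paper's notation as far as it can be read from its appendix (leftover $r = (k - d)^+$, next inventory $= r + \ell$). The cost structure is an assumption made for illustration: a fixed charge `w` per shipment batch of up to `c` units, and a per-unit lost-sales penalty `p`. The function name and signature are hypothetical and do not come from the authors' code.

```python
import math


def simulate_site(demands, shipments, w, c, p, k0=0):
    """Return (transport_cost, penalty_cost) for one site.

    Assumed model (illustrative, not the paper's exact definition):
      - demand d hits the current inventory k; unmet demand is lost forever
      - leftover r = max(k - d, 0) carries over (no backlogging)
      - a shipment of ell units costs w per batch of up to c units
      - each lost unit costs p
    """
    k = k0
    transport = 0.0
    penalty = 0.0
    for d, ell in zip(demands, shipments):
        unmet = max(d - k, 0)          # lost sales this step
        penalty += p * unmet
        r = max(k - d, 0)              # leftover inventory
        transport += w * math.ceil(ell / c)  # fixed charge per batch
        k = r + ell                    # inventory entering the next step
    return transport, penalty


# Three steps: ship 3 units after the first demand, nothing afterwards.
costs = simulate_site(demands=[2, 3, 1], shipments=[3, 0, 0], w=5, c=4, p=2)
```

In this toy run the first demand is lost entirely because the site starts empty, which is the irreversibility the premise describes: shipping later cannot recover it.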

What would settle it

A concrete sequence of site demands in which the total cost of GPA (or any online policy) exceeds 4/3 times the offline optimum by more than a constant independent of supply size.
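
One way to look for such a sequence empirically is sketched below. It assumes the costs of the online policy and of the offline optimum are computed elsewhere (for example by a simulator and an offline solver); the helper names and the `additive_budget` parameter are hypothetical. A finite experiment can only suggest a violation, since the guarantee allows any constant independent of supply.

```python
def excess_over_bound(cost_alg, cost_opt):
    """Amount by which cost_alg exceeds (4/3) * cost_opt."""
    return cost_alg - 4 * cost_opt / 3


def flags_violation(records, additive_budget):
    """records: iterable of (supply, cost_alg, cost_opt) tuples.

    Returns True if any instance exceeds the 4/3 bound by more than
    additive_budget, a candidate value for the supply-independent constant.
    """
    return any(
        excess_over_bound(cost_alg, cost_opt) > additive_budget
        for _supply, cost_alg, cost_opt in records
    )


# Made-up numbers purely to show the call shape, not experimental results.
example_records = [(10, 14.0, 9.0), (100, 140.0, 99.0)]
```

If the excess keeps growing as supply is scaled up while the site parameters stay fixed, that pattern, rather than any single instance, would be the evidence against the claim.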

Figures

Figures reproduced from arXiv: 2605.07080 by Davin Choo, Mengchu Yue, Milind Tambe, Tzeh Yuan Neoh.

Figure 1
Figure 1: Consider an example with $n = 6$ sites. Suppose OPT's supply allocation proportion $\bar{\gamma}^\star$ (black crosses) has pivotal site $i^\star = 4$ with $\gamma^\star_4 = \zeta$, and an online algorithm ALG allocates proportional allocation $\bar{\gamma}$ (blue circles). For each site $i$, ALG incurs additional sitewise penalty for unmet demand when $\gamma^\star_i > \gamma_i$ and additional sitewise transport cost when $\gamma^\star_i < \gamma_i$. A threshold-proportional online allocat…
Figure 2
Figure 2: (Left) Competitive ratio $\alpha$ as a function of the hyperparameter $\tau$. The dotted blue and red curves denote the upper bounds from Lemma 2 and Lemma 3 respectively. The solid green curve represents the overall competitive ratio, given by the pointwise maximum of the two bounds, and is minimized at $\tau = 1/3$. For example, if one is willing to tolerate a competitive ratio of $\alpha = 1.8$, any choice of $\tau \in [0.2, 0.8]$ s…
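
The shape of the green curve can be reproduced numerically. In the sketch below, the first bound, $1 + (1-\tau)^2/(4\tau)$, is the factor the paper's appendix attributes to the site-wise proof in Lemma 2. The second bound, $1 + \tau$, is an assumption about Lemma 3: it is not stated on this page, but it is consistent with the caption's numbers (minimum at $\tau = 1/3$, and $\alpha = 1.8$ at both $\tau = 0.2$ and $\tau = 0.8$). The function name is hypothetical.

```python
def ratio_bound(tau):
    """Pointwise maximum of the two upper bounds on the competitive ratio."""
    if not 0 < tau <= 1:
        raise ValueError("tau must lie in (0, 1]")
    lemma2 = 1 + (1 - tau) ** 2 / (4 * tau)  # factor attributed to Lemma 2
    lemma3 = 1 + tau                         # ASSUMED form of the Lemma 3 bound
    return max(lemma2, lemma3)


# Coarse grid search for the minimizing tau.
grid = [i / 1000 for i in range(1, 1000)]
best_tau = min(grid, key=ratio_bound)

# Endpoints quoted in the caption for a tolerated ratio of 1.8.
at_low, at_high = ratio_bound(0.2), ratio_bound(0.8)
```

Under this assumed pair of bounds the two curves cross at $\tau = 1/3$, where both equal $4/3$; to the left the Lemma 2 term dominates and to the right the $1 + \tau$ term does.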
Figure 3
Figure 3: Top row: Synthetic experiments over 30 runs. Bottom row: Real-world taxi dataset [NYC] repurposed for OSSA. The left column records cost incurred by all policies as a function of the total supply available. Meanwhile, the ratios cost(ALG)/cost(OPT) are separated across two plots for visual clarity. The middle plot compares the ratio of GPA (blue) with other baselines while the right plot compares GPA with …
Figure 4
Figure 4: Synthetic experiments over 30 runs on an OSSA instance with $n = 50$ sites and $T = 10{,}000$ time steps. To simulate different distributions of weights, each set of experiments uses weights drawn independently from a $\beta(x, x)$ distribution for different parameters $x \in \{0.5, 1.0, 2.0\}$. See Section B.2 for further details. where $D_i = \sum_{t=1}^{T} d_i^t$ is the total demand at site $i$. We repeat each setting over 30 independe…
Figure 5
Figure 5: Real-world inspired experiment by repurposing the NYC taxi dataset.
Original abstract

Many real-world resource allocation systems, such as humanitarian logistics and vaccine distribution, must preposition limited supply across multiple locations before demand is realized while stockouts incur irreversible service losses. To study this, we introduce the Online Shared Supply Allocation (OSSA) problem, a stateful online model in which a central hub allocates a finite, unknown supply to multiple sites facing sequential demand under fixed-charge transportation costs and lost-sales penalties. Unlike classical make-to-stock or make-to-order inventory models, OSSA precludes backlogging and replenishment only hedges against future demand. To tackle OSSA, we propose a deterministic threshold-proportional policy GPA and prove that it achieves a $4/3$-approximation to the offline optimum up to an additive term independent of the total supply. We complement this with matching lower bounds showing that the $4/3$ ratio is tight and that the additive-error dependence is unavoidable, even for randomized algorithms that know the total supply upfront. Finally, we develop a learning-augmented extension to GPA that principally incorporates imperfect forecasts (e.g., from human experts or ML models) commonly available in practice, enabling us to exploit high-quality advice while being robust against arbitrary bad ones. Synthetic and real-world experiments show that GPA outperforms natural baselines with global supply is scarce.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript introduces the Online Shared Supply Allocation (OSSA) problem, a stateful online model in which a central hub allocates finite unknown supply to multiple sites facing sequential adversarial demands under fixed-charge transportation costs and lost-sales penalties (no backlogging or replenishment). The authors propose a deterministic threshold-proportional policy called GPA and claim to prove that it achieves a 4/3-approximation to the offline optimum up to an additive error term independent of total supply. They provide matching lower bounds (including for randomized algorithms that know total supply), develop a learning-augmented extension that exploits high-quality forecasts while remaining robust to poor advice, and report synthetic and real-world experiments showing outperformance over baselines when supply is scarce.

Significance. If the claimed proofs are correct, the result is significant for online algorithms and inventory/logistics applications. The 4/3 ratio with additive error independent of supply size, combined with tightness via matching lower bounds, provides a clean benchmark for shared-supply allocation under adversarial demands. The learning-augmented variant adds practical value by handling imperfect forecasts from experts or ML models. The model assumptions are explicitly stated, and the central claim is not circular (the offline optimum is independently defined).

minor comments (3)
  1. Abstract: the final sentence contains a grammatical error ('with global supply is scarce' should read 'when global supply is scarce').
  2. Abstract: a one-sentence comparison to classical make-to-stock or make-to-order models would help readers immediately see what is new about precluding backlogging and replenishment.
  3. The paper claims proofs for the 4/3-approximation, tightness, and lower bounds, but the provided text does not include the detailed derivations; these should be clearly labeled (e.g., Theorem 1, Theorem 2) with all steps shown.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work on the OSSA problem, the GPA policy, and its learning-augmented extension. We appreciate the recognition of the significance of the 4/3-approximation result with matching lower bounds and the practical value of the forecast-robust variant. We will prepare a revised manuscript addressing any minor points.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper defines the OSSA model with an independently specified offline optimum, introduces the GPA policy as a threshold-proportional rule, and derives the 4/3-approximation via standard competitive analysis with separate matching lower bounds. No step reduces a claimed prediction or guarantee to a fitted parameter, self-definition, or load-bearing self-citation chain; the offline benchmark and ratio are not constructed from the online policy itself. The learning-augmented variant similarly balances advice without circular reduction. This is a self-contained algorithmic result under the stated assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central contribution rests on defining a new problem model and designing an algorithm with approximation guarantees under standard online computation assumptions; no free parameters are fitted in the stated results.

axioms (2)
  • domain assumption: Demands arrive sequentially without knowledge of future requests or total supply.
    Core to the online nature of OSSA as described.
  • domain assumption: Replenishment is not possible after initial allocation; only hedging against future demand is allowed.
    Distinguishes OSSA from classical inventory models with backlogging.
invented entities (2)
  • OSSA problem (no independent evidence)
    purpose: Models allocation of unknown shared finite supply with fixed-charge costs and lost-sales penalties.
    Newly introduced model in the paper.
  • GPA policy (no independent evidence)
    purpose: Deterministic threshold-proportional allocation rule for OSSA.
    Newly proposed algorithm with proven guarantee.

pith-pipeline@v0.9.0 · 5529 in / 1633 out tokens · 57232 ms · 2026-05-11T01:26:50.096686+00:00 · methodology

