Pricing in Queues with Abandonments: Optimal Policies and Practical Heuristics
Pith reviewed 2026-05-22 14:27 UTC · model grok-4.3
The pith
In queues where customers abandon before service, optimal prices do not always increase with queue length.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Unlike traditional queueing systems without abandonments, we show that the optimal quoted prices do not always increase with the queue length in this setting. We fully characterize the possible structure of the optimal dynamic pricing policy and provide conditions guaranteeing that the optimal policy is increasing in the number of customers in the system. We introduce two heuristics that simplify the optimal dynamic pricing policy: the cutoff-static policy charges all admitted customers a fixed price up to a threshold, and the two-price policy charges one price for immediate service and another if the customer must wait. Both heuristics achieve near optimality in general.
What carries the argument
The dynamic pricing policy that selects a price for each arriving customer based on the current number in system, with the customer joining only if willingness-to-pay exceeds that price, under holding and abandonment costs.
If this is right
- The optimal policy may quote a lower price at a longer queue than at a shorter one under some parameter values.
- Sufficient conditions on costs and the willingness-to-pay distribution make the optimal policy monotone increasing in queue length.
- Both the cutoff-static heuristic and the two-price heuristic achieve long-run average profits close to those of the optimal policy.
- The two-price heuristic remains more robust than the cutoff-static heuristic across different parameter regimes.
Where Pith is reading between the lines
- The same structural results could guide pricing in other impatient-customer systems such as ride-hailing platforms or cloud computing queues.
- A natural next test is to measure how much profit is lost when the heuristics are applied to instances where the optimal policy is known to be non-monotone.
- Relaxing the fixed-distribution assumption so that willingness-to-pay can depend on observed queue length would likely alter the admissible policy structures.
Load-bearing premise
Each customer's willingness-to-pay is drawn independently from a fixed distribution that does not depend on the current system state or future prices.
What would settle it
For any chosen set of holding costs, abandonment penalties, and willingness-to-pay distribution, numerically solving the Markov decision process and obtaining a strictly increasing sequence of optimal prices for all queue lengths would refute the claim that prices need not increase.
read the original abstract
We investigate the optimal pricing strategy in a service-providing framework, where customers can leave the system prior to service completion. In this setting, a price is quoted to an incoming customer based on the current number of customers in the system. When the quoted price is lower than the price the incoming customer is willing to pay (which follows a fixed probability distribution), then the customer joins the system and a reward equal to the quoted price is earned. A cost is incurred upon abandonment and a holding cost is incurred for customers waiting to be served. Our goal is to determine the pricing policy that maximizes the long-run average profit. Unlike traditional queueing systems without abandonments, we show that the optimal quoted prices do not always increase with the queue length in this setting. We fully characterize the possible structure of the optimal dynamic pricing policy and provide conditions guaranteeing that the optimal policy is increasing in the number of customers in the system. Moreover, we introduce two heuristics that simplify the optimal dynamic pricing policy. Both heuristics admit customers until the number of customers in the system reaches a certain threshold. The cutoff-static policy charges all admitted customers a fixed price while the two-price policy charges one price when the arriving customer can enter service immediately and another price if the customer needs to wait. By selecting the price(s) and threshold that maximize the long-run average profit, both heuristics achieve near optimality in general and the two-price policy provides more robustness compared to the cutoff-static policy.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies optimal dynamic pricing in a single-server queue with linear abandonment rates, where an arriving customer is quoted a state-dependent price and joins if it is below their willingness-to-pay drawn from a fixed distribution. The objective is long-run average profit maximization, accounting for holding costs and abandonment penalties. The central claims are that, unlike abandonment-free systems, the optimal quoted price need not be monotonically increasing in queue length; the authors provide a full structural characterization of the optimal policy via average-reward MDP analysis on the birth-death process and derive sufficient conditions for monotonicity. They further introduce and numerically evaluate two threshold-based heuristics (cutoff-static and two-price) that achieve near-optimal performance.
Significance. If the structural results hold, the work usefully extends the MDP queue-control literature by isolating the effect of abandonments on policy monotonicity and by supplying simple, implementable heuristics whose performance is quantified across parameter regimes. The average-reward formulation and first-difference analysis of the relative value function are standard yet yield concrete, falsifiable predictions about admissible policy structures; the numerical comparison of the heuristics against the optimal policy adds practical value.
major comments (1)
- [§3.2] §3.2, optimality equations (3)–(5): the first-difference analysis establishing possible sign changes in the marginal value function is the load-bearing step for the non-monotonicity claim; the manuscript should explicitly verify that the abandonment term can indeed produce a non-monotone difference sequence for some parameter values (e.g., high abandonment rate relative to holding cost).
minor comments (3)
- [§5] The numerical section reports average optimality gaps but does not tabulate the frequency or magnitude of non-monotonicity instances; adding a column or figure showing when the optimal policy deviates from monotonicity would strengthen the empirical support for the theoretical claim.
- [§2] Notation for the relative value function and its first differences is introduced without a consolidated table; a short notation summary would improve readability.
- [§4.2] The two-price heuristic is defined by a single threshold and two prices; the manuscript should clarify whether the immediate-service price is always lower than the waiting price or whether the ordering is optimized without restriction.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive feedback. We address the single major comment below and will incorporate the requested verification in the revised manuscript.
read point-by-point responses
-
Referee: [§3.2] §3.2, optimality equations (3)–(5): the first-difference analysis establishing possible sign changes in the marginal value function is the load-bearing step for the non-monotonicity claim; the manuscript should explicitly verify that the abandonment term can indeed produce a non-monotone difference sequence for some parameter values (e.g., high abandonment rate relative to holding cost).
Authors: We agree that an explicit numerical verification would strengthen the presentation of the non-monotonicity result. In the revised version we will add a short paragraph (or small table) immediately after the first-difference analysis in §3.2. For concrete parameter values with high abandonment rate relative to holding cost (e.g., abandonment rate 2.0, holding cost 0.5, service rate 1.0, and a standard exponential willingness-to-pay distribution), we will solve the average-reward optimality equations numerically and display the first differences of the relative value function, confirming that the abandonment term produces a sign change and hence a non-monotone optimal price sequence. This addition will be purely illustrative and will not alter the existing structural theorems. revision: yes
Circularity Check
No significant circularity identified
full rationale
The paper derives its main results on non-monotonic optimal prices and policy structure directly from standard average-reward MDP optimality equations applied to a birth-death process incorporating linear abandonment rates and a fixed willingness-to-pay distribution. The first-difference analysis of the relative value function and characterization of admissible policies follow from these equations without reduction to fitted inputs, self-citations that carry the central claim, or any renaming of known results as new derivations. The work is self-contained against external benchmarks of MDP theory and does not invoke author-specific uniqueness theorems or ansatzes.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Customer willingness-to-pay is drawn i.i.d. from a fixed distribution independent of system state
- domain assumption Arrivals, service, and abandonment times permit a Markovian state description
Reference graph
Works this paper leans on
-
[1]
+ 𝑁∑︁ 𝑗=0 P(𝑗|1, 𝜆 ∗ 1)ℎ(𝑗) −𝑟 0(𝜆∗
-
[2]
− 𝑁∑︁ 𝑗=0 P(𝑗|0, 𝜆 ∗ 1)ℎ(𝑗) =𝑟1(𝜆∗
-
[3]
+𝜆 ∗ 1ℎ(2) +𝛾 1ℎ(0) + (1−𝜆 ∗ 1 −𝛾 1)ℎ(1) − [𝑟 0(𝜆∗
-
[4]
+𝜆 ∗ 1ℎ(1) + (1−𝜆 ∗ 1)ℎ(0)] =𝑟1(𝜆∗
-
[5]
+𝜆 ∗ 1(ℎ(2) −ℎ(1)) + (1−𝜆 ∗ 1 − (𝜇+𝜃 𝑠)) (ℎ(1) −ℎ(0)) =− (𝑐 ℎ +𝑐 𝑠𝜃𝑠) +𝜆 ∗ 1 [ℎ(2) −ℎ(1) − (ℎ(1) −ℎ(0))] + (1− (𝜇+𝜃 𝑠)) (ℎ(1) −ℎ(0)) =𝜆∗ 1Δ2ℎ(0) +Δℎ(0) − (𝑐 ℎ +𝑐 𝑠𝜃𝑠) − (𝜇+𝜃 𝑠)Δℎ(0) =𝜆∗ 1Δ2ℎ(0) +Δℎ(0) −Δ𝑤(0).(EC.10) Hence,Δℎ(0) ≤𝜆 ∗ 1Δ2ℎ(0) +Δℎ(0) −Δ𝑤(0)and we immediately haveΔ𝑤(0) ≤𝜆 ∗ 1Δ2ℎ(0).Similarly, from (EC.2) and (EC.9), we have Δℎ(0)=ℎ(1) −ℎ(0)=𝑔...
-
[6]
+ 𝑁∑︁ 𝑗=0 P(𝑗|1, 𝜆 ∗ 0)ℎ(𝑗) −𝑟 0(𝜆∗
-
[7]
− 𝑁∑︁ 𝑗=0 P(𝑗|0, 𝜆 ∗ 0)ℎ(𝑗) =𝑟1(𝜆∗
-
[8]
+𝜆 ∗ 0ℎ(2) +𝛾 1ℎ(0) + (1−𝜆 ∗ 0 −𝛾 1)ℎ(1) − [𝑟 0(𝜆∗
-
[9]
+𝜆 ∗ 0ℎ(1) + (1−𝜆 ∗ 0)ℎ(0)] =𝑟1(𝜆∗
-
[10]
+𝜆 ∗ 0(ℎ(2) −ℎ(1)) + (1−𝜆 ∗ 0 − (𝜇+𝜃 𝑠)) (ℎ(1) −ℎ(0)) ec6e-companion toDi, Andrad ´ottir, Ayhan:Pricing in Queues with Abandonments =− (𝑐 ℎ +𝑐 𝑠𝜃𝑠) +𝜆 ∗ 0 [ℎ(2) −ℎ(1) − (ℎ(1) −ℎ(0))] + (1− (𝜇+𝜃 𝑠)) (ℎ(1) −ℎ(0)) =𝜆∗ 0Δ2ℎ(0) +Δℎ(0) − (𝑐 ℎ +𝑐 𝑠𝜃𝑠) − (𝜇+𝜃 𝑠)Δℎ(0) =𝜆∗ 0Δ2ℎ(0) +Δℎ(0) −Δ𝑤(0),(EC.11) and consequentlyΔ𝑤(0) ≥𝜆 ∗ 0Δ2ℎ(0).Thus, we have𝜆 ∗ 0Δ2ℎ(0) ≤Δ𝑤...
-
[11]
+𝛾 𝑁−1 Δ2ℎ(𝑁−2) ≤0, which completes the proof.□ We then derive the required lemmas to prove Lemmas EC.1.2 and EC.1.3. To simplify the notation, recall the sign function sgn(𝑥)= 1 when𝑥 >0, 0 when𝑥=0, −1 when𝑥 <0. Also, when sgn(𝑥)=sgn(𝑦)and𝑎≥0, we have sgn(𝑥+𝑎𝑦)=sgn(𝑥)=sgn(𝑦). The following lemma shows that the quantities in Lemma EC.1.1...
-
[12]
sgn(Δ 2ℎ(𝑛))=sgn(Δ𝑤(𝑛) +𝛾 𝑛Δ2ℎ(𝑛−1))when𝑛=1, . . . , 𝑁−2
-
[13]
𝑚−1∑︁ 𝑛=0 𝑃𝑛 ( ®𝜆∗) ! ˜R𝑠 + 𝑁∑︁ 𝑛=𝑚 𝑃𝑛 ( ®𝜆∗) ! ˜R𝑞 # =(1−𝑃 Λ 𝑞 ) ˜R𝑠 +𝑃 Λ 𝑞 ˜R𝑞 −
sgn(Δ 2ℎ(0))=sgn(Δ𝑤(0)). Proof of Lemma EC.1.4From Lemma EC.1.1, if 1≤𝑛≤𝑁−2 andΔ 2ℎ(𝑛)>0, thenΔ𝑤(𝑛) + 𝛾𝑛Δ2ℎ(𝑛−1) ≥𝜆 ∗ 𝑛Δ2ℎ(𝑛)>0. Conversely, whenΔ𝑤(𝑛) +Δ 2ℎ(𝑛−1)>0, we have𝜆 ∗ 𝑛+1Δ2ℎ(𝑛) ≥ Δ𝑤(𝑛) +𝛾 𝑛Δ2ℎ(𝑛−1)>0. Since𝜆 ∗ 𝑛+1 >0, we concludeΔ 2ℎ(𝑛)>0. Similarly, we get the same result whenΔ 2ℎ(𝑛)<0. WhenΔ 2ℎ(𝑛)=0, we have 0=𝜆 ∗ 𝑛Δ2ℎ(𝑛) ≤Δ𝑤(𝑛) +𝛾 𝑛Δ2ℎ(𝑛−1) ≤𝜆...
work page 2013
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.