Inpatient Overflow Management with Proximal Policy Optimization

Jim Dai; Jingjing Sun; Pengyi Shi

arxiv: 2410.13767 · v5 · submitted 2024-10-17 · 🧮 math.OC

Inpatient Overflow Management with Proximal Policy Optimization

Jingjing Sun , Jim Dai , Pengyi Shi This is my paper

Pith reviewed 2026-05-23 19:13 UTC · model grok-4.3

classification 🧮 math.OC

keywords inpatient overflow managementproximal policy optimizationatomic actionshospital patient flowqueueing-informed approximationtime-periodic decisionsreinforcement learningpartially shared policy network

0 comments

The pith

Proximal policy optimization with atomic actions manages inpatient overflow decisions at the scale of twenty patient classes and twenty wards.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a reinforcement learning method based on proximal policy optimization that assigns arriving patients to primary or overflow wards while respecting time-periodic arrival patterns and long-run average costs. It tackles the combinatorial explosion of state and action spaces by decomposing simultaneous multi-patient decisions into sequential atomic actions and by adding a partially shared policy network plus a queueing-informed value approximation. These changes allow the algorithm to produce competitive policies using far less simulation data than standard reinforcement learning. Case studies show the resulting policies match or exceed those from existing benchmarks, including approximate dynamic programming that cannot run beyond five wards. If the approach holds, hospital operators gain a practical, scalable tool for real-time overflow management in large systems.

Core claim

The authors develop a scalable PPO framework for time-periodic inpatient overflow management that introduces atomic actions to decompose multi-patient routing into sequential assignments, employs a partially-shared policy network to balance parameter sharing with time-specific adaptations, and uses a queueing-informed value function approximation to improve evaluation. In systems with up to twenty patient classes and twenty wards, the resulting policies match or outperform benchmarks while approximate dynamic programming becomes computationally infeasible beyond five wards, and the method reduces the volume of simulation data required.

What carries the argument

Atomic actions that break multi-patient routing into sequential single-patient assignments, inside a PPO loop augmented by a partially-shared policy network and queueing-informed value approximation.

If this is right

Overflow decisions become feasible for hospital systems an order of magnitude larger than those handled by dynamic programming methods.
The volume of simulation runs needed to train effective policies drops substantially compared with standard reinforcement learning.
Domain-specific adaptations such as atomic actions and queueing value estimates matter more for performance than further neural-network tuning.
The resulting policies remain explainable enough for managerial review while operating in long-run average-cost settings with periodic demand.
The same decomposition and approximation pattern can be reused for other periodic resource-allocation problems that share the same combinatorial structure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The atomic-action decomposition may simplify reinforcement learning for other matching problems that currently suffer from exponential action spaces.
If the queueing-informed approximation generalizes, similar value-function shortcuts could accelerate learning in other queueing-control domains.
Real-time hospital data streams could be used to test whether the long-run average-cost objective aligns with short-term operational targets.
The partially-shared network design offers a template for balancing global and time-local policies in other periodic scheduling tasks.

Load-bearing premise

The queueing-informed value function approximation and partially-shared policy network together preserve near-optimal performance on the original multi-patient problem without introducing bias that grows with system size.

What would settle it

Apply the trained policy to a simulated system with twenty-five wards and twenty-five patient classes and measure whether the achieved average cost remains within a small percentage of the best available benchmark or lower bound.

read the original abstract

Problem Definition: Managing inpatient flow in large hospital systems is challenging due to the complexity of assigning randomly arriving patients -- either waiting for primary units or being overflowed to alternative units. Current practices rely on ad-hoc rules, while prior analytical approaches struggle with the intractably large state and action spaces inherent in patient-unit matching. Scalable decision support is needed to optimize overflow management while accounting for time-periodic fluctuations in patient flow. Methodology/Results: We develop a scalable decision-making framework using Proximal Policy Optimization (PPO) to optimize overflow decisions in a time-periodic, long-run average cost setting. To address the combinatorial complexity, we introduce atomic actions, which decompose multi-patient routing into sequential assignments. We further enhance computational efficiency through a partially-shared policy network designed to balance parameter sharing with time-specific policy adaptations, and a queueing-informed value function approximation to improve policy evaluation. Our method significantly reduces the need for extensive simulation data, a common limitation in reinforcement learning applications. Case studies on hospital systems with up to twenty patient classes and twenty wards demonstrate that our approach matches or outperforms existing benchmarks, including approximate dynamic programming, which is computationally infeasible beyond five wards. Managerial Implications: Our framework offers a scalable, efficient, and explainable solution for managing patient flow in complex hospital systems. More broadly, our results highlight that domain-aware adaptation is more critical to improving algorithm performance than fine-tuning neural network parameters when applying general-purpose algorithms to specific applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adapts PPO with atomic actions, partially-shared policies, and queueing VFA to scale inpatient overflow decisions to 20 wards where ADP fails, but the abstract supplies no numbers or validation details.

read the letter

The core contribution here is a set of PPO modifications tailored to time-periodic patient-unit matching: atomic actions that turn simultaneous multi-patient routing into sequential decisions, a partially-shared network that reuses parameters across periods while allowing time-specific adjustments, and a queueing-informed value function approximation. These let the method handle systems with 20 patient classes and 20 wards, where standard ADP becomes intractable beyond five wards, and the abstract states it matches or beats the benchmarks on smaller cases while cutting simulation data requirements. That combination is new relative to the cited prior work and directly targets the combinatorial and periodic structure of the problem rather than relying on generic RL scaling tricks. The domain-aware choices are the real strength; they show how to embed queueing knowledge into the value estimate and policy architecture without exploding the parameter count. On the downside, the abstract gives no quantitative results, no description of how the simulation instances were generated or validated against real hospital data, and no error bars or statistical comparisons. Without those, it is impossible to judge whether the approximations stay unbiased as the system grows or whether the reported gains are robust. The managerial claim that domain adaptation matters more than hyperparameter tuning is plausible but rests on the same unshown experiments. This work is aimed at operations researchers who apply RL to periodic stochastic assignment problems in healthcare or similar settings. A reader already working on ADP or RL for matching would find the specific adaptations worth examining. It deserves peer review because the problem is well-motivated, the algorithmic ideas are concrete and falsifiable, and the scalability claim can be checked once the full experiments and code are available.

Referee Report

2 major / 2 minor

Summary. The paper develops a Proximal Policy Optimization (PPO) framework for inpatient overflow management in time-periodic hospital systems. It introduces atomic actions to decompose combinatorial patient-unit assignments, a partially-shared policy network for time-specific adaptations, and a queueing-informed value function approximation. Case studies claim that the method scales to 20 patient classes and 20 wards while matching or outperforming benchmarks including approximate dynamic programming (infeasible beyond 5 wards).

Significance. If the empirical claims hold with detailed validation, the work demonstrates a scalable RL approach for a high-dimensional combinatorial problem in healthcare operations by embedding queueing structure and domain knowledge, rather than relying solely on generic neural network tuning. This could inform practical decision support tools where ADP is intractable.

major comments (2)

[Abstract, §5] Abstract and §5 (case studies): the central claim that the approach 'matches or outperforms' ADP and other benchmarks for systems up to 20 wards is presented without any numerical performance metrics, cost values, overflow rates, error bars, statistical tests, or description of how simulation data were generated and validated. This absence makes it impossible to assess whether the atomic actions and queueing-informed VFA preserve performance without bias that grows with system size.
[§3.2] §3.2 (atomic actions): the assertion that atomic actions preserve optimality of the original multi-patient assignment problem is stated as an axiom but lacks a formal proof or counter-example analysis showing that sequential decomposition does not alter the long-run average cost for the time-periodic MDP.

minor comments (2)

[§4] Notation for the partially-shared policy network parameters is introduced without a clear diagram or pseudocode showing which layers are shared versus time-period specific.
[§6] The managerial implications section repeats the abstract's claim about domain-aware adaptation without referencing specific ablation results that isolate the contribution of the queueing-informed VFA versus the policy network.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important areas for improving the clarity and rigor of our presentation. We respond to each major comment below, indicating the revisions we plan to make.

read point-by-point responses

Referee: [Abstract, §5] Abstract and §5 (case studies): the central claim that the approach 'matches or outperforms' ADP and other benchmarks for systems up to 20 wards is presented without any numerical performance metrics, cost values, overflow rates, error bars, statistical tests, or description of how simulation data were generated and validated. This absence makes it impossible to assess whether the atomic actions and queueing-informed VFA preserve performance without bias that grows with system size.

Authors: We agree with this observation. The full case studies in §5 contain detailed simulation results, but these were not sufficiently summarized in the abstract or highlighted with specific metrics. In the revised version, we will update the abstract to include key numerical findings such as average costs and overflow rates for the 20-ward systems, along with comparisons to benchmarks. Additionally, we will expand §5 to include tables with performance metrics, error bars from multiple simulation runs, statistical significance tests, and a detailed description of the simulation setup and validation procedures. revision: yes
Referee: [§3.2] §3.2 (atomic actions): the assertion that atomic actions preserve optimality of the original multi-patient assignment problem is stated as an axiom but lacks a formal proof or counter-example analysis showing that sequential decomposition does not alter the long-run average cost for the time-periodic MDP.

Authors: We acknowledge that a more rigorous justification is needed. The atomic actions are intended to decompose the combinatorial action space without changing the underlying decision problem, as each sequence of atomic assignments corresponds to a feasible multi-patient assignment. We will revise §3.2 to include a formal argument demonstrating that the long-run average cost is preserved under this decomposition in the time-periodic MDP, or provide counter-example analysis if applicable to clarify the conditions under which optimality is maintained. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper introduces a PPO algorithm with atomic actions, partially-shared policy network, and queueing-informed VFA as explicit algorithmic constructions for the overflow MDP. Performance claims rest on direct empirical comparison to ADP and other benchmarks on external hospital instances (up to 20 wards), not on any fitted parameter or self-citation that is redefined as the result. No equation reduces the reported policy quality to its own inputs by construction, and the method is presented as a new scalable approximation whose validity is tested outside the fitting process.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claim rests on standard MDP assumptions plus three domain-specific modeling choices whose independent support is not supplied in the abstract.

axioms (2)

domain assumption The inpatient overflow problem can be modeled as a time-periodic Markov decision process with long-run average cost.
Invoked when the authors formulate the problem as a long-run average cost setting with time-periodic fluctuations.
ad hoc to paper Atomic actions preserve optimality of the original multi-patient assignment problem.
Introduced to address combinatorial complexity; no proof or reference given in abstract.

invented entities (2)

atomic actions no independent evidence
purpose: Decompose multi-patient routing into sequential single-patient assignments to reduce action space size.
New modeling device introduced in the methodology section of the abstract.
partially-shared policy network no independent evidence
purpose: Balance parameter sharing across time periods with time-specific adaptations.
Introduced to improve computational efficiency for periodic dynamics.

pith-pipeline@v0.9.0 · 5790 in / 1563 out tokens · 23576 ms · 2026-05-23T19:13:56.562353+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 2 internal anchors

[1]

, " * write output.state after.block = add.period write newline

ENTRY address author booktitle chapter doi edition editor eid howpublished institution isbn issn journal key month note number organization pages publisher school series title type url volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.all := #1...

work page
[2]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in "" FUNCTION format.date year ...

work page
[3]

European journal of internal medicine 20(8):764--767

Alameda C, Su \'a rez C (2009) Clinical outcomes in medical outliers admitted to hospital with heart failure. European journal of internal medicine 20(8):764--767

work page 2009
[4]

Federated Learning with Personalization Layers

Arivazhagan MG, Aggarwal V, Singh AK, Choudhary S (2019) Federated learning with personalization layers. arXiv preprint arXiv:1912.00818

work page internal anchor Pith review Pith/arXiv arXiv 2019
[5]

Management Science 70(7):4893--4911

Bertsimas D, Pauphilet J (2024) Hospital-wide inpatient flow optimization. Management Science 70(7):4893--4911

work page 2024
[6]

Manufacturing & Service Operations Management 17(2):157--176

Best TJ, Sand k c B, Eisenstein DD, Meltzer DO (2015) Managing hospital inpatient bed capacity through partitioning care into focused wings. Manufacturing & Service Operations Management 17(2):157--176

work page 2015
[7]

A Primal-Dual Approach to Constrained Markov Decision Processes

Chen Y, Dong J, Wang Z (2021) A primal-dual approach to constrained markov decision processes. arXiv preprint arXiv:2101.10895

work page arXiv 2021
[8]

Production and Operations Management 31(11):4038--4056

Cire AA, Diamant A (2022) Dynamic scheduling of home care patients to medical providers. Production and Operations Management 31(11):4038--4056

work page 2022
[9]

Stochastic Systems 12(1):30--67

Dai JG, Gluzman M (2022) Queueing network controls via deep reinforcement learning. Stochastic Systems 12(1):30--67

work page 2022
[10]

Manufacturing & Service Operations Management 21(4):894--911

Dai JG, Shi P (2019) Inpatient overflow: An approximate dynamic programming approach. Manufacturing & Service Operations Management 21(4):894--911

work page 2019
[11]

Dean A (2024) Learning, Matching, and Allocation Algorithms for Healthcare and Revenue Management Problems with Reusable Resources. Ph.D. thesis

work page 2024
[12]

o rg \"u l \

Dong J, G \"o rg \"u l \"u B, Sarhangian V (2025) Multiclass queue scheduling under slowdown: An approximate dynamic programming approach. arXiv preprint arXiv:2501.10523

work page arXiv 2025
[13]

Columbia Business School Research Paper Forthcoming

Dong J, Shi P, Zheng F, Jin X (2019) Off-service placement in inpatient ward network: Resource pooling versus service slowdown. Columbia Business School Research Paper Forthcoming

work page 2019
[14]

2021 American Control Conference (ACC), 3743--3748 (IEEE)

Feng J, Gluzman M, Dai JG (2021) Scalable deep reinforcement learning for ride-hailing. 2021 American Control Conference (ACC), 3743--3748 (IEEE)

work page 2021
[15]

Manufacturing & Service Operations Management 26(2):447--464

Gao X, Kong N, Griffin P (2024) Shortening emergency medical response time with joint operations of uncrewed aerial vehicles with ambulances. Manufacturing & Service Operations Management 26(2):447--464

work page 2024
[16]

Manufacturing & Service Operations Management 23(1):139--154

Izady N, Mohamed I (2021) A clustered overflow configuration of inpatient beds in hospitals. Manufacturing & Service Operations Management 23(1):139--154

work page 2021
[17]

Emergency Medicine Journal 39(3):168--173

Jones S, Moulton C, Swift S, Molyneux P, Black S, Mason N, Oakley R, Mann C (2022) Association between delays to patient admission from the emergency department and all-cause 30-day mortality. Emergency Medicine Journal 39(3):168--173

work page 2022
[18]

Transportation Science 58(4):841--859

Khorasanian D, Patrick J, Saur \'e A (2024) Dynamic home care routing and scheduling with uncertain number of visits per referral. Transportation Science 58(4):841--859

work page 2024
[19]

Management Science 61(1):19--38

Kim SH, Chan CW, Olivares M, Escobar G (2015) Icu admission control: An empirical study of capacity allocation and its implication for patient outcomes. Management Science 61(1):19--38

work page 2015
[20]

2020 fourth world conference on smart trends in systems, security and sustainability (WorldS4), 794--797 (IEEE)

Kulkarni V, Kulkarni M, Pant A (2020) Survey of personalization techniques for federated learning. 2020 fourth world conference on smart trends in systems, security and sustainability (WorldS4), 794--797 (IEEE)

work page 2020
[21]

Management Science 70(11):7692--7711

Lim JM, Song H, Yang JJ (2024) The spillover effects of capacity pooling in hospitals. Management Science 70(11):7692--7711

work page 2024
[22]

Queueing systems 67:145--182

Liu Y, Whitt W (2011 a ) Large-time asymptotics for the g t/m t/st+ gi t many-server fluid queue with abandonment. Queueing systems 67:145--182

work page 2011
[23]

Stochastic Systems 1(2):340--410

Liu Y, Whitt W (2011 b ) Nearly periodic behavior in the overloaded g/d/s+ gi queue. Stochastic Systems 1(2):340--410

work page 2011
[24]

The International Journal of Health Planning and Management 38(3):805--828

Manning L, Islam MS (2023) A systematic review to identify the challenges to achieving effective patient flow in public hospitals. The International Journal of Health Planning and Management 38(3):805--828

work page 2023
[25]

Clinical and experimental emergency medicine 6(3):189

McKenna P, Heslin SM, Viccellio P, Mallon WK, Hernandez C, Morley EJ (2019) Emergency department and hospital crowding: causes, consequences, and cures. Clinical and experimental emergency medicine 6(3):189

work page 2019
[26]

Meyn SP, Tweedie RL (2012) Markov chains and stochastic stability (Springer Science & Business Media)

work page 2012
[27]

Processes 9(1):102

Perez HD, Hubbs CD, Li C, Grossmann IE (2021) Algorithmic approaches to inventory management optimization. Processes 9(1):102

work page 2021
[28]

Puterman ML (2014) Markov decision processes: discrete stochastic dynamic programming (John Wiley & Sons)

work page 2014
[29]

BMC medical informatics and decision making 13(1):1--19

Schmidt R, Geisler S, Spreckelsen C (2013) Decision support for hospital bed management using adaptable individual length of stay estimations and shared resources. BMC medical informatics and decision making 13(1):1--19

work page 2013
[30]

Proximal Policy Optimization Algorithms

Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347

work page internal anchor Pith review Pith/arXiv arXiv 2017
[31]

Management Science 62(1):1--28

Shi P, Chou MC, Dai JG, Ding D, Sim J (2016) Models and insights for hospital inpatient operations: Time-dependent ed boarding time. Management Science 62(1):1--28

work page 2016
[32]

Management Science 66(9):3825--3842

Song H, Tucker AL, Graue R, Moravick S, Yang JJ (2020) Capacity pooling in hospitals: The hidden consequences of off-service placement. Management Science 66(9):3825--3842

work page 2020
[33]

Medical journal of Australia 184(5):208--212

Sprivulis PC, Da Silva JA, Jacobs IG, Jelinek GA, Frazer AR (2006) The association between hospital overcrowding and mortality among patients admitted via western australian emergency departments. Medical journal of Australia 184(5):208--212

work page 2006
[34]

Scandinavian journal of trauma, resuscitation and emergency medicine 21:1--7

Stowell A, Claret PG, Sebbane M, Bobbia X, Boyard C, Genre Grandpierre R, Moreau A, de La Coussaye JE (2013) Hospital out-lying through lack of beds and its impact on care and patient outcome. Scandinavian journal of trauma, resuscitation and emergency medicine 21:1--7

work page 2013
[35]

BMJ open 7(5):e015676

Stylianou N, Fackrell R, Vasilakis C (2017) Are medical outliers associated with worse patient outcomes? a retrospective study within a regional nhs hospital using routine data. BMJ open 7(5):e015676

work page 2017
[36]

Interfaces 43(5):435--448

Thomas BG, Bollapragada S, Akbay K, Toledano D, Katlic P, Dulgeroglu O, Yang D (2013) Automated bed assignments in a complex and dynamic hospital environment. Interfaces 43(5):435--448

work page 2013
[37]

European Journal of Operational Research 313(1):373--386

Vanvuchelen N, De Boeck K, Boute RN (2024) Cluster-based lateral transshipments for the zambian health supply chain. European Journal of Operational Research 313(1):373--386

work page 2024
[38]

Health care management science 23:117--141

Zhang H, Best TJ, Chivu A, Meltzer DO (2020) Simulation-based optimization to improve hospital patient assignment to physicians and clinical units. Health care management science 23:117--141

work page 2020

[1] [1]

, " * write output.state after.block = add.period write newline

ENTRY address author booktitle chapter doi edition editor eid howpublished institution isbn issn journal key month note number organization pages publisher school series title type url volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.all := #1...

work page

[2] [2]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in "" FUNCTION format.date year ...

work page

[3] [3]

European journal of internal medicine 20(8):764--767

Alameda C, Su \'a rez C (2009) Clinical outcomes in medical outliers admitted to hospital with heart failure. European journal of internal medicine 20(8):764--767

work page 2009

[4] [4]

Federated Learning with Personalization Layers

Arivazhagan MG, Aggarwal V, Singh AK, Choudhary S (2019) Federated learning with personalization layers. arXiv preprint arXiv:1912.00818

work page internal anchor Pith review Pith/arXiv arXiv 2019

[5] [5]

Management Science 70(7):4893--4911

Bertsimas D, Pauphilet J (2024) Hospital-wide inpatient flow optimization. Management Science 70(7):4893--4911

work page 2024

[6] [6]

Manufacturing & Service Operations Management 17(2):157--176

Best TJ, Sand k c B, Eisenstein DD, Meltzer DO (2015) Managing hospital inpatient bed capacity through partitioning care into focused wings. Manufacturing & Service Operations Management 17(2):157--176

work page 2015

[7] [7]

A Primal-Dual Approach to Constrained Markov Decision Processes

Chen Y, Dong J, Wang Z (2021) A primal-dual approach to constrained markov decision processes. arXiv preprint arXiv:2101.10895

work page arXiv 2021

[8] [8]

Production and Operations Management 31(11):4038--4056

Cire AA, Diamant A (2022) Dynamic scheduling of home care patients to medical providers. Production and Operations Management 31(11):4038--4056

work page 2022

[9] [9]

Stochastic Systems 12(1):30--67

Dai JG, Gluzman M (2022) Queueing network controls via deep reinforcement learning. Stochastic Systems 12(1):30--67

work page 2022

[10] [10]

Manufacturing & Service Operations Management 21(4):894--911

Dai JG, Shi P (2019) Inpatient overflow: An approximate dynamic programming approach. Manufacturing & Service Operations Management 21(4):894--911

work page 2019

[11] [11]

Dean A (2024) Learning, Matching, and Allocation Algorithms for Healthcare and Revenue Management Problems with Reusable Resources. Ph.D. thesis

work page 2024

[12] [12]

o rg \"u l \

Dong J, G \"o rg \"u l \"u B, Sarhangian V (2025) Multiclass queue scheduling under slowdown: An approximate dynamic programming approach. arXiv preprint arXiv:2501.10523

work page arXiv 2025

[13] [13]

Columbia Business School Research Paper Forthcoming

Dong J, Shi P, Zheng F, Jin X (2019) Off-service placement in inpatient ward network: Resource pooling versus service slowdown. Columbia Business School Research Paper Forthcoming

work page 2019

[14] [14]

2021 American Control Conference (ACC), 3743--3748 (IEEE)

Feng J, Gluzman M, Dai JG (2021) Scalable deep reinforcement learning for ride-hailing. 2021 American Control Conference (ACC), 3743--3748 (IEEE)

work page 2021

[15] [15]

Manufacturing & Service Operations Management 26(2):447--464

Gao X, Kong N, Griffin P (2024) Shortening emergency medical response time with joint operations of uncrewed aerial vehicles with ambulances. Manufacturing & Service Operations Management 26(2):447--464

work page 2024

[16] [16]

Manufacturing & Service Operations Management 23(1):139--154

Izady N, Mohamed I (2021) A clustered overflow configuration of inpatient beds in hospitals. Manufacturing & Service Operations Management 23(1):139--154

work page 2021

[17] [17]

Emergency Medicine Journal 39(3):168--173

Jones S, Moulton C, Swift S, Molyneux P, Black S, Mason N, Oakley R, Mann C (2022) Association between delays to patient admission from the emergency department and all-cause 30-day mortality. Emergency Medicine Journal 39(3):168--173

work page 2022

[18] [18]

Transportation Science 58(4):841--859

Khorasanian D, Patrick J, Saur \'e A (2024) Dynamic home care routing and scheduling with uncertain number of visits per referral. Transportation Science 58(4):841--859

work page 2024

[19] [19]

Management Science 61(1):19--38

Kim SH, Chan CW, Olivares M, Escobar G (2015) Icu admission control: An empirical study of capacity allocation and its implication for patient outcomes. Management Science 61(1):19--38

work page 2015

[20] [20]

2020 fourth world conference on smart trends in systems, security and sustainability (WorldS4), 794--797 (IEEE)

Kulkarni V, Kulkarni M, Pant A (2020) Survey of personalization techniques for federated learning. 2020 fourth world conference on smart trends in systems, security and sustainability (WorldS4), 794--797 (IEEE)

work page 2020

[21] [21]

Management Science 70(11):7692--7711

Lim JM, Song H, Yang JJ (2024) The spillover effects of capacity pooling in hospitals. Management Science 70(11):7692--7711

work page 2024

[22] [22]

Queueing systems 67:145--182

Liu Y, Whitt W (2011 a ) Large-time asymptotics for the g t/m t/st+ gi t many-server fluid queue with abandonment. Queueing systems 67:145--182

work page 2011

[23] [23]

Stochastic Systems 1(2):340--410

Liu Y, Whitt W (2011 b ) Nearly periodic behavior in the overloaded g/d/s+ gi queue. Stochastic Systems 1(2):340--410

work page 2011

[24] [24]

The International Journal of Health Planning and Management 38(3):805--828

Manning L, Islam MS (2023) A systematic review to identify the challenges to achieving effective patient flow in public hospitals. The International Journal of Health Planning and Management 38(3):805--828

work page 2023

[25] [25]

Clinical and experimental emergency medicine 6(3):189

McKenna P, Heslin SM, Viccellio P, Mallon WK, Hernandez C, Morley EJ (2019) Emergency department and hospital crowding: causes, consequences, and cures. Clinical and experimental emergency medicine 6(3):189

work page 2019

[26] [26]

Meyn SP, Tweedie RL (2012) Markov chains and stochastic stability (Springer Science & Business Media)

work page 2012

[27] [27]

Processes 9(1):102

Perez HD, Hubbs CD, Li C, Grossmann IE (2021) Algorithmic approaches to inventory management optimization. Processes 9(1):102

work page 2021

[28] [28]

Puterman ML (2014) Markov decision processes: discrete stochastic dynamic programming (John Wiley & Sons)

work page 2014

[29] [29]

BMC medical informatics and decision making 13(1):1--19

Schmidt R, Geisler S, Spreckelsen C (2013) Decision support for hospital bed management using adaptable individual length of stay estimations and shared resources. BMC medical informatics and decision making 13(1):1--19

work page 2013

[30] [30]

Proximal Policy Optimization Algorithms

Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347

work page internal anchor Pith review Pith/arXiv arXiv 2017

[31] [31]

Management Science 62(1):1--28

Shi P, Chou MC, Dai JG, Ding D, Sim J (2016) Models and insights for hospital inpatient operations: Time-dependent ed boarding time. Management Science 62(1):1--28

work page 2016

[32] [32]

Management Science 66(9):3825--3842

Song H, Tucker AL, Graue R, Moravick S, Yang JJ (2020) Capacity pooling in hospitals: The hidden consequences of off-service placement. Management Science 66(9):3825--3842

work page 2020

[33] [33]

Medical journal of Australia 184(5):208--212

Sprivulis PC, Da Silva JA, Jacobs IG, Jelinek GA, Frazer AR (2006) The association between hospital overcrowding and mortality among patients admitted via western australian emergency departments. Medical journal of Australia 184(5):208--212

work page 2006

[34] [34]

Scandinavian journal of trauma, resuscitation and emergency medicine 21:1--7

Stowell A, Claret PG, Sebbane M, Bobbia X, Boyard C, Genre Grandpierre R, Moreau A, de La Coussaye JE (2013) Hospital out-lying through lack of beds and its impact on care and patient outcome. Scandinavian journal of trauma, resuscitation and emergency medicine 21:1--7

work page 2013

[35] [35]

BMJ open 7(5):e015676

Stylianou N, Fackrell R, Vasilakis C (2017) Are medical outliers associated with worse patient outcomes? a retrospective study within a regional nhs hospital using routine data. BMJ open 7(5):e015676

work page 2017

[36] [36]

Interfaces 43(5):435--448

Thomas BG, Bollapragada S, Akbay K, Toledano D, Katlic P, Dulgeroglu O, Yang D (2013) Automated bed assignments in a complex and dynamic hospital environment. Interfaces 43(5):435--448

work page 2013

[37] [37]

European Journal of Operational Research 313(1):373--386

Vanvuchelen N, De Boeck K, Boute RN (2024) Cluster-based lateral transshipments for the zambian health supply chain. European Journal of Operational Research 313(1):373--386

work page 2024

[38] [38]

Health care management science 23:117--141

Zhang H, Best TJ, Chivu A, Meltzer DO (2020) Simulation-based optimization to improve hospital patient assignment to physicians and clinical units. Health care management science 23:117--141

work page 2020