Inpatient Overflow Management with Proximal Policy Optimization

2024 · math.OC · arXiv 2410.13767

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

Problem Definition: Managing inpatient flow in large hospital systems is challenging due to the complexity of assigning randomly arriving patients -- either waiting for primary units or being overflowed to alternative units. Current practices rely on ad-hoc rules, while prior analytical approaches struggle with the intractably large state and action spaces inherent in patient-unit matching. Scalable decision support is needed to optimize overflow management while accounting for time-periodic fluctuations in patient flow.

Methodology/Results: We develop a scalable decision-making framework using Proximal Policy Optimization (PPO) to optimize overflow decisions in a time-periodic, long-run average cost setting. To address the combinatorial complexity, we introduce atomic actions, which decompose multi-patient routing into sequential assignments. We further enhance computational efficiency through a partially-shared policy network designed to balance parameter sharing with time-specific policy adaptations, and a queueing-informed value function approximation to improve policy evaluation. Our method significantly reduces the need for extensive simulation data, a common limitation in reinforcement learning applications. Case studies on hospital systems with up to twenty patient classes and twenty wards demonstrate that our approach matches or outperforms existing benchmarks, including approximate dynamic programming, which is computationally infeasible beyond five wards.

Managerial Implications: Our framework offers a scalable, efficient, and explainable solution for managing patient flow in complex hospital systems. More broadly, our results highlight that domain-aware adaptation is more critical to improving algorithm performance than fine-tuning neural network parameters when applying general-purpose algorithms to specific applications.
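The abstract's "atomic actions" idea, decomposing multi-patient routing into sequential assignments, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `atomic_routing`, `toy_policy`, and `WAIT`, the state encoding (a list of waiting patient classes plus free beds per ward), and the hand-written weights standing in for a trained PPO policy are all assumptions made for illustration. The point it shows is that a joint assignment of n patients to m wards has up to (m+1)^n candidates, whereas the sequential form makes n choices, each over at most m+1 options.

```python
import random
from typing import Callable, Dict, List, Tuple

WAIT = -1  # hypothetical marker: patient keeps waiting for the primary ward

Policy = Callable[[int, Dict[int, int]], Dict[int, float]]


def toy_policy(patient_class: int, beds: Dict[int, int]) -> Dict[int, float]:
    """Stand-in for a learned policy (assumption): returns unnormalized weights.

    Prefers the primary ward (ward id == patient class), then waiting,
    then overflow to any other ward with a free bed.
    """
    weights = {WAIT: 1.0}
    for ward, free in beds.items():
        if free > 0:
            weights[ward] = 5.0 if ward == patient_class else 0.5
    return weights


def atomic_routing(
    waiting: List[int],
    free_beds: Dict[int, int],
    policy: Policy,
    rng: random.Random,
) -> List[Tuple[int, int]]:
    """Route patients one at a time instead of choosing a joint assignment."""
    beds = dict(free_beds)  # work on a copy; the caller's state is untouched
    decisions: List[Tuple[int, int]] = []
    for patient_class in waiting:
        # Feasible atomic actions: wait, or any ward that still has a free bed.
        feasible = [WAIT] + [w for w, free in beds.items() if free > 0]
        weights = policy(patient_class, beds)
        probs = [max(weights.get(a, 0.0), 0.0) for a in feasible]
        if sum(probs) == 0.0:
            probs = [1.0] * len(feasible)  # fall back to uniform sampling
        action = rng.choices(feasible, weights=probs, k=1)[0]
        if action != WAIT:
            beds[action] -= 1  # later patients see the updated bed counts
        decisions.append((patient_class, action))
    return decisions


if __name__ == "__main__":
    rng = random.Random(0)
    print(atomic_routing([0, 0, 1, 1, 1], {0: 1, 1: 2}, toy_policy, rng))
```

Because each atomic action updates the bed counts before the next patient is considered, capacity constraints are enforced by construction rather than by filtering an enumerated joint action set.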

representative citing papers

Bellman-Taylor Score Decoding for Markov Decision Processes with State-Dependent Feasible Action Sets

cs.AI · 2026-06-09 · unverdicted · novelty 5.0

Bellman-Taylor score decoding framework for MDPs with implicit state-dependent action constraints, enabling standard DRL optimization with a decomposed optimality gap guarantee.

