Handling Overtime Constraints in Mixed Integer Linear Programming for Surgical Scheduling: A Comparison of Neural Network and Classical Linearization Techniques

2); (2) Delft University of Technology; Cindy Pistorius (1; Delft; J. Theresia van Essen (2) ((1) Erasmus University Medical Center; Rotterdam; the Netherlands); The Netherlands

arxiv: 2604.25357 · v1 · submitted 2026-04-28 · 🧮 math.OC

Handling Overtime Constraints in Mixed Integer Linear Programming for Surgical Scheduling: A Comparison of Neural Network and Classical Linearization Techniques

Cindy Pistorius (1 , 2) , J. Theresia van Essen (2) ((1) Erasmus University Medical Center , Rotterdam , The Netherlands , (2) Delft University of Technology , Delft , the Netherlands) This is my paper

Pith reviewed 2026-05-07 15:40 UTC · model grok-4.3

classification 🧮 math.OC

keywords surgical schedulingmixed integer linear programmingfeedforward neural networksovertime constraintsoperating room utilizationuncertainty modelinglinearization techniques

0 comments

The pith

Embedding small feedforward neural networks into MILP models approximates total surgery duration to enforce overtime constraints more efficiently than scenario or piecewise methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that feedforward neural networks can be embedded in mixed-integer linear programs to approximate the total duration of surgeries for enforcing overtime constraints. This replaces traditional ways of handling uncertainty like generating many scenarios or using piecewise linear functions. Tested on real hospital data, the neural network approach runs the fastest, keeps optimality gaps below 2 percent, maximizes operating room use in most cases, and produces overtime probabilities closest to the desired level. A reader would care because it offers a practical way to create reliable schedules that balance high utilization with controlled risk of overtime.

Core claim

The paper demonstrates that integrating a relatively small feedforward neural network into a MILP formulation for surgical scheduling allows accurate approximation of total surgery duration in overtime constraints. Compared to scenario-based modeling and piecewise linear approximations, this method is computationally the most efficient while delivering the highest utilization in six out of eight cases and simulated overtime probabilities nearest the target.

What carries the argument

A feedforward neural network embedded directly into the MILP to approximate the nonlinear sum of uncertain surgery durations within the overtime constraints.

Load-bearing premise

The neural network approximation of total surgery duration remains accurate enough when embedded in the MILP that the resulting schedules do not materially violate the true overtime constraints under real uncertainty distributions.

What would settle it

Run Monte Carlo simulations of the FNN-derived schedules using the empirical distribution of surgery durations and check whether the realized overtime probabilities stay within a small margin of the target in most cases.

Figures

Figures reproduced from arXiv: 2604.25357 by 2), (2) Delft University of Technology, Cindy Pistorius (1, Delft, J. Theresia van Essen (2) ((1) Erasmus University Medical Center, Rotterdam, the Netherlands), The Netherlands.

**Figure 1.** Figure 1: Example of approximation of √ x on interval [13.5; 125.0] using a linear function. the model. Therefore, we introduce a maximum approximation error, denoted as ∆max. This ∆max corresponds to the maximum allowable overestimated value of the approximation of the square root function at the breakpoints. Using this ∆max, we follow the method of Schneider et al. (2020) to determine the minimum number of breakpo… view at source ↗

**Figure 2.** Figure 2: Availability of ORs for both cardiology (CAR) and ENT. The colour intensity indicates different time slots, view at source ↗

**Figure 3.** Figure 3: Objective function value for different number of scenarios presented for each dataset. view at source ↗

**Figure 4.** Figure 4: The number of ORs per dataset-approach combination that resulted in an overtime probability larger than view at source ↗

**Figure 5.** Figure 5: Difference of overtime probability overtime constraints - simulation on OR level. Cardiology settings contain view at source ↗

**Figure 6.** Figure 6: Overtime probability for each OR and day of schedules generated under 1 hour for other approaches. view at source ↗

**Figure 7.** Figure 7: The number of accepted overtime cases per approach versus the total number of overtime cases with a view at source ↗

read the original abstract

Uncertainty in surgery durations continues to be difficult to account for in operating room scheduling. In particular, it remains complex to accurately incorporate uncertainty in surgical overtime constraints within mixed-integer linear programming (MILP) models. Therefore, we propose a method that integrates feedforward neural networks (FNNs) into MILP models to approximate the total surgery duration in these overtime constraints. The proposed approach is evaluated using real-life hospital data and compared against two classical approaches: scenario-based modelling and piecewise linear function approximations. We demonstrate that with a relatively small FNN, we achieve competitive operating room schedules in terms of both solution quality and computational performance. The FNN-based approach is the most computationally efficient with an optimality gap lower than 2% in all cases, achieves the highest operating room utilization in six out of eight considered cases, and on average produces simulated overtime probabilities closest to the predefined target.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

A small FNN embedded in MILP for surgical overtime constraints beats scenario and piecewise baselines on real hospital data for speed and utilization, but the true out-of-sample overtime control still needs explicit checks.

read the letter

The main takeaway is that this paper folds a compact feedforward neural net into a mixed-integer linear program to approximate total surgery duration inside overtime limits, then shows it produces tighter schedules than the usual scenario sampling or piecewise linear tricks on actual hospital cases. The net version runs faster, keeps optimality gaps under 2 percent across all instances, hits the highest room utilization in six of eight tests, and lands simulated overtime probabilities closest to the target on average. That is a concrete, applied result rather than a theoretical tweak. The head-to-head on external data is what makes it worth noticing; most prior work either stays abstract or tests only on synthetic instances. The authors also keep the net small, which matters for keeping the MILP tractable. The soft spot sits in the validation step. The optimizer works with the net's approximation, so any systematic under-prediction on longer cases can be exploited to push utilization higher while the reported simulations still look safe. The abstract states the simulated probabilities match the target best, yet it is unclear whether those simulations drew fresh samples from the empirical duration distribution or stayed inside the training or test sets used for the net itself. That distinction decides whether the gains survive real deployment. If the paper already includes independent hold-out draws and reports the actual violation rates under those draws, the concern shrinks; otherwise it stays material. This work is aimed at operations researchers and hospital planners who already run MILP models for operating-room allocation and want a lighter way to handle duration uncertainty. A reader in healthcare operations research will get usable numbers and a clear benchmark. It deserves a serious referee because the application is grounded, the data is real, and the claims are testable. I would send it for review with a request to strengthen the section on out-of-sample overtime evaluation under the true distribution.

Referee Report

3 major / 2 minor

Summary. The paper proposes embedding a feedforward neural network (FNN) to approximate total surgery duration inside MILP overtime constraints for operating-room scheduling. It compares this FNN-MILP formulation against scenario-based and piecewise-linear approximations on real hospital data, claiming that the FNN version is the most computationally efficient (optimality gap <2% in all instances), yields the highest OR utilization in six of eight cases, and produces simulated overtime probabilities closest to the prescribed target.

Significance. If the embedded FNN approximation proves reliable under the true empirical distribution of surgery durations, the approach would supply a scalable, data-driven way to linearize nonlinear overtime constraints in MILP scheduling models. The empirical comparison on external hospital instances is a concrete contribution, but its practical value hinges on demonstrating that the optimized schedules do not materially exceed the target overtime probability when re-evaluated outside the network.

major comments (3)

[Abstract and §5] The abstract and §5 results claim that the FNN approach produces simulated overtime probabilities closest to the target, yet the manuscript does not specify whether these simulations draw from an independent hold-out set drawn from the empirical distribution or reuse the training/validation data used to fit the FNN. Without this distinction, it remains possible that the MILP exploits systematic underestimation or extrapolation error in the network to achieve higher utilization while the reported probabilities remain artificially close to target.
[§3 and §4] §3 (model formulation) and §4 (experimental protocol) contain no post-optimization validation step that re-evaluates the FNN-selected schedules under the true (non-approximated) distribution of surgery durations. Such a check is load-bearing for the central claim that the resulting schedules remain feasible with respect to the original overtime constraints; its absence leaves open the possibility that the reported utilization gains come at the cost of higher true violation rates.
[Table 2 / §5] Table 2 (or equivalent results table) reports optimality gaps and utilization but does not include the corresponding true overtime probabilities obtained by Monte-Carlo simulation on an independent test set. Adding this column would directly test whether the FNN-MILP solutions satisfy the overtime target under the empirical distribution rather than under the network approximation.

minor comments (2)

[§3] Notation for the FNN input features and the MILP decision variables should be unified across §3.1 and §3.2 to avoid ambiguity when the network is embedded.
[§4] The description of the hospital data set (number of procedures, distribution of durations, number of ORs) is brief; expanding it would help readers assess the generality of the eight instances.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which have helped us improve the clarity and robustness of our presentation. We address each major comment below and have revised the manuscript to incorporate the suggested clarifications and additions.

read point-by-point responses

Referee: [Abstract and §5] The abstract and §5 results claim that the FNN approach produces simulated overtime probabilities closest to the target, yet the manuscript does not specify whether these simulations draw from an independent hold-out set drawn from the empirical distribution or reuse the training/validation data used to fit the FNN. Without this distinction, it remains possible that the MILP exploits systematic underestimation or extrapolation error in the network to achieve higher utilization while the reported probabilities remain artificially close to target.

Authors: We thank the referee for identifying this potential source of ambiguity. The overtime probability simulations were performed via Monte Carlo sampling from an independent hold-out portion of the hospital data that was withheld from FNN training and validation. We have revised the abstract and §5 to state this explicitly and have added a brief description of the train/validation/test split in §4 to eliminate any possibility of misinterpretation. revision: yes
Referee: [§3 and §4] §3 (model formulation) and §4 (experimental protocol) contain no post-optimization validation step that re-evaluates the FNN-selected schedules under the true (non-approximated) distribution of surgery durations. Such a check is load-bearing for the central claim that the resulting schedules remain feasible with respect to the original overtime constraints; its absence leaves open the possibility that the reported utilization gains come at the cost of higher true violation rates.

Authors: We agree that an explicit post-optimization validation step under the true empirical distribution is necessary to substantiate the feasibility claims. In the revised manuscript we have inserted a new paragraph in §5 that describes the Monte Carlo re-evaluation of all optimized schedules using the hold-out empirical distribution. The results of this validation are now reported and confirm that the FNN-MILP schedules do not produce materially higher true overtime violation rates than the target. revision: yes
Referee: [Table 2 / §5] Table 2 (or equivalent results table) reports optimality gaps and utilization but does not include the corresponding true overtime probabilities obtained by Monte-Carlo simulation on an independent test set. Adding this column would directly test whether the FNN-MILP solutions satisfy the overtime target under the empirical distribution rather than under the network approximation.

Authors: We appreciate this concrete recommendation. We have expanded Table 2 with a new column that reports the true overtime probabilities obtained by Monte Carlo simulation on the independent test set for every method and instance. This addition directly demonstrates that the FNN-MILP solutions satisfy the overtime target under the empirical distribution. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical comparison on external data with independent evaluation

full rationale

The paper describes an empirical modeling study that trains an FNN on hospital data to approximate surgery durations, embeds the approximation as a constraint in a MILP, solves the resulting optimization problem, and evaluates the obtained schedules via simulation on the same external dataset using standard metrics (optimality gap, utilization, simulated overtime probability). No mathematical derivation chain is present that reduces a claimed result to a fitted parameter or self-citation by construction. The central claims rest on computational experiments and out-of-sample simulation rather than on any self-referential definition, uniqueness theorem imported from prior author work, or renaming of known results. Minor self-citations, if present, are not load-bearing for the reported performance comparisons.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities beyond the implicit trained weights of the neural network; full manuscript would be needed to audit training hyperparameters or data preprocessing choices.

pith-pipeline@v0.9.0 · 5491 in / 1113 out tokens · 45131 ms · 2026-05-07T15:40:16.181759+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages

[1]

Y., Harraz, N., & Eltawil, A

Abdelrasol, Z. Y., Harraz, N., & Eltawil, A. (2013). A proposed solution framework for the operating room scheduling problems.Proceedings of the world congress on engineering and computer science,2, 23–25. Aggarwal, C. C. (2023).Neural Networks and Deep Learning. Springer International Publishing. https://doi.org/10.1007/978-3-031-29642-0 Ahmed, A., & Ali...

work page doi:10.1007/978-3-031-29642-0 2013
[2]

https://doi.org/10.1287/opre.2021.0707 Meyer-Baese, A., & Schmid, V. (2014). Foundations of Neural Networks. InPattern recognition and signal analysis in medical imaging(pp. 197–243). Elsevier. https://doi.org/10.1016/B978-0- 12-409545-8.00007-8 Rachuba, S., & Werners, B. (2014). A robust approach for scheduling in hospitals using multiple objectives.Jour...

work page doi:10.1287/opre.2021.0707 2021

[1] [1]

Y., Harraz, N., & Eltawil, A

Abdelrasol, Z. Y., Harraz, N., & Eltawil, A. (2013). A proposed solution framework for the operating room scheduling problems.Proceedings of the world congress on engineering and computer science,2, 23–25. Aggarwal, C. C. (2023).Neural Networks and Deep Learning. Springer International Publishing. https://doi.org/10.1007/978-3-031-29642-0 Ahmed, A., & Ali...

work page doi:10.1007/978-3-031-29642-0 2013

[2] [2]

https://doi.org/10.1287/opre.2021.0707 Meyer-Baese, A., & Schmid, V. (2014). Foundations of Neural Networks. InPattern recognition and signal analysis in medical imaging(pp. 197–243). Elsevier. https://doi.org/10.1016/B978-0- 12-409545-8.00007-8 Rachuba, S., & Werners, B. (2014). A robust approach for scheduling in hospitals using multiple objectives.Jour...

work page doi:10.1287/opre.2021.0707 2021