pith. sign in

arxiv: 2505.02485 · v1 · submitted 2025-05-05 · 🧮 math.OC · cs.AI

Integrating Column Generation and Large Neighborhood Search for Bus Driver Scheduling with Complex Break Constraints

Pith reviewed 2026-05-22 16:21 UTC · model grok-4.3

classification 🧮 math.OC cs.AI
keywords Bus driver schedulingColumn generationLarge neighborhood searchBranch and priceBreak constraintsCombinatorial optimization
0
0 comments X

The pith

Reusing columns generated during large neighborhood search subproblems improves solutions to bus driver scheduling with complex break rules.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops solution methods for the bus driver scheduling problem of assigning shifts to cover given bus tours while minimizing cost and respecting driver preferences plus strict legal rules on breaks. It compares an exact branch-and-price approach with a large-neighborhood-search framework that calls column generation or branch-and-price inside each repair step. The central innovation stores every column produced while repairing a subproblem and re-uses those columns both inside later subproblems and when building improved global solutions. When this reuse mechanism is active the combined method produces exact solutions on small instances and keeps small gaps to a known lower bound on medium-sized instances, outperforming both plain branch-and-price and large-neighborhood search that treats column generation as a black box. A reader would care because the same reuse idea applies to any scheduling problem whose subproblems naturally generate useful columns that can be kept for future use.

Core claim

By storing columns generated inside the repair phases of large-neighborhood search and feeding them back into subsequent column-generation or branch-and-price calls, the method obtains new state-of-the-art results on the bus driver scheduling problem with complex break constraints: exact solutions for all small instances and low gaps to a known lower bound for mid-sized instances, with branch-and-price strongest on small data and the tightly integrated large-neighborhood search strongest on larger data.

What carries the argument

The column-reuse mechanism that archives every column produced while solving a large-neighborhood-search subproblem and re-inserts those columns into later subproblems or into the master problem to improve the global solution.

If this is right

  • The integrated approach yields exact solutions for every small instance tested.
  • On mid-sized instances the method keeps small gaps relative to a known lower bound.
  • Branch-and-price alone is strongest when the instance is small.
  • The tight coupling of large-neighborhood search with column generation beats the use of column generation as a black-box repair routine.
  • The same column-reuse idea is stated to be applicable to other rule sets and related scheduling problems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If column reuse works across different break-rule sets, the same storage-and-reuse pattern could be tried on crew scheduling or nurse rostering problems that also generate many columns during neighborhood repair.
  • Keeping columns across multiple large-neighborhood-search iterations may reduce the total number of pricing calls needed to reach a given solution quality.
  • Dynamic policies for deciding which stored columns to keep or discard could further cut memory use while preserving most of the quality gain.

Load-bearing premise

That columns produced while repairing one neighborhood remain useful and do not introduce hidden bias when reused on other neighborhoods or on the full problem.

What would settle it

Apply the column-reuse version to a fresh collection of mid-sized instances whose break rules differ from the training set and check whether the optimality gaps stay below the published levels or whether exact solutions appear for all small instances.

Figures

Figures reproduced from arXiv: 2505.02485 by Lucas Kletzander, Nysret Musliu, Pascal Van Hentenryck, Tommaso Mannelli Mazzoli.

Figure 1
Figure 1. Figure 1: Example shift [KM20] 3.3. Work and Break Regulations. Valid shifts for drivers are constrained by work regulations and require frequent breaks. First, different measures of time related to an employee e containing the set of bus legs Ls need to be distinguished, as visualised in [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Rest break positioning [KM20] The rest break may be split into one part of at least 30 min and one or more parts of at least 15 min. The first part has to occur after at most 6 h of working time. Whether rest breaks are paid or unpaid depends on break positions according to [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: RCSPP graph for a Toy Instance 4.2.1. Costs. The costs for nodes and arcs are based on objective (1). The cost ci for each node i corresponding to a bus leg i is defined as ci = 3 · lengthi as each bus leg contributes its length to both Ws (weight 2) and Ts (weight 1). By definition, cs = ct = 0. Several helpful properties of arcs from node i to node j are defined and used for defining the arc costs: lengt… view at source ↗
Figure 4
Figure 4. Figure 4: Extension of RCSPP graph to deal with distance to shift end Equation (21) captures the part of the rest break that can be unpaid depending on the position. However, this requires knowing the end time of the shift end y which violates the general assumption of the algorithm that nodes can be processed in temporary order. Therefore, for each node j, with known end time endj if j is the last bus leg in the sh… view at source ↗
Figure 5
Figure 5. Figure 5: Throttling approaches for 40 16 [PITH_FULL_IMAGE:figures/full_fig_p021_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Memory usage of different B&P variants the integer master problem benefits from the additional computational resources, especially for larger problems. This also shows in the runtimes, where the time for the second phase is often much shorter. While these are significant changes, they come at the cost of excessive additional resources (1 vs. 8) threads, and show most benefit on instances where B&P is outpe… view at source ↗
Figure 7
Figure 7. Figure 7: Comparison of repair operators [PITH_FULL_IMAGE:figures/full_fig_p034_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: GAP for different values of k0 20 30 40 50 60 70 80 90 100 150 200 250 Instance size 0.00 0.25 0.50 0.75 1.00 1.25 1.50 1.75 GAP (%) nmax 5 10 15 20 30 50 (a) GAP for different values of nmax 20 30 40 50 60 70 80 90 100 150 200 250 Instance size 400 600 800 1000 1200 1400 1600 1800 Iterations nmax 5 10 15 20 30 50 (b) Number of iterations for different values of nmax [PITH_FULL_IMAGE:figures/full_fig_p035… view at source ↗
Figure 9
Figure 9. Figure 9: Comparison for different values of nmax [PITH_FULL_IMAGE:figures/full_fig_p035_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: GAP for different subsets of destroyers 20 30 40 50 60 70 80 90 100 150 200 250 Instance size 0 10 20 30 40 50 Success Rate TR EU EW EW+TR EU+TR ALL [PITH_FULL_IMAGE:figures/full_fig_p036_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Success rate for different subsets of destroyers 20 30 40 50 60 70 80 90 100 150 200 250 Instance size 0.0 0.5 1.0 1.5 2.0 GAP (%) LNS (only TR) ALNS ( = 1/3) ALNS ( = 1/2) ALNS ( = 2/3) [PITH_FULL_IMAGE:figures/full_fig_p036_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: GAP for adaptive and static LNS [PITH_FULL_IMAGE:figures/full_fig_p036_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Metrics of different LNS integrations [PITH_FULL_IMAGE:figures/full_fig_p037_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: GAP for different LNS integrations 10 20 30 40 50 60 70 80 90 100 150 200 250 Instance size 2 4 6 8 10 Memory [GB] LNS LNS+R(B) LNS+B(B) LNS+RB(B) LNS+R(F) LNS+B(F) LNS+RB(F) [PITH_FULL_IMAGE:figures/full_fig_p038_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: Memory usage of different LNS integrations [PITH_FULL_IMAGE:figures/full_fig_p038_15.png] view at source ↗
Figure 16
Figure 16. Figure 16: Convergence plots 10 20 30 40 50 60 70 80 90 100 150 200 250 Instance size 0 1 2 3 4 5 6 7 8 GAP (%) BP LNS LNS+RB(F) CMSA [PITH_FULL_IMAGE:figures/full_fig_p039_16.png] view at source ↗
Figure 17
Figure 17. Figure 17: Grouped results [PITH_FULL_IMAGE:figures/full_fig_p039_17.png] view at source ↗
Figure 18
Figure 18. Figure 18: Critical Difference plot [PITH_FULL_IMAGE:figures/full_fig_p040_18.png] view at source ↗
read the original abstract

The Bus Driver Scheduling Problem (BDSP) is a combinatorial optimization problem with the goal to design shifts to cover prearranged bus tours. The objective takes into account the operational cost as well as the satisfaction of drivers. This problem is heavily constrained due to strict legal rules and collective agreements. The objective of this article is to provide state-of-the-art exact and hybrid solution methods that can provide high-quality solutions for instances of different sizes. This work presents a comprehensive study of both an exact method, Branch and Price (B&P), as well as a Large Neighborhood Search (LNS) framework which uses B&P or Column Generation (CG) for the repair phase to solve the BDSP. It further proposes and evaluates a novel deeper integration of B&P and LNS, storing the generated columns from the LNS subproblems and reusing them for other subproblems, or to find better global solutions. The article presents a detailed analysis of several components of the solution methods and their impact, including general improvements for the B&P subproblem, which is a high-dimensional Resource Constrained Shortest Path Problem (RCSPP), and the components of the LNS. The evaluation shows that our approach provides new state-of-the-art results for instances of all sizes, including exact solutions for small instances, and low gaps to a known lower bound for mid-sized instances. Conclusions: We observe that B&P provides the best results for small instances, while the tight integration of LNS and CG can provide high-quality solutions for larger instances, further improving over LNS which just uses CG as a black box. The proposed methods are general and can also be applied to other rule sets and related optimization problems

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript develops an exact Branch-and-Price (B&P) algorithm and a Large Neighborhood Search (LNS) framework for the Bus Driver Scheduling Problem (BDSP) with complex legal and collective-agreement break constraints. The central contribution is a deeper integration in which columns generated inside LNS subproblems solved by Column Generation (CG) or B&P are stored and reused across subproblems or to tighten the global restricted master problem. Computational experiments on instances of varying sizes report that pure B&P yields exact solutions for small instances while the integrated LNS-CG approach produces new state-of-the-art results for medium and larger instances, with low gaps to a known lower bound; the paper also provides component-wise analyses of RCSPP enhancements and LNS neighborhood operators.

Significance. If the performance claims hold under controlled conditions, the work advances hybrid exact-heuristic methods for highly constrained personnel scheduling. The explicit contrast between black-box CG repair and column-reuse integration, together with the detailed component analysis, supplies reusable methodological insights for other rule-based rostering problems. The provision of exact solutions on small instances and reproducible lower-bound comparisons constitutes a concrete strength.

major comments (2)
  1. [§5.3] §5.3 (Ablation and component analysis): the reported SOTA gains for mid-sized instances are not accompanied by an ablation that disables column reuse while keeping all other improvements (RCSPP enhancements, LNS neighborhood design, and runtime limits) fixed. Without this isolation it remains unclear whether the deeper integration, rather than the other listed enhancements or longer effective runtimes, is the load-bearing mechanism for the observed gap reductions.
  2. [Table 4] Table 4 (mid-size instance results): the gaps to the lower bound are presented only for the full integrated method; a direct side-by-side column for the non-reuse LNS+CG baseline is missing, preventing quantification of the incremental benefit of reuse on the same instance set and parameter settings.
minor comments (2)
  1. [Section 3.2] The resource definitions for the RCSPP subproblem (Section 3.2) would benefit from an explicit enumerated list of all resources and their consumption rules rather than a narrative description.
  2. [Figure 2] Figure 2 (LNS neighborhood illustration) uses overlapping labels that reduce readability; a clearer legend or separate sub-figures would help.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. The comments highlight opportunities to strengthen the empirical evidence for the contribution of column reuse. We address each major comment below and will incorporate the requested analyses in the revised manuscript.

read point-by-point responses
  1. Referee: [§5.3] §5.3 (Ablation and component analysis): the reported SOTA gains for mid-sized instances are not accompanied by an ablation that disables column reuse while keeping all other improvements (RCSPP enhancements, LNS neighborhood design, and runtime limits) fixed. Without this isolation it remains unclear whether the deeper integration, rather than the other listed enhancements or longer effective runtimes, is the load-bearing mechanism for the observed gap reductions.

    Authors: We agree that a controlled ablation isolating column reuse is valuable for attributing performance gains. The current §5.3 provides component-wise results for RCSPP enhancements and LNS operators, and the manuscript already contrasts the integrated approach against a black-box CG repair baseline. However, that baseline does not hold every other enhancement fixed. We will add the requested ablation to the revised §5.3, comparing the full LNS-CG with column reuse against an otherwise identical configuration (same RCSPP improvements, same neighborhood operators, same runtime limits) on the mid-sized instances. This will quantify the incremental effect of reuse. revision: yes

  2. Referee: [Table 4] Table 4 (mid-size instance results): the gaps to the lower bound are presented only for the full integrated method; a direct side-by-side column for the non-reuse LNS+CG baseline is missing, preventing quantification of the incremental benefit of reuse on the same instance set and parameter settings.

    Authors: We acknowledge that Table 4 would benefit from an explicit side-by-side presentation. We will revise the table (or add a companion table) to include results for the non-reuse LNS+CG variant under identical instance sets, parameter settings, and runtime limits. This will allow direct quantification of the reuse contribution on the same data. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in algorithmic description

full rationale

The paper describes an algorithmic framework for the Bus Driver Scheduling Problem that combines Branch-and-Price, Column Generation, and Large Neighborhood Search, including a proposed deeper integration via column reuse from LNS subproblems. All performance claims (exact solutions on small instances, low gaps on mid-sized instances, and new state-of-the-art results) are grounded in computational experiments that compare against known lower bounds and prior methods. No mathematical derivation chain exists that reduces a claimed result to a self-defined parameter, a fitted input renamed as a prediction, or a load-bearing self-citation; the method components (RCSPP enhancements, LNS neighborhoods, column storage) are presented as design choices whose impact is assessed empirically rather than by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach relies on standard assumptions of combinatorial optimization (NP-hardness of the RCSPP subproblem, validity of column generation bounds) and introduces no new physical entities or ad-hoc constants beyond typical solver parameters.

axioms (1)
  • domain assumption The Resource Constrained Shortest Path Problem subproblem can be solved to optimality within reasonable time for the generated columns.
    Invoked when describing the B&P subproblem improvements.

pith-pipeline@v0.9.0 · 5855 in / 1180 out tokens · 48374 ms · 2026-05-22T16:21:08.142473+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

  1. [1]

    [CdMdA+17] Ademir Aparecido Constantino, Candido FX de Mendonca, Silvio Alexandre de Araujo, Dario Landa-Silva, Rog´ erio Calvi, and Allainclair Flausino dos Santos

    Accessed: 2025-01-30. [CdMdA+17] Ademir Aparecido Constantino, Candido FX de Mendonca, Silvio Alexandre de Araujo, Dario Landa-Silva, Rog´ erio Calvi, and Allainclair Flausino dos Santos. Solving a large real-world bus driver scheduling problem with a multi-assignment based heuristic algorithm. Journal of Universal Computer Science , 23(5),

  2. [2]

    [CS16] Borja Calvo and Guzm´ an Santaf´ e

    doi:10.1007/978-3-030-12255-34 . [CS16] Borja Calvo and Guzm´ an Santaf´ e. scmamp: Statistical Comparison of Multiple Algorithms in Multiple Problems. The R Journal , 8(1):248–256,

  3. [3]

    32614/RJ-2016-017

    doi:10. 32614/RJ-2016-017. [CSSC13] Shijun Chen, Yindong Shen, Xuan Su, and Heming Chen. A Crew Scheduling with Chinese Meal Break Rules. Journal of Transportation Systems Engineering and Information Technology, 13(2):90–95, April

  4. [4]

    A deci- sion support system prototype for automated bus driver scheduling

    [FMKM24] Nikolaus Frohner, Esther Mugdan, Lucas Kletzander, and Nysret Musliu. A deci- sion support system prototype for automated bus driver scheduling. InProceedings of the 14th International Conference on the Practice and Theory of Automated Timetabling, PATAT 2024, pages 259–262,

  5. [5]

    Computing in Science & Engi- neering9(3), 90–95 (2007) https://doi.org/10.1109/MCSE.2007.55

    doi:10.1109/MCSE.2007.55. [ID05] Stefan Irnich and Guy Desaulniers. Shortest path problems with resource con- straints. In Column generation, pages 33–65. Springer,

  6. [6]

    [KMM22] Lucas Kletzander, Tommaso Mannelli Mazzoli, and Nysret Musliu

    doi:10.1609/aaai.v37i10.26466. [KMM22] Lucas Kletzander, Tommaso Mannelli Mazzoli, and Nysret Musliu. Metaheuristic algorithms for the bus driver scheduling problem with complex break constraints. In Proceedings of the Genetic and Evolutionary Computation Conference . ACM, July

  7. [7]

    [KMVH21] Lucas Kletzander, Nysret Musliu, and Pascal Van Hentenryck

    doi:https://dl.acm.org/doi/10.1145/3512290.3528876. [KMVH21] Lucas Kletzander, Nysret Musliu, and Pascal Van Hentenryck. Branch and price for bus driver scheduling with complex break constraints. Proceedings of the AAAI Conference on Artificial Intelligence, 35(13):11853–11861, May

  8. [8]

    Construct, merge, solve and adapt applied to a bus driver sched- uling problem with complex break constraints

    [RKB+23] Roberto Maria Rosati, Lucas Kletzander, Christian Blum, Nysret Musliu, and Andrea Schaerf. Construct, merge, solve and adapt applied to a bus driver sched- uling problem with complex break constraints. In AIxIA 2022 – Advances in Artificial Intelligence , pages 254–267. Springer International Publishing,

  9. [9]

    [RMVP13] Ana Resp´ ıcio, Margarida Moz, and Margarida Vaz Pato

    doi:10.1007/978-3-031-27181-618 . [RMVP13] Ana Resp´ ıcio, Margarida Moz, and Margarida Vaz Pato. Enhanced genetic algo- rithms for a bi-objective bus driver rostering problem: a computational study. International Transactions in Operational Research, 20(4):443–470,

  10. [10]

    [Sha98] Paul Shaw

    doi:10.1287/trsc.1050.0135. [Sha98] Paul Shaw. Using constraint programming and local search methods to solve vehicle routing problems. In Principles and Practice of Constraint Program- ming — CP98 , pages 417–431. Springer Berlin Heidelberg,

  11. [11]

    [TK13] Attila T´ oth and Mikl´ os Kr´ esz

    URL: https://www.sciencedirect.com/science/article/ pii/0191260788900222, doi:10.1016/0191-2607(88)90022-2. [TK13] Attila T´ oth and Mikl´ os Kr´ esz. An efficient solution approach for real-world driver scheduling problems in urban bus transportation. Central European Journal of Operations Research, 21(S1):75–94, June

  12. [12]

    doi:10.21105/joss.03021

    doi:10.21105/joss.03021. [WKO19] WKO. Kollektivvertrag autobusbetriebe, arbeiter/innen / angestellte. https: //www.wko.at/service/kollektivvertrag/kv-private-autobusbetriebe-2019. html,

  13. [13]

    [WM14] Magdalena Widl and Nysret Musliu

    Accessed: 2024-04-10. [WM14] Magdalena Widl and Nysret Musliu. The break scheduling problem: complexity results and practical algorithms. Memetic Computing, 6(2):97–112, June

  14. [14]

    cherry-picked

    All articles: (1) All claims investigated in this work are clearly stated. [yes] (2) Clear explanations are given how the work reported substantiates the claims. [yes] (3) Limitations or technical assumptions are stated clearly and explicitly. [yes] (4) Conceptual outlines and/or pseudo-code descriptions of the AI methods introduced in this work are provi...

  15. [15]

    GAP for adaptive and static LNS INTEGRATING CG AND LNS FOR THE BDSP 37 10 20 30 40 50 60 70 80 90 100 150 200 250 Instance size 0 2000 4000 6000 8000 10000 12000 14000#Iterations LNS LNS+R(B) LNS+B(B) LNS+RB(B) LNS+R(F) LNS+B(F) LNS+RB(F) (a) Number of iterations 10 20 30 40 50 60 70 80 90 100 150 200 250 Instance size 0 2 4 6 8 10 12Average time used [s]...

  16. [16]

    Memory usage of different LNS integrations INTEGRATING CG AND LNS FOR THE BDSP 39 0 500 1000 1500 2000 2500 3000 3500 Time [s] 51000 52000 53000 54000 55000Objective Value LNS LNS+R(B) LNS+B(B) LNS+RB(B) LNS+R(F) LNS+B(F) LNS+RB(F) BP (a) Realistic 30 12 0 500 1000 1500 2000 2500 3000 3500 Time [s] 100000 102000 104000 106000 108000 110000Objective Value ...