Lawler-Moore Speedups via Additive Combinatorics

Danny Hermelin; Dvir Shabtay; Karl Bringmann; Tomohiro Koana

arxiv: 2604.13642 · v1 · submitted 2026-04-15 · 💻 cs.DS

Lawler-Moore Speedups via Additive Combinatorics

Karl Bringmann , Danny Hermelin , Tomohiro Koana , Dvir Shabtay This is my paper

Pith reviewed 2026-05-10 12:16 UTC · model grok-4.3

classification 💻 cs.DS

keywords parallel machine schedulingdynamic programmingLawler-Mooreadditive combinatoricsstate pruningscheduling objectivesload balancing

0 comments

The pith

A swapping argument bounds load differences to 4p_max squared in optimal schedules, replacing P with p_max squared in Lawler-Moore dynamic programming.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper improves the classical Lawler-Moore dynamic program for scheduling jobs on m identical parallel machines under three regular objectives. It introduces a swapping operation supported by an additive combinatorics lemma showing that, for any schedule that is locally optimal under the swap, there exists an equivalent optimal schedule where every prefix of jobs has machine loads differing by at most 4p_max squared. This bound lets the algorithm discard many redundant states that differ only in how total processing time is distributed across machines. The resulting algorithms run in O(p_max to the power of 2m minus 2 times n) for minimizing weighted completion time and maximum lateness, and a slightly higher but still improved bound for the weighted number of late jobs.

Core claim

We prove that, whenever this swap does not increase the objective value, there exists an optimal schedule in which, for every prefix of jobs, the load difference between any two machines is at most 4p_max squared. This lets us prune redundant states throughout the dynamic program, replacing the dependence on P by a dependence on p_max squared. We show that the swap is non-increasing for all three objectives above.

What carries the argument

The swapping argument combined with an additive-combinatorial lemma that guarantees the existence of an optimal schedule with bounded prefix load differences.

If this is right

Pm||sum w_j C_j and Pm||L_max admit O(p_max^{2m-2} n) algorithms.
Pm||sum w_j U_j admits O(p_max^{2m-2} P n) time which is at most O(p_max^{2m-1} n^2).
When p_max is polylogarithmic in n the first two problems obtain near-linear time in n.
The new bounds are strictly faster than the original Lawler-Moore times whenever p_max is o(sqrt(P)).

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same bounded-difference property may allow state pruning in other dynamic programs that track machine loads for regular scheduling objectives.
If similar swapping arguments apply to unrelated-machine or precedence-constrained settings, dependence on total processing time could be reduced there as well.
When all processing times are bounded by a small constant the new algorithms become polynomial in n with degree independent of m for the first two objectives.

Load-bearing premise

The swap operation never increases the objective value and the combinatorial lemma accurately limits how much machine loads can differ in some optimal schedule.

What would settle it

An instance and objective where every optimal schedule has at least one job prefix with two machines whose loads differ by more than 4p_max squared would show that the pruning step discards necessary states.

Figures

Figures reproduced from arXiv: 2604.13642 by Danny Hermelin, Dvir Shabtay, Karl Bringmann, Tomohiro Koana.

**Figure 1.** Figure 1: Gantt chart depicting machine loads. Light gray blocks represent the jobs {1, . . . , j} already considered in the dynamic programming state, while dark gray blocks represent future jobs in the priority sequence. The state space is pruned by ensuring the load difference between any two machines at step j remains within O(p 2 max). We use this additive combinatorics lemma to prove that there exists an optim… view at source ↗

**Figure 2.** Figure 2: A depiction of the intermediate schedule σ ′ obtained from a swap performed between machines Mh and Mi. The lighter gray depicts jobs in Jj = {1, . . . , j}. The darker gray depicts jobs in J \ Jj . In the final schedule σ ′′, the jobs of JH′ and JI ′′ can move to the left and right on Mi, while the jobs of JI ′ can move to the right on Mh. Suppose these three conditions hold for σ. Let ti := P(Ji,j (σ)) d… view at source ↗

read the original abstract

The Lawler-Moore dynamic programming framework is a classical tool in scheduling on parallel machines. It applies when the objective is regular, i.e. monotone in job completion times, and each machine follows a fixed priority order such as Smith's Rule or Jackson's Rule. For the basic objectives $Pm||\sum w_jC_j$, $Pm||L_{\max}$, and $Pm||\sum w_jU_j$, it gives running times $O(P^{m-1}n)$, $O(P^{m-1}n)$, and $O(P^mn)$, respectively, where $P$ is the total processing time. Recent SETH-based lower bounds indicate that the dependence on $P$ is essentially optimal, but they do not rule out improved dependence on the maximum processing time $p_{\max}$. We give the first major speedup of the Lawler-Moore recurrence. Our main ingredients are a new state-pruning method and a swapping argument based on an additive-combinatorial lemma. We prove that, whenever this swap does not increase the objective value, there exists an optimal schedule in which, for every prefix of jobs, the load difference between any two machines is at most $4p_{\max}^2$. This lets us prune redundant states throughout the dynamic program, replacing the dependence on $P$ by a dependence on $p_{\max}^2$. We show that the swap is non-increasing for all three objectives above. Hence $Pm||\sum w_jC_j$ and $Pm||L_{\max}$ admit algorithms with running time $O(p_{\max}^{2m-2}n)$, while $Pm||\sum w_jU_j$ can be solved in time $O(p_{\max}^{2m-2}Pn)\le O(p_{\max}^{2m-1}n^2)$. These bounds strictly improve the original Lawler-Moore runtimes whenever $p_{\max}=o(\sqrt{P})$. In particular, for $Pm||\sum w_jC_j$ and $Pm||L_{\max}$, we obtain the first near-linear-time algorithms when processing times are polylogarithmic in $n$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper delivers a genuine improvement to the Lawler-Moore DP by pruning states to depend on p_max squared instead of total P, using a swapping argument and additive combinatorics lemma.

read the letter

The core advance is a state-pruning rule that keeps only configurations where any two machines differ in load by at most 4 p_max squared on every prefix of jobs. The authors show that an optimal schedule always exists with this property whenever a certain swap does not raise the objective, and they verify the swap works for sum w_j C_j, L_max, and sum w_j U_j. This replaces the P^{m-1} or P^m factor with p_max to a similar power, which is a clear win whenever p_max is o(sqrt(P)) and yields near-linear time for polylog p_max on the first two problems. The proof strategy is direct: the combinatorial lemma bounds the differences, the hereditary prefix property lets them prune throughout the DP, and they check the non-increasing condition separately for each objective. The manuscript appears free of circularity or hidden fitting; the new lemma is stated and used explicitly. The main limitation is that the improvement is regime-dependent and the constant 4 is probably not tight, though neither issue undercuts the result. For sum w_j U_j the bound still carries a linear P factor, but it is still asymptotically better than the original. The work is aimed at people who care about exact algorithms for scheduling and who already know the Lawler-Moore framework. It deserves a serious referee because the claimed speedup is concrete, the proof ingredients are new, and the stress-test confirms the swapping argument and pruning step hold up in the full text. I would send it out.

Referee Report

0 major / 3 minor

Summary. The paper improves the classical Lawler-Moore dynamic programming framework for three regular scheduling objectives on m parallel machines (Pm||∑w_j C_j, Pm||L_max, and Pm||∑w_j U_j). It introduces a state-pruning technique derived from a swapping argument and a new additive-combinatorial lemma, proving that there exists an optimal schedule in which, for every prefix of jobs, the load difference between any two machines is at most 4p_max². This replaces the original O(P^{m-1}n) or O(P^m n) dependence on total processing time P with a dependence on p_max², yielding running times O(p_max^{2m-2}n) for the first two problems and O(p_max^{2m-2} P n) ≤ O(p_max^{2m-1} n²) for the third. The swap is shown to be non-increasing for each objective.

Significance. If the combinatorial lemma and non-increasing property hold, the result is a substantial advance: the first major asymptotic improvement to Lawler-Moore DP in decades, with near-linear time when p_max is polylogarithmic in n. The hereditary prefix property and direct verification for each objective avoid circularity or fitted parameters, and the approach may generalize to other DP-based scheduling problems.

minor comments (3)

Abstract: the inequality O(p_max^{2m-2} P n) ≤ O(p_max^{2m-1} n²) holds when P = O(p_max n), which is always true but should be stated explicitly for clarity.
The manuscript should include a self-contained statement of the additive-combinatorial lemma (including the precise constant 4 and any assumptions on processing times) in the main body, even if the proof is deferred to an appendix.
It would strengthen the presentation to include a brief comparison table of the new versus original Lawler-Moore exponents for small fixed m (e.g., m=2,3).

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and accurate summary of our paper, as well as the recommendation for minor revision. No specific major comments appear in the report, so we have no points to address individually at this stage. We remain available to incorporate any additional minor suggestions from the editor or referee.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper's central speedup derives from a new additive-combinatorial lemma and a swapping argument proven directly for the three objectives (sum w_j C_j, L_max, sum w_j U_j). The key existence claim—that an optimal schedule exists with prefix load differences at most 4 p_max^2—is established by verifying that the swap is non-increasing for each objective, then using the hereditary prefix property to prune DP states. No step reduces by construction to fitted parameters, self-citations, or tautological definitions; the reduction from dependence on P to p_max^2 follows from the independent combinatorial bound.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the validity of the additive-combinatorics lemma and the proof that the swap is non-increasing for the three objectives. No free parameters are fitted to data, as this is a worst-case algorithmic analysis.

axioms (1)

domain assumption Additive combinatorics lemma that enables bounding machine load differences by 4p_max^2
Invoked to prove existence of an optimal schedule with the stated load property for any prefix.

pith-pipeline@v0.9.0 · 5712 in / 1407 out tokens · 60791 ms · 2026-05-10T12:16:23.346350+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages

[1]

Scheduling lower bounds via AND subset sum

Amir Abboud, Karl Bringmann, Danny Hermelin, and Dvir Shabtay. Scheduling lower bounds via AND subset sum. Journal of Computer and System Sciences , 127:29--40, 2022

work page 2022
[2]

Tight (S)ETH -based lower bounds for pseudopolynomial algorithms for bin packing and multi-machine scheduling

Karl Bringmann, Anita D \"u rr, and Karol W e grzycki. Tight (S)ETH -based lower bounds for pseudopolynomial algorithms for bin packing and multi-machine scheduling. CoRR , abs/2603.12999, 2026

work page arXiv 2026
[3]

Faster minimization of tardy processing time on a single machine

Karl Bringmann, Nick Fischer, Danny Hermelin, Dvir Shabtay, and Philip Wellnitz. Faster minimization of tardy processing time on a single machine. Algorithmica , 84(5):1341--1356, 2022

work page 2022
[4]

Coffman Jr., and Ravi Sethi

John Bruno, Edward G. Coffman Jr., and Ravi Sethi. Scheduling independent tasks to reduce mean finishing time. Communications of the ACM , 17:382--387, 1974

work page 1974
[5]

Knapsack with small items in near-quadratic time

Karl Bringmann. Knapsack with small items in near-quadratic time. In Proc. of the 56th Annual ACM Symposium on Theory Of Computing ( STOC ) , pages 259--270, 2024

work page 2024
[6]

Conway, William L

Richard W. Conway, William L. Maxwell, and Leslie W. Miller. Theory of Scheduling . Addison-Wesley, 1967

work page 1967
[7]

Minimizing tardy processing time on a single machine in near-linear time

Nick Fischer and Leo Wennmann. Minimizing tardy processing time on a single machine in near-linear time. TheoretiCS , 4, 2025

work page 2025
[8]

Garey and David S

Michael R. Garey and David S. Johnson. S trong N P -completeness results: motivation, examples, and implications. Journal of the ACM , 25(3):499--508, 1978

work page 1978
[9]

Garey and David S

Michael R. Garey and David S. Johnson. Computers and Intractability; A Guide to the Theory of NP-Completeness . W. H. Freeman & Co., 1990

work page 1990
[10]

Ronald L. Graham. Bounds on multiprocessing timing anomalies. SIAM Journal on Applied Mathematics , 17(2):416--429, 1969

work page 1969
[11]

Minimizing the weighted number of tardy jobs is W [1]-hard

Klaus Heeger and Danny Hermelin. Minimizing the weighted number of tardy jobs is W [1]-hard. In Proc. of the 32nd Annual European Symposium on Algorithms ( ESA ) , pages 68:1--68:14, 2024

work page 2024
[12]

Pinedo, and Dvir Shabtay

Danny Hermelin, Shlomo Karhi, Michael L. Pinedo, and Dvir Shabtay. New algorithms for minimizing the weighted number of tardy jobs on a single machine. Ann. Oper. Res. , 298(1):271--287, 2021

work page 2021
[13]

Minimizing the weighted number of tardy jobs via (max,+)-convolutions

Danny Hermelin, Hendrik Molter, and Dvir Shabtay. Minimizing the weighted number of tardy jobs via (max,+)-convolutions. INFORMS Journal of Computing , 36(3):836--848, 2024

work page 2024
[14]

Fast makespan minimization via short ILP s

Danny Hermelin and Dvir Shabtay. Fast makespan minimization via short ILP s. CoRR , abs/2502.13631, 2026

work page arXiv 2026
[15]

James R. Jackson. Scheduling a production line to minimize maximum tardiness. Management Science Research Project, University of California, Los Angeles, CA. , 1955

work page 1955
[16]

0-1 knapsack in nearly quadratic time

Ce Jin. 0-1 knapsack in nearly quadratic time. In Proc. of the 56th Annual ACM Symposium on Theory Of Computing (STOC) , pages 271--282, 2024

work page 2024
[17]

Scheduling meets n -fold integer programming

Dusan Knop and Martin Kouteck \' y . Scheduling meets n -fold integer programming. Journal of Scheduling , 21(5):493--503, 2018

work page 2018
[18]

On minimizing tardy processing time, max-min skewed convolution, and triangular structured ILP s

Kim - Manuel Klein, Adam Polak, and Lars Rohwedder. On minimizing tardy processing time, max-min skewed convolution, and triangular structured ILP s. In Proceedings of the 2023 ACM-SIAM Symposium on Discrete Algorithms, SODA 2023, Florence, Italy, January 22-25, 2023 , pages 2947--2960. SIAM , 2023

work page 2023
[19]

B. J. Lageweg, J. K. Lenstra, and A. H. G. Rinnooy Kan. Minimizing maximum lateness on one machine: Computational experience and some applications. Management Science , 23(10):1151--1163, 1976

work page 1976
[20]

Lawler and James M

Eugene L. Lawler and James M. Moore. A functional equation and its application to resource allocation and sequencing problems. Management Science , 16(1):77--84, 1969

work page 1969
[21]

Lenstra, A.H.G

J.K. Lenstra, A.H.G. Rinnooy Kan, and P. Brucker. Complexity of machine scheduling problems. Annals of Discrete Mathematics , 1:343--362, 1977

work page 1977
[22]

James M. Moore. An n job, one machine sequencing algorithm for minimizing the number of late jobs. Management Science , 15:102--109, 1968

work page 1968
[23]

Knapsack and subset sum with small items

Adam Polak, Lars Rohwedder, and Karol W e grzycki. Knapsack and subset sum with small items. In Proc. of the 48th International Colloquium on Automata, Languages, and Programming (ICALP) , pages 106:1--106:19, 2021

work page 2021
[24]

Wayne E. Smith. Various optimizers for single-stage production. Naval Research Logistics Quarterly , 3:59--66, 1956

work page 1956
[25]

Quick minimization of tardy processing time on a single machine

Baruch Schieber and Pranav Sitaraman. Quick minimization of tardy processing time on a single machine. In Proc. of the 18th International Symposium on Algorithms and Data Structures ( WADS ) , pages 637--643, 2023

work page 2023

[1] [1]

Scheduling lower bounds via AND subset sum

Amir Abboud, Karl Bringmann, Danny Hermelin, and Dvir Shabtay. Scheduling lower bounds via AND subset sum. Journal of Computer and System Sciences , 127:29--40, 2022

work page 2022

[2] [2]

Tight (S)ETH -based lower bounds for pseudopolynomial algorithms for bin packing and multi-machine scheduling

Karl Bringmann, Anita D \"u rr, and Karol W e grzycki. Tight (S)ETH -based lower bounds for pseudopolynomial algorithms for bin packing and multi-machine scheduling. CoRR , abs/2603.12999, 2026

work page arXiv 2026

[3] [3]

Faster minimization of tardy processing time on a single machine

Karl Bringmann, Nick Fischer, Danny Hermelin, Dvir Shabtay, and Philip Wellnitz. Faster minimization of tardy processing time on a single machine. Algorithmica , 84(5):1341--1356, 2022

work page 2022

[4] [4]

Coffman Jr., and Ravi Sethi

John Bruno, Edward G. Coffman Jr., and Ravi Sethi. Scheduling independent tasks to reduce mean finishing time. Communications of the ACM , 17:382--387, 1974

work page 1974

[5] [5]

Knapsack with small items in near-quadratic time

Karl Bringmann. Knapsack with small items in near-quadratic time. In Proc. of the 56th Annual ACM Symposium on Theory Of Computing ( STOC ) , pages 259--270, 2024

work page 2024

[6] [6]

Conway, William L

Richard W. Conway, William L. Maxwell, and Leslie W. Miller. Theory of Scheduling . Addison-Wesley, 1967

work page 1967

[7] [7]

Minimizing tardy processing time on a single machine in near-linear time

Nick Fischer and Leo Wennmann. Minimizing tardy processing time on a single machine in near-linear time. TheoretiCS , 4, 2025

work page 2025

[8] [8]

Garey and David S

Michael R. Garey and David S. Johnson. S trong N P -completeness results: motivation, examples, and implications. Journal of the ACM , 25(3):499--508, 1978

work page 1978

[9] [9]

Garey and David S

Michael R. Garey and David S. Johnson. Computers and Intractability; A Guide to the Theory of NP-Completeness . W. H. Freeman & Co., 1990

work page 1990

[10] [10]

Ronald L. Graham. Bounds on multiprocessing timing anomalies. SIAM Journal on Applied Mathematics , 17(2):416--429, 1969

work page 1969

[11] [11]

Minimizing the weighted number of tardy jobs is W [1]-hard

Klaus Heeger and Danny Hermelin. Minimizing the weighted number of tardy jobs is W [1]-hard. In Proc. of the 32nd Annual European Symposium on Algorithms ( ESA ) , pages 68:1--68:14, 2024

work page 2024

[12] [12]

Pinedo, and Dvir Shabtay

Danny Hermelin, Shlomo Karhi, Michael L. Pinedo, and Dvir Shabtay. New algorithms for minimizing the weighted number of tardy jobs on a single machine. Ann. Oper. Res. , 298(1):271--287, 2021

work page 2021

[13] [13]

Minimizing the weighted number of tardy jobs via (max,+)-convolutions

Danny Hermelin, Hendrik Molter, and Dvir Shabtay. Minimizing the weighted number of tardy jobs via (max,+)-convolutions. INFORMS Journal of Computing , 36(3):836--848, 2024

work page 2024

[14] [14]

Fast makespan minimization via short ILP s

Danny Hermelin and Dvir Shabtay. Fast makespan minimization via short ILP s. CoRR , abs/2502.13631, 2026

work page arXiv 2026

[15] [15]

James R. Jackson. Scheduling a production line to minimize maximum tardiness. Management Science Research Project, University of California, Los Angeles, CA. , 1955

work page 1955

[16] [16]

0-1 knapsack in nearly quadratic time

Ce Jin. 0-1 knapsack in nearly quadratic time. In Proc. of the 56th Annual ACM Symposium on Theory Of Computing (STOC) , pages 271--282, 2024

work page 2024

[17] [17]

Scheduling meets n -fold integer programming

Dusan Knop and Martin Kouteck \' y . Scheduling meets n -fold integer programming. Journal of Scheduling , 21(5):493--503, 2018

work page 2018

[18] [18]

On minimizing tardy processing time, max-min skewed convolution, and triangular structured ILP s

Kim - Manuel Klein, Adam Polak, and Lars Rohwedder. On minimizing tardy processing time, max-min skewed convolution, and triangular structured ILP s. In Proceedings of the 2023 ACM-SIAM Symposium on Discrete Algorithms, SODA 2023, Florence, Italy, January 22-25, 2023 , pages 2947--2960. SIAM , 2023

work page 2023

[19] [19]

B. J. Lageweg, J. K. Lenstra, and A. H. G. Rinnooy Kan. Minimizing maximum lateness on one machine: Computational experience and some applications. Management Science , 23(10):1151--1163, 1976

work page 1976

[20] [20]

Lawler and James M

Eugene L. Lawler and James M. Moore. A functional equation and its application to resource allocation and sequencing problems. Management Science , 16(1):77--84, 1969

work page 1969

[21] [21]

Lenstra, A.H.G

J.K. Lenstra, A.H.G. Rinnooy Kan, and P. Brucker. Complexity of machine scheduling problems. Annals of Discrete Mathematics , 1:343--362, 1977

work page 1977

[22] [22]

James M. Moore. An n job, one machine sequencing algorithm for minimizing the number of late jobs. Management Science , 15:102--109, 1968

work page 1968

[23] [23]

Knapsack and subset sum with small items

Adam Polak, Lars Rohwedder, and Karol W e grzycki. Knapsack and subset sum with small items. In Proc. of the 48th International Colloquium on Automata, Languages, and Programming (ICALP) , pages 106:1--106:19, 2021

work page 2021

[24] [24]

Wayne E. Smith. Various optimizers for single-stage production. Naval Research Logistics Quarterly , 3:59--66, 1956

work page 1956

[25] [25]

Quick minimization of tardy processing time on a single machine

Baruch Schieber and Pranav Sitaraman. Quick minimization of tardy processing time on a single machine. In Proc. of the 18th International Symposium on Algorithms and Data Structures ( WADS ) , pages 637--643, 2023

work page 2023