GPU accelerated variant of Schroeppel-Shamir's algorithm for solving the market split problem
Pith reviewed 2026-05-19 06:08 UTC · model grok-4.3
The pith
A hybrid CPU-GPU implementation of a Schroeppel-Shamir variant solves market split feasibility instances with up to 10 constraints and 90 variables.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By deriving a feasibility solver for the market split problem from Schroeppel-Shamir's meet-in-the-middle technique for one-dimensional subset sum and accelerating the multi-constraint verification step with GPUs, the algorithm can solve instances of size (9,80) in less than fifteen minutes and (10,90) in up to one day.
What carries the argument
Exhaustive enumeration of one-dimensional solutions combined with GPU batch evaluation of candidate solutions across multiple constraints.
If this is right
- Standard linear programming branch-and-cut solvers perform poorly on the market split problem compared to this approach.
- Instances with nine constraints and eighty variables become solvable in under fifteen minutes.
- Instances with ten constraints and ninety variables become solvable in up to one day.
- The method scales to larger feasibility versions than previously practical.
Where Pith is reading between the lines
- This technique of splitting the problem into one-dimensional enumerations might apply to other multi-dimensional subset sum variants.
- Further speedups could come from optimizing the GPU kernel or using multiple GPUs for even larger instances.
- Extending the approach to the optimization version of the market split problem could be a natural next step.
Load-bearing premise
That the number of one-dimensional solutions stays small enough and the GPU can evaluate them in batches without running out of memory or time for sizes up to ten constraints and ninety variables.
What would settle it
An instance with ten constraints and ninety variables that either takes more than one day or causes the GPU to run out of memory when using this algorithm.
read the original abstract
The market split problem (MSP), introduced by Cornuejols and Dawande (1998), is a challenging binary optimization problem that performs poorly on state-of-the-art linear programming-based branch-and-cut solvers. We present a novel algorithm for solving the feasibility version of this problem, derived from Schroeppel-Shamir's algorithm for the one-dimensional subset sum problem. Our approach is based on exhaustively enumerating one-dimensional solutions of MSP and utilizing GPUs to evaluate candidate solutions across the entire problem. The resulting hybrid CPU-GPU implementation efficiently solves instances with up to 10 constraints and 90 variables. We demonstrate the algorithm's performance on benchmark problems, solving instances of size (9, 80) in less than fifteen minutes and (10, 90) in up to one day.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a GPU-accelerated variant of Schroeppel-Shamir's meet-in-the-middle algorithm for the feasibility version of the market split problem. It claims that the hybrid CPU-GPU implementation solves benchmark instances with up to 10 constraints and 90 variables, specifically reporting that (9,80) instances are solved in less than fifteen minutes and (10,90) instances in up to one day.
Significance. If the scaling claims hold with verifiable candidate-set sizes and no hidden exponential costs in the multi-constraint checks, the work would offer a practical advance for a class of binary optimization problems that perform poorly under standard LP-based solvers, by combining classical enumeration with GPU batch evaluation.
major comments (2)
- [Abstract] Abstract: The performance numbers for (9,80) and (10,90) instances are stated without any reported bound, measurement, or analysis of the size of the one-dimensional candidate sets produced by the Schroeppel-Shamir enumeration; this quantity is load-bearing for assessing whether GPU batch evaluation remains feasible at n=90 without memory blow-up.
- [§3 (Algorithm)] §3 (Algorithm): The description of extending the one-dimensional partial solutions to the remaining constraints via GPU evaluation provides no details on the data structures (hashing, sorting, or batching) used for consistency checks or any empirical count of surviving candidates, leaving open the possibility that exponential cost is reintroduced when the number of constraints reaches 10.
minor comments (2)
- [§2] The notation distinguishing the one-dimensional subset-sum subproblems from the full multi-constraint MSP could be introduced more explicitly in the background section to improve readability.
- [§4] Runtime tables or figures should report the number of runs, hardware specifications (GPU model, memory), and any observed variance to support the claimed wall-clock times.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive feedback on our manuscript. We agree that additional details on candidate-set sizes and GPU data structures will improve clarity and verifiability. We address the major comments below and will revise the manuscript accordingly.
read point-by-point responses
-
Referee: [Abstract] Abstract: The performance numbers for (9,80) and (10,90) instances are stated without any reported bound, measurement, or analysis of the size of the one-dimensional candidate sets produced by the Schroeppel-Shamir enumeration; this quantity is load-bearing for assessing whether GPU batch evaluation remains feasible at n=90 without memory blow-up.
Authors: We agree that explicit bounds and measurements of the one-dimensional candidate-set sizes are necessary to substantiate the reported runtimes and GPU feasibility. In the revised manuscript we will add a new paragraph (or subsection) reporting both theoretical bounds (O(2^{n/2}) per half) and empirical sizes observed for the (9,80) and (10,90) benchmark instances, together with peak GPU memory usage during batch evaluation. This will confirm that the candidate sets remain within practical GPU memory limits at n=90. revision: yes
-
Referee: [§3 (Algorithm)] §3 (Algorithm): The description of extending the one-dimensional partial solutions to the remaining constraints via GPU evaluation provides no details on the data structures (hashing, sorting, or batching) used for consistency checks or any empirical count of surviving candidates, leaving open the possibility that exponential cost is reintroduced when the number of constraints reaches 10.
Authors: We acknowledge that the current description of the multi-constraint filtering step is insufficiently detailed. In the revision we will expand §3 with a precise account of the data structures and algorithms used on the GPU: sorted arrays of partial sums for the first constraint, followed by batched parallel prefix-sum and binary-search checks for the remaining constraints. We will also report empirical counts of candidates that survive each successive constraint check for the tested instances, demonstrating that the filtering remains sub-exponential in practice. revision: yes
Circularity Check
No circularity: empirical runtime reporting on external algorithm variant
full rationale
The paper presents a hybrid CPU-GPU implementation derived from the established Schroeppel-Shamir algorithm for subset sum, with performance claims based on measured runtimes for benchmark instances up to (10,90). No equations, fitted parameters, or self-referential derivations appear; the approach relies on exhaustive enumeration of one-dimensional solutions followed by GPU batch checks, reported as direct computational results rather than predictions that reduce to inputs by construction. The central claims are self-contained empirical observations on an external base algorithm with no load-bearing self-citations or ansatz smuggling.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Schroeppel-Shamir's algorithm correctly enumerates all one-dimensional subset sum solutions in the stated time bounds.
Reference graph
Works this paper leans on
-
[1]
Aardal et al. (1999). Market Split and Basis Reduction: T o- wards a Solution of the Cornu ´ejols-Dawande Instances. In Lecture Notes in Computer Science (pp. 1–16). Springer Berlin Heidelberg. https://doi.org/10.1007/3-540-48777-8_1
-
[2]
Cornu ´ejols, G., & Dawande, M. (1998). A Class of Hard Small 0–1 Pro- grams. In Lecture Notes in Computer Science (pp. 284–293). Springer Berlin Heidelberg. https://doi.org/10.1007/3-540-69346-7_22
-
[3]
Gurobi Optimization, LLC. (2023). Gurobi (Version 11). https://www.gurobi.com
work page 2023
-
[4]
Horowitz, E., & Sahni, S. (1974). Computing Partitions w ith Applica- tions to the Knapsack Problem. Journal of the ACM , 21(2), 277–292. https://doi.org/10.1145/321812.321823
-
[5]
Karp, R. M. (1972). Reducibility among Combinatorial Pr oblems. In Complexity of Computer Computations (pp. 85–103). Springer US. https://doi.org/10.1007/978-1-4684-2001-2_9
-
[6]
Koch et al. (2025). Quantum Optimization Benchmark Li- brary – The Intractable Decathlon (Version 1). arXiv. https://doi.org/10.48550/ARXIV.2504.03832
-
[7]
Schroeppel, R., & Shamir, A. (1981). A /u1D447 = /u1D442( 2/u1D45B/ 2) , /u1D446 = /u1D442( 2/u1D45B/ 4) Al- gorithm for Certain NP-Complete Problems. SIAM Journal on Computing , 10(3), 456–464. https://doi.org/10.1137/0210033
-
[8]
Vogel, H. (2012). Solving market split problems with heu ristical lattice reduction. Annals of Operations Research , 196(1), 581–590. https://doi.org/10.1007/s10479-012-1143-0
-
[9]
Wang et al. (2009). Solving the market split problem via b ranch-and-cut. In- ternational Journal of Mathematical Modelling and Numerical Optimisation, 1(1/2), 121. https://doi.org/10.1504/ijmmno.2009.030091 6
-
[10]
Wassermann, A. (2002). Attacking the Market Split Prob lem with Lattice Point Enumeration. Journal of Combinatorial Optimization , 6(1), 5–16. https://doi.org/10.1023/a:1013355015853
-
[11]
Williams, H. P. (1978). Model Building in Mathematical Programming. John Wiley & Sons Ltd
work page 1978
-
[12]
Wu et al. (2013). Solving the market split problem using a distributed computation approach. In 2013 IEEE International Conference on Information and Automation (pp. 1252–1257). https://doi.org/10.1109/icinfa.2013.6720486 7
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.