Decision-Focused Surrogate Modeling for Mixed-Integer Linear Optimization
Pith reviewed 2026-05-24 00:18 UTC · model grok-4.3
The pith
Decision-focused surrogate linear programs trained on MILP data predict near-optimal solutions while incorporating all original constraints for feasibility.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We develop a data-driven approach for constructing surrogate optimization models in the form of linear programs (LPs) that can be solved much more efficiently than the corresponding MILPs. We train these surrogate LPs in a decision-focused manner such that for different model inputs, they achieve the same or close to the same optimal solutions as the original MILPs. One key advantage of the proposed method is that it allows the incorporation of all the original MILP's linear constraints, which significantly increases the likelihood of obtaining feasible predicted solutions.
What carries the argument
Decision-focused training of surrogate LPs that embed all original MILP linear constraints to match optimal solutions.
If this is right
- Surrogate LPs solve much faster than MILPs, enabling real-time updates in online systems.
- Inclusion of all original linear constraints raises the rate of feasible predicted solutions.
- The method requires relatively little training data yet achieves high accuracy.
- It outperforms neural-network-based optimization proxies on the tested cases.
- The approach applies directly to time-critical decision-making problems that rely on MILPs.
Where Pith is reading between the lines
- Similar decision-focused surrogates could be built for other optimization classes such as nonlinear programs.
- The surrogates might serve as fast warm-starts or heuristics inside exact MILP solvers.
- Periodic retraining on new data could let the surrogates track shifting problem parameters.
- The data efficiency points to possible use in domains where generating large MILP datasets is costly.
Load-bearing premise
Training surrogate LPs decision-focused on MILP-generated data produces close optimal solutions for unseen inputs while the included constraints guarantee feasible outputs.
What would settle it
Solve the surrogate LP on a hold-out test set of inputs and compare its solutions to the true MILP optima; large objective gaps or infeasible points would disprove the claim.
Figures
read the original abstract
Mixed-integer optimization is at the core of many online decision-making systems that demand frequent updates of decisions in real time. However, due to their combinatorial nature, mixed-integer linear programs (MILPs) can be difficult to solve, rendering them often unsuitable for time-critical online applications. To address this challenge, we develop a data-driven approach for constructing surrogate optimization models in the form of linear programs (LPs) that can be solved much more efficiently than the corresponding MILPs. We train these surrogate LPs in a decision-focused manner such that for different model inputs, they achieve the same or close to the same optimal solutions as the original MILPs. One key advantage of the proposed method is that it allows the incorporation of all the original MILP's linear constraints, which significantly increases the likelihood of obtaining feasible predicted solutions. Results from two computational case studies indicate that this decision-focused surrogate modeling approach is highly data-efficient and provides very accurate predictions of the optimal solutions. In these examples, it outperforms more commonly used neural-network-based optimization proxies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes training surrogate linear programs (LPs) for mixed-integer linear programs (MILPs) in a decision-focused manner so that the surrogates produce the same or close optimal solutions as the original MILPs for varying inputs. All original linear constraints are retained in the surrogates to guarantee feasibility of the predicted solutions. Two computational case studies are used to show that the approach is highly data-efficient, yields accurate predictions, and outperforms neural-network-based optimization proxies.
Significance. If the empirical results hold, the method offers a practical route to real-time optimization in settings where MILPs are too slow, with the built-in feasibility guarantee and decision-focused objective providing advantages over standard neural proxies. The data-efficiency claim, if substantiated, would be particularly valuable for applications with limited training data.
minor comments (2)
- [Abstract] Abstract: the claim of 'highly data-efficient' and 'very accurate predictions' is stated without any numerical metrics, number of training samples, or feasibility rates; adding a brief quantitative summary would strengthen the abstract.
- The description of the training procedure and data-generation process for the two case studies should include explicit statements of the loss function, optimizer, and how test instances were sampled relative to training instances to allow replication.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work and the recommendation for minor revision. No specific major comments were provided in the report.
Circularity Check
No significant circularity detected
full rationale
The paper's core contribution is an empirical, data-driven construction of surrogate LPs trained decision-focused on MILP-generated instances, with feasibility enforced by retaining all original linear constraints. Performance claims rest on two external computational case studies measuring accuracy and data efficiency against NN proxies on unseen inputs. No derivation, equation, or self-citation reduces the reported outcomes to a definitional tautology or fitted input renamed as prediction; the method is validated externally rather than by internal self-reference.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Linear programs can be solved efficiently compared with MILPs
- domain assumption Training data generated from the original MILP is representative of future inputs
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We train these surrogate LPs in a decision-focused manner such that for different model inputs, they achieve the same or close to the same optimal solutions as the original MILPs... The DFSM problem... is a bilevel optimization problem... reformulate... KKT... exact penalty... block coordinate descent
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Results from two computational case studies indicate that this decision-focused surrogate modeling approach is highly data-efficient...
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Brandon Amos and J. Zico Kolter. Optnet: Differentiable optimization as a layer in neural networks. 3 2017
work page 2017
-
[2]
A learning-based algorithm to quickly compute good primal solutions for stochastic integer programs
Yoshua Bengio, Emma Frejinger, Andrea Lodi, Rahul Patel, and Sriram Sankaranarayanan. A learning-based algorithm to quickly compute good primal solutions for stochastic integer programs. In Integration of Constraint Programming, Artificial Intelligence, and Operations Research: 17th International Conference, CPAIOR 2020, Vienna, Austria, September 21--24,...
work page 2020
-
[3]
Machine learning for combinatorial optimization: a methodological tour d’horizon
Yoshua Bengio, Andrea Lodi, and Antoine Prouvost. Machine learning for combinatorial optimization: a methodological tour d’horizon. European Journal of Operational Research, 290 0 (2): 0 405--421, 2021
work page 2021
-
[4]
Online mixed-integer optimization in milliseconds
Dimitris Bertsimas and Bartolomeo Stellato. Online mixed-integer optimization in milliseconds. INFORMS Journal on Computing, 34 0 (4): 0 2229--2248, 2022
work page 2022
-
[5]
Advances in surrogate based modeling, feasibility analysis, and optimization: A review
Atharv Bhosekar and Marianthi Ierapetritou. Advances in surrogate based modeling, feasibility analysis, and optimization: A review . Computers and Chemical Engineering, 108: 0 250--267, 2018
work page 2018
-
[6]
Steven Chen, Kelsey Saulnier, Nikolay Atanasov, Daniel D. Lee, Vijay Kumar, George J. Pappas, and Manfred Morari. Approximating Explicit Model Predictive Control Using Constrained Neural Networks . In Proceedings of the American Control Conference, pages 1520--1527. AACC, 2018
work page 2018
-
[7]
Machine learning for cutting planes in integer programming: A survey
Arnaud Deza and Elias B Khalil. Machine learning for cutting planes in integer programming: A survey. arXiv preprint arXiv:2302.09166, 2023
-
[8]
Task-based End-to-End Model Learning in Stochastic Optimization
Priya L Donti, Brandon Amos, and J Zico Kolter. Task-based End-to-End Model Learning in Stochastic Optimization . In Advances in Neural Information Processing Systems 30, 2017
work page 2017
-
[9]
Jump: A modeling language for mathematical optimization
Iain Dunning, Joey Huchette, and Miles Lubin. Jump: A modeling language for mathematical optimization. SIAM Review, 59 0 (2): 0 295--320, 2017
work page 2017
-
[10]
Adam N. Elmachtoub and Paul Grigas. Smart "Predict, then Optimize" . Management Science, 68 0 (1): 0 9--26, 2022
work page 2022
-
[11]
MIPaaL: Mixed integer program as a layer
Aaron Ferber, Bryan Wilder, Bistra Dilkina, and Milind Tambe. MIPaaL: Mixed integer program as a layer . In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence, pages 1504--1511, 2020
work page 2020
-
[12]
Surco: Learning linear surrogates for combinatorial nonlinear optimization problems
Aaron M Ferber, Taoan Huang, Daochen Zha, Martin Schubert, Benoit Steiner, Bistra Dilkina, and Yuandong Tian. Surco: Learning linear surrogates for combinatorial nonlinear optimization problems. In International Conference on Machine Learning, pages 10034--10052. PMLR, 2023
work page 2023
-
[13]
Ferdinando Fioretto, Terrence W. K. Mak, and Pascal van Hentenryck. Predicting AC optimal power flows: Combining deep learning and Lagrangian dual methods . In Proceedings of the AAAI Conference on Artificial Intelligence, pages 630--637, 2020
work page 2020
-
[14]
Rishabh Gupta and Qi Zhang. Efficient learning of decision-making models: A penalty block coordinate descent algorithm for data-driven inverse optimization. Computers & Chemical Engineering, page 108123, 2022
work page 2022
-
[15]
Data-driven decision-focused surrogate modeling
Rishabh Gupta and Qi Zhang. Data-driven decision-focused surrogate modeling. AIChE Journal, page e18338, 2024
work page 2024
-
[16]
Learning to select cuts for efficient mixed-integer programming
Zeren Huang, Kerong Wang, Furui Liu, Hui-Ling Zhen, Weinan Zhang, Mingxuan Yuan, Jianye Hao, Yong Yu, and Jun Wang. Learning to select cuts for efficient mixed-integer programming. Pattern Recognition, 123: 0 108353, 2022
work page 2022
-
[17]
Warm-starting constraint generation for mixed-integer optimization: A machine learning approach
Asunci \'o n Jim \'e nez-Cordero, Juan Miguel Morales, and Salvador Pineda. Warm-starting constraint generation for mixed-integer optimization: A machine learning approach. Knowledge-Based Systems, 253: 0 109570, 2022
work page 2022
-
[18]
Efficient Representation and Approximation of Model Predictive Control Laws via Deep Learning
Benjamin Karg and Sergio Lucia. Efficient Representation and Approximation of Model Predictive Control Laws via Deep Learning . IEEE Transactions on Cybernetics, 50 0 (9): 0 3866--3878, 2020
work page 2020
-
[19]
Learning to branch in mixed integer programming
Elias Khalil, Pierre Le Bodic, Le Song, George Nemhauser, and Bistra Dilkina. Learning to branch in mixed integer programming. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30, 2016
work page 2016
-
[20]
Fast Approximations for Job Shop Scheduling: A Lagrangian Dual Deep Learning Method
James Kotary, Ferdinando Fioretto, and Pascal Van Hentenryck . Fast Approximations for Job Shop Scheduling: A Lagrangian Dual Deep Learning Method . In Proceedings of the AAAI Conference on Artificial Intelligence, 2022
work page 2022
-
[21]
Pratyush Kumar, James B. Rawlings, and Stephen J. Wright. Industrial, large-scale model predictive control with structured neural networks . Computers and Chemical Engineering, 150: 0 107291, 2021
work page 2021
-
[22]
On learning and branching: a survey
Andrea Lodi and Giulia Zarpellon. On learning and branching: a survey. Top, 25 0 (2): 0 207--236, 2017
work page 2017
-
[23]
Jayanta Mandi, Emir Demirovi \' c , Peter J. Stuckey, and Tias Guns. Smart predict-and-optimize for hard combinatorial optimization problems . In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence, pages 1603--1610, 2020
work page 2020
-
[24]
Chemical Production Scheduling: Mixed-Integer Programming Models and Methods
Christos T Maravelias. Chemical Production Scheduling: Mixed-Integer Programming Models and Methods. Cambridge University Press, 2021
work page 2021
-
[25]
Cutting planes in integer and mixed integer programming
Hugues Marchand, Alexander Martin, Robert Weismantel, and Laurence Wolsey. Cutting planes in integer and mixed integer programming. Discrete Applied Mathematics, 123 0 (1-3): 0 397--446, 2002
work page 2002
-
[26]
DeepOPF: Deep neural network for DC optimal power flow
Xiang Pan, Tianyu Zhao, and Minghua Chen. DeepOPF: Deep neural network for DC optimal power flow . In Proceedings of the 2019 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). IEEE, 2019
work page 2019
-
[27]
Joel A. Paulson and Ali Mesbah. Approximate Closed-Loop Robust Model Predictive Control with Guaranteed Stability and Constraint Satisfaction . IEEE Control Systems Letters, 4 0 (3): 0 719--724, 2020
work page 2020
-
[28]
Theory of linear and integer programming
Alexander Schrijver. Theory of linear and integer programming. John Wiley & Sons, 1998
work page 1998
-
[29]
Learning primal heuristics for mixed integer programs
Yunzhuang Shen, Yuan Sun, Andrew Eberhard, and Xiaodong Li. Learning primal heuristics for mixed integer programs. In 2021 international joint conference on neural networks (ijcnn), pages 1--8. IEEE, 2021
work page 2021
-
[30]
Haoran Sun, Xiangyi Chen, Qingjiang Shi, Mingyi Hong, Xiao Fu, and Nicholas D. Sidiropoulos. Learning to Optimize: Training Deep Neural Networks for Interference Management . IEEE Transactions on Signal Processing, 66 0 (20): 0 5438--5453, 2018
work page 2018
-
[31]
A simple effective heuristic for embedded mixed-integer quadratic programming
Reza Takapoui, Nicholas Moehle, Stephen Boyd, and Alberto Bemporad. A simple effective heuristic for embedded mixed-integer quadratic programming. 9 2015
work page 2015
-
[32]
Reinforcement learning for integer programming: Learning to cut
Yunhao Tang, Shipra Agrawal, and Yuri Faenza. Reinforcement learning for integer programming: Learning to cut. In International conference on machine learning, pages 9367--9376. PMLR, 2020
work page 2020
-
[33]
Combining Deep Learning and Optimization for Preventive Security-Constrained DC Optimal Power Flow
Alexandre Velloso and Pascal Van Hentenryck . Combining Deep Learning and Optimization for Preventive Security-Constrained DC Optimal Power Flow . IEEE Transactions on Power Systems, 36 0 (4): 0 3618--3628, 2021
work page 2021
-
[34]
Melding the data-decisions pipeline: Decision-focused learning for combinatorial optimization
Bryan Wilder, Bistra Dilkina, and Milind Tambe. Melding the data-decisions pipeline: Decision-focused learning for combinatorial optimization . In 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelli...
work page 2019
-
[35]
Ahmed S. Zamzam and Kyri Baker. Learning optimal solutions for extremely fast ac optimal power flow . Proceedings of the 2020 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), 2020
work page 2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.