Provably Data-driven Lagrangian Relaxation for Mixed Integer Linear Programming

Anh Tuan Nguyen; Tung Quoc Le; Viet Anh Nguyen

arxiv: 2605.19052 · v1 · pith:43XBESTLnew · submitted 2026-05-18 · 📊 stat.ML · cs.LG

Provably Data-driven Lagrangian Relaxation for Mixed Integer Linear Programming

Tung Quoc Le , Anh Tuan Nguyen , Viet Anh Nguyen This is my paper

Pith reviewed 2026-05-20 07:21 UTC · model grok-4.3

classification 📊 stat.ML cs.LG

keywords Lagrangian relaxationmixed integer linear programmingdata-driven algorithm designstochastic gradient ascentgeneralization boundsminimax rateslearning to warm-start

0 comments

The pith

Stochastic gradient ascent with averaging learns Lagrangian multipliers for MILPs at the minimax optimal rate of Theta(s over sqrt(N)).

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper frames the selection of Lagrangian multipliers for mixed integer linear programs as a statistical learning task over a distribution of problem instances. It first derives a generalization bound of order s to the 1.5 over square root N on the quality of learned multipliers and establishes a matching lower bound of order s over square root N. The central result shows that stochastic gradient ascent with averaging attains the tight rate Theta(s over sqrt(N)), closing the gap between the upper and lower bounds. The analysis is then extended to a learning-to-warm-start variant that reaches the faster rate Theta(s over N). These guarantees supply the missing theoretical foundation for data-driven Lagrangian relaxation on decomposable large-scale problems.

Core claim

The paper's central claim is that stochastic gradient ascent with averaging, when applied to learn Lagrangian multipliers over a distribution of MILP instances, achieves the minimax optimal statistical rate of Theta(s / sqrt(N)), where s denotes the number of coupling constraints and N the sample size. This is established by first proving a generalization bound of O(s^{1.5}/sqrt(N)), deriving a matching lower bound of Omega(s/sqrt(N)), and then constructing the algorithm that closes the gap. The analysis is extended to show a faster Theta(s/N) rate in the learning-to-warm-start setting.

What carries the argument

Averaged stochastic gradient ascent applied to the empirical Lagrangian dual over sampled problem instances, which learns the multipliers with provable statistical guarantees.

If this is right

Learned multipliers yield provably good dual bounds that can tighten branch-and-bound pruning for decomposable MILPs.
Data-driven Lagrangian relaxation now rests on rigorous generalization and convergence guarantees rather than purely empirical performance.
The learning-to-warm-start extension achieves a strictly faster rate than direct multiplier prediction from scratch.
The same framework applies to standard benchmark classes such as vehicle routing and unit commitment problems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar statistical analyses could be carried out for other dual-based decomposition methods beyond Lagrangian relaxation.
Future work might relax the i.i.d. sampling assumption to handle sequences of related but dependent problem instances.
Direct numerical checks of the predicted scaling on standard MILP test libraries would provide an immediate test of the rates.

Load-bearing premise

The MILP problem instances are independently and identically distributed samples from some fixed but unknown distribution.

What would settle it

Running averaged stochastic gradient ascent on a large collection of i.i.d. MILP instances and finding that the observed error does not scale as Theta(s/sqrt(N)) would falsify the optimality claim.

read the original abstract

Lagrangian Relaxation (LR) is a powerful technique for solving large-scale Mixed Integer Linear Programming (MILP), particularly those with decomposable structures, such as vehicle routing or unit commitment problems. By relaxing the coupling constraints, LR enables parallel subproblem solving and often yields tighter dual bounds than standard linear programming relaxations, which is crucial for efficient branch-and-bound pruning. While recent empirical work has shown promising results using machine learning to predict these multipliers, a theoretical understanding of such methods remains an open question. In this work, we bridge this gap by analyzing the problem of learning LR through the lens of Data-driven Algorithm Design, i.e., a statistical learning problem over a distribution of problem instances. Our contributions are as follows: first, we derive a generalization bound of $\mathcal{O}(s^{1.5}/\sqrt{N})$ for the learned multipliers, where $s$ is the number of coupling constraints and $N$ is the sample size. Second, we provide a minimax lower-bound of $\Omega(s/\sqrt{N})$, proving that a linear dependency is unavoidable. Third, we constructively close this theoretical gap by proving that Stochastic Gradient Ascent (SGA) with averaging achieves the minimax optimal rate $\Theta(s/\sqrt{N})$. Finally, we extend our framework to the learning-to-warm-start setting, proving that it achieves a fast, minimax-optimal rate of $\Theta(s/N)$ and establishing a theoretical advantage over direct multiplier prediction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper supplies the first matching generalization and minimax bounds for data-driven Lagrangian multiplier learning in MILPs, with averaged SGA hitting the optimal rate.

read the letter

The main point is that the authors take an empirical technique for predicting multipliers in Lagrangian relaxation and give it non-asymptotic theory. They model the setting as statistical learning over a distribution of MILP instances, derive an O(s^{1.5}/sqrt(N)) generalization bound, prove an Omega(s/sqrt(N)) minimax lower bound, and show that stochastic gradient ascent with averaging attains the tight Theta(s/sqrt(N)) rate. The warm-start extension reaching Theta(s/N) is a useful extra result that highlights an advantage over direct prediction.

Referee Report

0 major / 2 minor

Summary. The paper frames learning Lagrangian multipliers for decomposable MILPs as a statistical learning problem over a distribution of problem instances. It derives a generalization bound of O(s^{1.5}/√N) on the learned multipliers, establishes a minimax lower bound of Ω(s/√N), and constructively shows that averaged Stochastic Gradient Ascent attains the optimal rate Θ(s/√N). The framework is extended to the learning-to-warm-start setting, where a faster minimax-optimal rate of Θ(s/N) is proved.

Significance. If the results hold, this work is significant for data-driven algorithm design in optimization: it supplies the first rigorous statistical guarantees for data-driven Lagrangian relaxation, with matching upper and lower bounds derived from standard convex stochastic optimization tools. The constructive SGA result and the warm-start extension provide both theoretical closure and practical algorithmic guidance for large-scale MILPs such as vehicle routing and unit commitment.

minor comments (2)

[Abstract] Abstract: the stated generalization bound O(s^{1.5}/√N) and the optimal rate Θ(s/√N) differ by an s^{0.5} factor; a brief remark on whether this gap is an artifact of the analysis technique or inherent to the multiplier prediction problem would aid readability.
[Section 2 (or wherever the statistical learning setup is formalized)] The i.i.d. assumption on problem instances is central to the generalization bounds; the manuscript should explicitly state any additional regularity conditions (e.g., bounded subproblem gradients or instance normalization) required for the SGA convergence analysis.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of significance, and recommendation of minor revision. The report correctly captures our main results on the O(s^{1.5}/√N) generalization bound, the matching Ω(s/√N) minimax lower bound, the Θ(s/√N) rate attained by averaged SGA, and the faster Θ(s/N) rate for the warm-start setting.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The derivation applies standard statistical learning theory and convex stochastic optimization to the expected Lagrangian dual over an i.i.d. distribution of MILP instances. The O(s^{1.5}/√N) generalization bound, the Ω(s/√N) minimax lower bound, and the constructive proof that averaged SGA attains Θ(s/√N) are obtained from classical tools (e.g., Rademacher complexity or covering-number arguments for the dual function class) without any reduction of a claimed prediction to a fitted quantity by construction. The warm-start extension to Θ(s/N) follows the same independent analysis. No self-citation is load-bearing for the central rates, and the argument remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Analysis rests on standard i.i.d. sampling and concentration tools from statistical learning theory; no new entities or fitted constants are introduced beyond the problem dimension s and sample size N.

axioms (2)

domain assumption Problem instances are drawn i.i.d. from a fixed but unknown distribution
Enables application of statistical learning generalization bounds to the multiplier selection problem.
domain assumption Relaxed subproblems admit efficient parallel solution
Required for the practical motivation and parallel computation model underlying LR.

pith-pipeline@v0.9.0 · 5798 in / 1278 out tokens · 34577 ms · 2026-05-20T07:21:36.618946+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We derive a generalization bound of O(s^{1.5}/√N) … minimax lower-bound of Ω(s/√N) … SGA with averaging achieves Θ(s/√N)
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

u(π,P) = min … g(π,P)=b−Ax*(π,P) … ∥g∥₂ ≤ 2B√s

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

86 extracted references · 86 canonical work pages · 1 internal anchor

[1]

Conference on Learning Theory , pages=

Generalization bounds for data-driven numerical linear algebra , author=. Conference on Learning Theory , pages=. 2022 , organization=

work page 2022
[2]

Springer Series in Statistics , year=

Convergence of Stochastic Processes , author=. Springer Series in Statistics , year=

work page
[3]

International Conference on Artificial Intelligence and Statistics , pages=

Improved generalization bound and learning of sparsity patterns for data-driven low-rank approximation , author=. International Conference on Artificial Intelligence and Statistics , pages=. 2023 , organization=

work page 2023
[4]

2003 , publisher=

Linear Programming: Methods and Applications , author=. 2003 , publisher=

work page 2003
[5]

An application of the

Decell, Jr, Henry P , journal=. An application of the. 1965 , publisher=

work page 1965
[6]

International Conference on Machine Learning , pages=

Refined bounds for algorithm configuration: The knife-edge of dual class approximability , author=. International Conference on Machine Learning , pages=. 2020 , organization=

work page 2020
[7]

Advances in Neural Information Processing Systems , volume=

Generalization bound and learning methods for data-driven projections in linear programming , author=. Advances in Neural Information Processing Systems , volume=

work page
[8]

Forty-second International Conference on Machine Learning , year=

Learning to Generate Projections for Reducing Dimensionality of Heterogeneous Linear Programming Problems , author=. Forty-second International Conference on Machine Learning , year=

work page
[9]

2009 , publisher=

Optimal quadratic programming algorithms: with applications to variational inequalities , author=. 2009 , publisher=

work page 2009
[10]

arXiv preprint arXiv:2202.00665 , year=

Tutorial on amortized optimization for learning to optimize over continuous domains (2022) , author=. arXiv preprint arXiv:2202.00665 , year=

work page arXiv 2022
[11]

Mathematical Programming Computation , volume=

Parallelizing the dual revised simplex method , author=. Mathematical Programming Computation , volume=. 2018 , publisher=

work page 2018
[12]

Journal of Machine Learning Research , volume=

Faster randomized interior point methods for tall/wide linear programs , author=. Journal of Machine Learning Research , volume=

work page
[13]

Advances in Neural Information Processing Systems , volume=

Practical large-scale linear programming using primal-dual hybrid gradient , author=. Advances in Neural Information Processing Systems , volume=

work page
[14]

Mathematical Programming , volume=

Random projections for quadratic programs , author=. Mathematical Programming , volume=. 2020 , publisher=

work page 2020
[15]

Random projections for quadratic programs over a

Vu, Ky and Poirion, Pierre-Louis and d’Ambrosio, Claudia and Liberti, Leo , booktitle=. Random projections for quadratic programs over a. 2019 , organization=

work page 2019
[16]

Mathematics of Operations Research , volume=

Random projections for linear programming , author=. Mathematics of Operations Research , volume=. 2018 , publisher=

work page 2018
[17]

2001 , publisher=

Foundations of Mathematical Economics , author=. 2001 , publisher=

work page 2001
[18]

2012 , publisher=

Convergence of Stochastic Processes , author=. 2012 , publisher=

work page 2012
[19]

arXiv preprint arXiv:2011.07177 , year=

Data-driven algorithm design , author=. arXiv preprint arXiv:2011.07177 , year=

work page arXiv 2011
[20]

Communications of the ACM , volume=

Data-driven algorithm design , author=. Communications of the ACM , volume=. 2020 , publisher=

work page 2020
[21]

Advances in Neural Information Processing Systems , volume=

Learning-based low-rank approximations , author=. Advances in Neural Information Processing Systems , volume=

work page
[22]

Learning the Positions in

Li, Yi and Lin, Honghao and Liu, Simin and Vakilian, Ali and Woodruff, David , booktitle=. Learning the Positions in

work page
[23]

Advances in Neural Information Processing Systems , volume=

Sample complexity of automated mechanism design , author=. Advances in Neural Information Processing Systems , volume=

work page
[24]

International Conference on Machine Learning , pages=

Learning to branch , author=. International Conference on Machine Learning , pages=. 2018 , organization=

work page 2018
[25]

Advances in Neural Information Processing Systems , volume=

Learning cut generating functions for integer programming , author=. Advances in Neural Information Processing Systems , volume=

work page
[26]

Bounding the

Goldberg, Paul and Jerrum, Mark , booktitle=. Bounding the

work page
[27]

2006 , publisher=

Numerical Optimization , author=. 2006 , publisher=

work page 2006
[28]

Soviet Math

Iterative solution of problems of linear and quadratic programming , author=. Soviet Math. Dokl. , volume=

work page
[29]

Operations Research , volume=

Column-randomized linear programs: Performance guarantees and applications , author=. Operations Research , volume=. 2025 , publisher=

work page 2025
[30]

Linear Algebra and its Applications , volume=

Random projections of linear and semidefinite problems with linear inequalities , author=. Linear Algebra and its Applications , volume=. 2023 , publisher=

work page 2023
[31]

Journal of Machine Learning Research , volume=

Learning to optimize: A primer and a benchmark , author=. Journal of Machine Learning Research , volume=

work page
[32]

Foundations and Trends

Tutorial on Amortized Optimization , author=. Foundations and Trends. 2023 , publisher=

work page 2023
[33]

European Journal of Operational Research , volume=

Machine learning for combinatorial optimization: a methodological tour d’horizon , author=. European Journal of Operational Research , volume=. 2021 , publisher=

work page 2021
[34]

Journal of Combinatorial Theory, Series A , volume=

On the density of families of sets , author=. Journal of Combinatorial Theory, Series A , volume=. 1972 , publisher=

work page 1972
[35]

2012 , publisher=

Matrix Analysis , author=. 2012 , publisher=

work page 2012
[36]

Proceedings of the Seventh Annual Conference on Computational Learning Theory , pages=

Fat-shattering and the learnability of real-valued functions , author=. Proceedings of the Seventh Annual Conference on Computational Learning Theory , pages=

work page
[37]

Advances in Neural Information Processing Systems , volume=

Sample complexity of algorithm selection using neural networks and its applications to branch-and-cut , author=. Advances in Neural Information Processing Systems , volume=

work page
[38]

2009 , publisher=

Neural Network Learning: Theoretical Foundations , author=. 2009 , publisher=

work page 2009
[39]

Transactions of the American Mathematical Society , volume=

Lower bounds for approximation by nonlinear manifolds , author=. Transactions of the American Mathematical Society , volume=. 1968 , publisher=

work page 1968
[40]

Econometrica , volume=

Envelope theorems for arbitrary choice sets , author=. Econometrica , volume=. 2002 , publisher=

work page 2002
[41]

Advances in Neural Information Processing Systems , volume=

Provably tuning the ElasticNet across instances , author=. Advances in Neural Information Processing Systems , volume=

work page
[42]

Advances in Neural Information Processing Systems , volume=

New bounds for hyperparameter tuning of regression problems across instances , author=. Advances in Neural Information Processing Systems , volume=

work page
[43]

Algorithm Configuration for Structured

Balcan, Maria Florina and Nguyen, Anh Tuan and Sharma, Dravyansh , journal=. Algorithm Configuration for Structured

work page
[44]

Advances in Neural Information Processing Systems , volume=

Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function , author=. Advances in Neural Information Processing Systems , volume=

work page
[45]

The vehicle routing problem , pages=

An overview of vehicle routing problems , author=. The vehicle routing problem , pages=. 2002 , publisher=

work page 2002
[46]

Approaches to Integer Programming , pages=

Lagrangean relaxation for integer programming , author=. Approaches to Integer Programming , pages=. 2009 , publisher=

work page 2009
[47]

Predicting

Demelas, Francesco and Le Roux, Joseph and Lacroix, Mathieu and Parmentier, Axel , booktitle=. Predicting

work page
[48]

Beasley, John E , journal=. A. 1990 , publisher=

work page 1990
[49]

On the computational complexity and geometry of the first-order theory of the reals

James Renegar , abstract =. On the computational complexity and geometry of the first-order theory of the reals. Part I: Introduction. Preliminaries. The geometry of semi-algebraic sets. The decision problem for the existential theory of the reals , journal =. 1992 , issn =. doi:https://doi.org/10.1016/S0747-7171(10)80003-3 , url =

work page doi:10.1016/s0747-7171(10)80003-3 1992
[50]

Algorithms in Real Algebraic Geometry (Algorithms and Computation in Mathematics) , year =

Basu, Saugata and Pollack, Richard and Roy, Marie-Fran. Algorithms in Real Algebraic Geometry (Algorithms and Computation in Mathematics) , year =

work page
[51]

2019 , publisher=

High-dimensional Statistics: A Non-asymptotic Viewpoint , author=. 2019 , publisher=

work page 2019
[52]

International Conference on Machine Learning , pages=

Provable guarantees for gradient-based meta-learning , author=. International Conference on Machine Learning , pages=. 2019 , organization=

work page 2019
[53]

Introduction to Nonparametric Estimation , pages=

Nonparametric estimators , author=. Introduction to Nonparametric Estimation , pages=. 2008 , publisher=

work page 2008
[54]

Rademacher and

Bartlett, Peter L and Mendelson, Shahar , journal=. Rademacher and

work page
[55]

Probability in

Ledoux, Michel and Talagrand, Michel , volume=. Probability in. 1991 , publisher=

work page 1991
[56]

2014 , publisher=

Understanding Machine Learning: From Theory to Algorithms , author=. 2014 , publisher=

work page 2014
[57]

Integer Programming , pages=

Integer programming models , author=. Integer Programming , pages=. 2014 , publisher=

work page 2014
[58]

2020 , publisher=

Integer Programming , author=. 2020 , publisher=

work page 2020
[59]

AIChE Journal , volume=

Enterprise-wide optimization: A new frontier in process systems engineering , author=. AIChE Journal , volume=. 2005 , publisher=

work page 2005
[60]

IEEE Transactions on Power Systems , volume=

A computationally efficient mixed-integer linear formulation for the thermal unit commitment problem , author=. IEEE Transactions on Power Systems , volume=. 2006 , publisher=

work page 2006
[61]

European Journal of Operational Research , volume=

The vehicle routing problem: An overview of exact and approximate algorithms , author=. European Journal of Operational Research , volume=. 1992 , publisher=

work page 1992
[62]

Handbook of Applied Optimization , volume=

Branch-and-cut algorithms for combinatorial optimization problems , author=. Handbook of Applied Optimization , volume=. 2002 , publisher=

work page 2002
[63]

2014 , publisher=

Vehicle Routing: Problems, Methods, and Applications , author=. 2014 , publisher=

work page 2014
[64]

A solution to the unit commitment problem--

Saravanan, Balasubramanian and Das, Siddharth and Sikri, Surbhi and Kothari, Dwarkadas Pralhaddas , journal=. A solution to the unit commitment problem--. 2013 , publisher=

work page 2013
[65]

Fisher, Marshall L , journal=. The. 1981 , publisher=

work page 1981
[66]

Proceedings of the AAAI Conference on Artificial Intelligence , volume=

Provably data-driven projection method for quadratic programming , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

work page
[67]

2024 , publisher=

Van Hentenryck, Pascal and Dalmeijer, Kevin , journal=. 2024 , publisher=

work page 2024
[68]

Learning for dynamics and control conference , pages=

End-to-end learning to warm-start for real-time quadratic optimization , author=. Learning for dynamics and control conference , pages=. 2023 , organization=

work page 2023
[69]

Attention, Learn to Solve Routing Problems!

Attention, learn to solve routing problems! , author=. arXiv preprint arXiv:1803.08475 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[70]

Journal of Machine Learning Research , volume=

Combinatorial optimization and reasoning with graph neural networks , author=. Journal of Machine Learning Research , volume=

work page
[71]

Computational Combinatorial Optimization: Optimal or Provably Near-optimal Solutions , pages=

Lagrangian relaxation , author=. Computational Combinatorial Optimization: Optimal or Provably Near-optimal Solutions , pages=. 2001 , publisher=

work page 2001
[72]

Advances in Neural Information Processing Systems , volume=

Sample complexity of tree search configuration: Cutting planes and beyond , author=. Advances in Neural Information Processing Systems , volume=

work page
[73]

Structural analysis of branch-and-cut and the learnability of

Balcan, Maria-Florina F and Prasad, Siddharth and Sandholm, Tuomas and Vitercik, Ellen , journal=. Structural analysis of branch-and-cut and the learnability of

work page
[74]

The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

Generalization Guarantees for Learning Score-Based Branch-and-Cut Policies in Integer Programming , author=. The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

work page
[75]

Communications of the ACM , volume=

Algorithms with predictions , author=. Communications of the ACM , volume=. 2022 , publisher=

work page 2022
[76]

Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages=

How much data is sufficient to learn high-performing algorithms? generalization guarantees for data-driven algorithm design , author=. Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages=

work page
[77]

2021 IEEE congress on evolutionary computation (CEC) , pages=

Hyperparameter optimization: Comparing genetic algorithm against grid search and bayesian optimization , author=. 2021 IEEE congress on evolutionary computation (CEC) , pages=. 2021 , organization=

work page 2021
[78]

Advances in Neural Information Processing Systems , volume=

Learning predictions for algorithms with predictions , author=. Advances in Neural Information Processing Systems , volume=

work page
[79]

Automatica , volume=

The explicit linear quadratic regulator for constrained systems , author=. Automatica , volume=. 2002 , publisher=

work page 2002
[80]

On a modification of

Bernstein, Sergei , journal=. On a modification of

work page

Showing first 80 references.

[1] [1]

Conference on Learning Theory , pages=

Generalization bounds for data-driven numerical linear algebra , author=. Conference on Learning Theory , pages=. 2022 , organization=

work page 2022

[2] [2]

Springer Series in Statistics , year=

Convergence of Stochastic Processes , author=. Springer Series in Statistics , year=

work page

[3] [3]

International Conference on Artificial Intelligence and Statistics , pages=

Improved generalization bound and learning of sparsity patterns for data-driven low-rank approximation , author=. International Conference on Artificial Intelligence and Statistics , pages=. 2023 , organization=

work page 2023

[4] [4]

2003 , publisher=

Linear Programming: Methods and Applications , author=. 2003 , publisher=

work page 2003

[5] [5]

An application of the

Decell, Jr, Henry P , journal=. An application of the. 1965 , publisher=

work page 1965

[6] [6]

International Conference on Machine Learning , pages=

Refined bounds for algorithm configuration: The knife-edge of dual class approximability , author=. International Conference on Machine Learning , pages=. 2020 , organization=

work page 2020

[7] [7]

Advances in Neural Information Processing Systems , volume=

Generalization bound and learning methods for data-driven projections in linear programming , author=. Advances in Neural Information Processing Systems , volume=

work page

[8] [8]

Forty-second International Conference on Machine Learning , year=

Learning to Generate Projections for Reducing Dimensionality of Heterogeneous Linear Programming Problems , author=. Forty-second International Conference on Machine Learning , year=

work page

[9] [9]

2009 , publisher=

Optimal quadratic programming algorithms: with applications to variational inequalities , author=. 2009 , publisher=

work page 2009

[10] [10]

arXiv preprint arXiv:2202.00665 , year=

Tutorial on amortized optimization for learning to optimize over continuous domains (2022) , author=. arXiv preprint arXiv:2202.00665 , year=

work page arXiv 2022

[11] [11]

Mathematical Programming Computation , volume=

Parallelizing the dual revised simplex method , author=. Mathematical Programming Computation , volume=. 2018 , publisher=

work page 2018

[12] [12]

Journal of Machine Learning Research , volume=

Faster randomized interior point methods for tall/wide linear programs , author=. Journal of Machine Learning Research , volume=

work page

[13] [13]

Advances in Neural Information Processing Systems , volume=

Practical large-scale linear programming using primal-dual hybrid gradient , author=. Advances in Neural Information Processing Systems , volume=

work page

[14] [14]

Mathematical Programming , volume=

Random projections for quadratic programs , author=. Mathematical Programming , volume=. 2020 , publisher=

work page 2020

[15] [15]

Random projections for quadratic programs over a

Vu, Ky and Poirion, Pierre-Louis and d’Ambrosio, Claudia and Liberti, Leo , booktitle=. Random projections for quadratic programs over a. 2019 , organization=

work page 2019

[16] [16]

Mathematics of Operations Research , volume=

Random projections for linear programming , author=. Mathematics of Operations Research , volume=. 2018 , publisher=

work page 2018

[17] [17]

2001 , publisher=

Foundations of Mathematical Economics , author=. 2001 , publisher=

work page 2001

[18] [18]

2012 , publisher=

Convergence of Stochastic Processes , author=. 2012 , publisher=

work page 2012

[19] [19]

arXiv preprint arXiv:2011.07177 , year=

Data-driven algorithm design , author=. arXiv preprint arXiv:2011.07177 , year=

work page arXiv 2011

[20] [20]

Communications of the ACM , volume=

Data-driven algorithm design , author=. Communications of the ACM , volume=. 2020 , publisher=

work page 2020

[21] [21]

Advances in Neural Information Processing Systems , volume=

Learning-based low-rank approximations , author=. Advances in Neural Information Processing Systems , volume=

work page

[22] [22]

Learning the Positions in

Li, Yi and Lin, Honghao and Liu, Simin and Vakilian, Ali and Woodruff, David , booktitle=. Learning the Positions in

work page

[23] [23]

Advances in Neural Information Processing Systems , volume=

Sample complexity of automated mechanism design , author=. Advances in Neural Information Processing Systems , volume=

work page

[24] [24]

International Conference on Machine Learning , pages=

Learning to branch , author=. International Conference on Machine Learning , pages=. 2018 , organization=

work page 2018

[25] [25]

Advances in Neural Information Processing Systems , volume=

Learning cut generating functions for integer programming , author=. Advances in Neural Information Processing Systems , volume=

work page

[26] [26]

Bounding the

Goldberg, Paul and Jerrum, Mark , booktitle=. Bounding the

work page

[27] [27]

2006 , publisher=

Numerical Optimization , author=. 2006 , publisher=

work page 2006

[28] [28]

Soviet Math

Iterative solution of problems of linear and quadratic programming , author=. Soviet Math. Dokl. , volume=

work page

[29] [29]

Operations Research , volume=

Column-randomized linear programs: Performance guarantees and applications , author=. Operations Research , volume=. 2025 , publisher=

work page 2025

[30] [30]

Linear Algebra and its Applications , volume=

Random projections of linear and semidefinite problems with linear inequalities , author=. Linear Algebra and its Applications , volume=. 2023 , publisher=

work page 2023

[31] [31]

Journal of Machine Learning Research , volume=

Learning to optimize: A primer and a benchmark , author=. Journal of Machine Learning Research , volume=

work page

[32] [32]

Foundations and Trends

Tutorial on Amortized Optimization , author=. Foundations and Trends. 2023 , publisher=

work page 2023

[33] [33]

European Journal of Operational Research , volume=

Machine learning for combinatorial optimization: a methodological tour d’horizon , author=. European Journal of Operational Research , volume=. 2021 , publisher=

work page 2021

[34] [34]

Journal of Combinatorial Theory, Series A , volume=

On the density of families of sets , author=. Journal of Combinatorial Theory, Series A , volume=. 1972 , publisher=

work page 1972

[35] [35]

2012 , publisher=

Matrix Analysis , author=. 2012 , publisher=

work page 2012

[36] [36]

Proceedings of the Seventh Annual Conference on Computational Learning Theory , pages=

Fat-shattering and the learnability of real-valued functions , author=. Proceedings of the Seventh Annual Conference on Computational Learning Theory , pages=

work page

[37] [37]

Advances in Neural Information Processing Systems , volume=

Sample complexity of algorithm selection using neural networks and its applications to branch-and-cut , author=. Advances in Neural Information Processing Systems , volume=

work page

[38] [38]

2009 , publisher=

Neural Network Learning: Theoretical Foundations , author=. 2009 , publisher=

work page 2009

[39] [39]

Transactions of the American Mathematical Society , volume=

Lower bounds for approximation by nonlinear manifolds , author=. Transactions of the American Mathematical Society , volume=. 1968 , publisher=

work page 1968

[40] [40]

Econometrica , volume=

Envelope theorems for arbitrary choice sets , author=. Econometrica , volume=. 2002 , publisher=

work page 2002

[41] [41]

Advances in Neural Information Processing Systems , volume=

Provably tuning the ElasticNet across instances , author=. Advances in Neural Information Processing Systems , volume=

work page

[42] [42]

Advances in Neural Information Processing Systems , volume=

New bounds for hyperparameter tuning of regression problems across instances , author=. Advances in Neural Information Processing Systems , volume=

work page

[43] [43]

Algorithm Configuration for Structured

Balcan, Maria Florina and Nguyen, Anh Tuan and Sharma, Dravyansh , journal=. Algorithm Configuration for Structured

work page

[44] [44]

Advances in Neural Information Processing Systems , volume=

Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function , author=. Advances in Neural Information Processing Systems , volume=

work page

[45] [45]

The vehicle routing problem , pages=

An overview of vehicle routing problems , author=. The vehicle routing problem , pages=. 2002 , publisher=

work page 2002

[46] [46]

Approaches to Integer Programming , pages=

Lagrangean relaxation for integer programming , author=. Approaches to Integer Programming , pages=. 2009 , publisher=

work page 2009

[47] [47]

Predicting

Demelas, Francesco and Le Roux, Joseph and Lacroix, Mathieu and Parmentier, Axel , booktitle=. Predicting

work page

[48] [48]

Beasley, John E , journal=. A. 1990 , publisher=

work page 1990

[49] [49]

On the computational complexity and geometry of the first-order theory of the reals

James Renegar , abstract =. On the computational complexity and geometry of the first-order theory of the reals. Part I: Introduction. Preliminaries. The geometry of semi-algebraic sets. The decision problem for the existential theory of the reals , journal =. 1992 , issn =. doi:https://doi.org/10.1016/S0747-7171(10)80003-3 , url =

work page doi:10.1016/s0747-7171(10)80003-3 1992

[50] [50]

Algorithms in Real Algebraic Geometry (Algorithms and Computation in Mathematics) , year =

Basu, Saugata and Pollack, Richard and Roy, Marie-Fran. Algorithms in Real Algebraic Geometry (Algorithms and Computation in Mathematics) , year =

work page

[51] [51]

2019 , publisher=

High-dimensional Statistics: A Non-asymptotic Viewpoint , author=. 2019 , publisher=

work page 2019

[52] [52]

International Conference on Machine Learning , pages=

Provable guarantees for gradient-based meta-learning , author=. International Conference on Machine Learning , pages=. 2019 , organization=

work page 2019

[53] [53]

Introduction to Nonparametric Estimation , pages=

Nonparametric estimators , author=. Introduction to Nonparametric Estimation , pages=. 2008 , publisher=

work page 2008

[54] [54]

Rademacher and

Bartlett, Peter L and Mendelson, Shahar , journal=. Rademacher and

work page

[55] [55]

Probability in

Ledoux, Michel and Talagrand, Michel , volume=. Probability in. 1991 , publisher=

work page 1991

[56] [56]

2014 , publisher=

Understanding Machine Learning: From Theory to Algorithms , author=. 2014 , publisher=

work page 2014

[57] [57]

Integer Programming , pages=

Integer programming models , author=. Integer Programming , pages=. 2014 , publisher=

work page 2014

[58] [58]

2020 , publisher=

Integer Programming , author=. 2020 , publisher=

work page 2020

[59] [59]

AIChE Journal , volume=

Enterprise-wide optimization: A new frontier in process systems engineering , author=. AIChE Journal , volume=. 2005 , publisher=

work page 2005

[60] [60]

IEEE Transactions on Power Systems , volume=

A computationally efficient mixed-integer linear formulation for the thermal unit commitment problem , author=. IEEE Transactions on Power Systems , volume=. 2006 , publisher=

work page 2006

[61] [61]

European Journal of Operational Research , volume=

The vehicle routing problem: An overview of exact and approximate algorithms , author=. European Journal of Operational Research , volume=. 1992 , publisher=

work page 1992

[62] [62]

Handbook of Applied Optimization , volume=

Branch-and-cut algorithms for combinatorial optimization problems , author=. Handbook of Applied Optimization , volume=. 2002 , publisher=

work page 2002

[63] [63]

2014 , publisher=

Vehicle Routing: Problems, Methods, and Applications , author=. 2014 , publisher=

work page 2014

[64] [64]

A solution to the unit commitment problem--

Saravanan, Balasubramanian and Das, Siddharth and Sikri, Surbhi and Kothari, Dwarkadas Pralhaddas , journal=. A solution to the unit commitment problem--. 2013 , publisher=

work page 2013

[65] [65]

Fisher, Marshall L , journal=. The. 1981 , publisher=

work page 1981

[66] [66]

Proceedings of the AAAI Conference on Artificial Intelligence , volume=

Provably data-driven projection method for quadratic programming , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

work page

[67] [67]

2024 , publisher=

Van Hentenryck, Pascal and Dalmeijer, Kevin , journal=. 2024 , publisher=

work page 2024

[68] [68]

Learning for dynamics and control conference , pages=

End-to-end learning to warm-start for real-time quadratic optimization , author=. Learning for dynamics and control conference , pages=. 2023 , organization=

work page 2023

[69] [69]

Attention, Learn to Solve Routing Problems!

Attention, learn to solve routing problems! , author=. arXiv preprint arXiv:1803.08475 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[70] [70]

Journal of Machine Learning Research , volume=

Combinatorial optimization and reasoning with graph neural networks , author=. Journal of Machine Learning Research , volume=

work page

[71] [71]

Computational Combinatorial Optimization: Optimal or Provably Near-optimal Solutions , pages=

Lagrangian relaxation , author=. Computational Combinatorial Optimization: Optimal or Provably Near-optimal Solutions , pages=. 2001 , publisher=

work page 2001

[72] [72]

Advances in Neural Information Processing Systems , volume=

Sample complexity of tree search configuration: Cutting planes and beyond , author=. Advances in Neural Information Processing Systems , volume=

work page

[73] [73]

Structural analysis of branch-and-cut and the learnability of

Balcan, Maria-Florina F and Prasad, Siddharth and Sandholm, Tuomas and Vitercik, Ellen , journal=. Structural analysis of branch-and-cut and the learnability of

work page

[74] [74]

The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

Generalization Guarantees for Learning Score-Based Branch-and-Cut Policies in Integer Programming , author=. The Thirty-ninth Annual Conference on Neural Information Processing Systems , year=

work page

[75] [75]

Communications of the ACM , volume=

Algorithms with predictions , author=. Communications of the ACM , volume=. 2022 , publisher=

work page 2022

[76] [76]

Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages=

How much data is sufficient to learn high-performing algorithms? generalization guarantees for data-driven algorithm design , author=. Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing , pages=

work page

[77] [77]

2021 IEEE congress on evolutionary computation (CEC) , pages=

Hyperparameter optimization: Comparing genetic algorithm against grid search and bayesian optimization , author=. 2021 IEEE congress on evolutionary computation (CEC) , pages=. 2021 , organization=

work page 2021

[78] [78]

Advances in Neural Information Processing Systems , volume=

Learning predictions for algorithms with predictions , author=. Advances in Neural Information Processing Systems , volume=

work page

[79] [79]

Automatica , volume=

The explicit linear quadratic regulator for constrained systems , author=. Automatica , volume=. 2002 , publisher=

work page 2002

[80] [80]

On a modification of

Bernstein, Sergei , journal=. On a modification of

work page