ICNN-enhanced 2SP: Leveraging input convex neural networks for solving two-stage stochastic programming

Fabricio Oliveira; Jan Kronqvist; Yu Liu

arxiv: 2505.05261 · v3 · submitted 2025-05-08 · 🧮 math.OC · cs.LG

ICNN-enhanced 2SP: Leveraging input convex neural networks for solving two-stage stochastic programming

Yu Liu , Fabricio Oliveira , Jan Kronqvist This is my paper

Pith reviewed 2026-05-22 16:25 UTC · model grok-4.3

classification 🧮 math.OC cs.LG

keywords two-stage stochastic programminginput convex neural networksrecourse function approximationlinear programming formulationmixed-integer programmingconvex optimizationdecision making under uncertainty

0 comments

The pith

Input convex neural networks turn recourse approximations in two-stage stochastic programming into exact linear programs, eliminating integer variables.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how input convex neural networks can approximate the recourse function inside two-stage stochastic programs. Their built-in convexity lets the surrogate be represented exactly with linear constraints, so the full problem solves as a linear program. Earlier neural methods needed mixed-integer programming and ran slowly. This change cuts solution times substantially on larger instances while keeping or improving the quality of the first-stage decisions. Anyone solving planning problems with uncertain future costs or demands would see the practical difference in what sizes of problem become feasible.

Core claim

By using input convex neural networks instead of ordinary neural networks to model the recourse function, the two-stage stochastic program can be embedded exactly into a linear programming formulation. The architectural convexity removes the need for integer variables required by mixed-integer programming embeddings. On benchmark problems the approach delivers solution times up to 100 times faster than MIP baselines while producing first-stage solutions of comparable or better quality.

What carries the argument

Input Convex Neural Networks (ICNNs), whose architecture enforces convexity so that their output can be expressed exactly as a set of linear inequalities inside the overall optimization model.

If this is right

Solution times decrease sharply as the number of scenarios or decision variables grows.
Training time for the ICNN surrogate remains close to that of a standard neural network.
Validation accuracy of the ICNN matches that of ordinary networks on the tested problems.
The largest speedups appear on the most computationally demanding benchmark instances.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same convexity-enforcing idea could be tested on other convex stochastic or robust optimization models that currently rely on non-convex surrogates.
Combining the ICNN embedding with Benders decomposition or column generation might further scale the method to problems with very large uncertainty sets.
Real-time applications such as dynamic pricing or energy dispatch under demand uncertainty could become solvable at higher resolution.

Load-bearing premise

The recourse functions in the target problems are convex enough that an ICNN can be trained to approximate them with error low enough to preserve good first-stage decisions.

What would settle it

On a collection of instances whose true recourse functions are known to be convex, compare the first-stage objective values obtained from the ICNN-linear program against those from an exact MIP solver; a consistent gap larger than the reported validation error would falsify the performance claim.

Figures

Figures reproduced from arXiv: 2505.05261 by Fabricio Oliveira, Jan Kronqvist, Yu Liu.

**Figure 2.** Figure 2: Architecture of a fully connected ICNN. 2.2. ICNN architectures We consider a fully connected K-layer ICNN, as illustrated in [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗

read the original abstract

Two-stage stochastic programming (2SP) offers a basic framework for modelling decision-making under uncertainty, yet scalability remains a challenge due to the computational complexity of recourse function evaluation. Existing learning-based methods like Neural Two-Stage Stochastic Programming (Neur2SP) employ neural networks (NNs) as recourse function surrogates but rely on computationally intensive mixed-integer programming (MIP) formulations. We propose ICNN-enhanced 2SP, a method that leverages Input Convex Neural Networks (ICNNs) to exploit linear programming (LP) representability in convex 2SP problems. By architecturally enforcing convexity and enabling exact inference through LP, our approach eliminates the need for integer variables inherent to the conventional MIP-based formulation while retaining an exact embedding of the ICNN surrogate within the 2SP framework. This results in a more computationally efficient alternative, and we show that good solution quality can be maintained. Comprehensive experiments reveal that ICNNs incur only marginally longer training times while achieving validation accuracy on par with their standard NN counterparts. Across benchmark problems, ICNN-enhanced 2SP often exhibits considerably faster solution times than the MIP-based formulations while preserving solution quality, with these advantages becoming significantly more pronounced as problem scale increases. For the most challenging instances, the method achieves speedups of up to 100$\times$ and solution quality superior to MIP-based formulations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ICNNs let them embed neural recourse surrogates as LPs instead of MIPs, which cuts solve times sharply on the benchmarks while holding solution quality steady.

read the letter

The central move here is swapping standard neural nets for input-convex ones so the recourse surrogate becomes exactly representable as an LP. That removes the integer variables that Neur2SP-style methods carry, and the experiments show the payoff in wall-clock time, especially once the instances get larger. Training cost stays comparable to ordinary networks and validation error does not degrade noticeably, which is the practical win they highlight.

Referee Report

2 major / 2 minor

Summary. The paper proposes ICNN-enhanced 2SP, a method that replaces standard neural network surrogates for recourse functions in two-stage stochastic programming with Input Convex Neural Networks. By architecturally enforcing convexity, the ICNN surrogate can be exactly embedded as linear constraints in an LP formulation of the first-stage problem, eliminating the integer variables required by MIP-based embeddings used in prior work such as Neur2SP. Experiments on benchmark problems report that ICNN training accuracy is comparable to standard NNs, solution quality is preserved or improved, and solution times are substantially faster (up to 100x on large instances) as problem scale increases.

Significance. If the central performance claims are substantiated, the approach would offer a scalable alternative for convex 2SP by converting a hard MIP into a faster LP while retaining decision quality. The work correctly exploits known ICNN properties for convexity and LP representability, and the reported speedups on challenging instances suggest practical impact for large-scale stochastic optimization.

major comments (2)

[Experiments] Experimental section: the claim that 'good solution quality can be maintained' and is 'superior to MIP-based formulations' on the most challenging instances requires explicit post-hoc evaluation of the true expected recourse cost (using the exact solver on the first-stage decisions obtained from the ICNN-LP) together with statistical significance tests or confidence intervals across replications. Surrogate validation accuracy alone does not guarantee that approximation error in regions visited by the optimizer leaves first-stage optimality intact.
[Method] Formulation section: the method assumes the recourse functions are sufficiently convex for an ICNN to achieve low approximation error in the relevant domain; the paper should state this assumption explicitly, provide a diagnostic (e.g., convexity check or residual analysis on validation points near optimal first-stage solutions), and discuss the risk that violation of the assumption could degrade true objective values even when surrogate metrics appear acceptable.

minor comments (2)

[Abstract] Abstract and experiments: replace the qualitative phrase 'on par with' by reporting concrete metrics (e.g., mean squared error or relative error on the validation set) and reference the corresponding table or figure.
[Formulation] Notation: define the precise mapping from ICNN weights/biases to the linear inequalities that embed the surrogate inside the first-stage LP; a small illustrative example would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment below and indicate the revisions we will incorporate to strengthen the manuscript.

read point-by-point responses

Referee: [Experiments] Experimental section: the claim that 'good solution quality can be maintained' and is 'superior to MIP-based formulations' on the most challenging instances requires explicit post-hoc evaluation of the true expected recourse cost (using the exact solver on the first-stage decisions obtained from the ICNN-LP) together with statistical significance tests or confidence intervals across replications. Surrogate validation accuracy alone does not guarantee that approximation error in regions visited by the optimizer leaves first-stage optimality intact.

Authors: We agree that surrogate validation accuracy alone is insufficient to fully substantiate claims of maintained or superior solution quality. In the revised manuscript we will add explicit post-hoc evaluations of the true expected recourse cost by applying the exact solver to the first-stage decisions obtained from the ICNN-LP formulation. We will also report statistical significance tests or confidence intervals computed across multiple independent replications. These additions will directly address whether approximation error in optimizer-visited regions preserves first-stage optimality and will allow a more rigorous comparison with MIP-based baselines. revision: yes
Referee: [Method] Formulation section: the method assumes the recourse functions are sufficiently convex for an ICNN to achieve low approximation error in the relevant domain; the paper should state this assumption explicitly, provide a diagnostic (e.g., convexity check or residual analysis on validation points near optimal first-stage solutions), and discuss the risk that violation of the assumption could degrade true objective values even when surrogate metrics appear acceptable.

Authors: We acknowledge the value of making the convexity assumption fully explicit. Although the manuscript already frames the method for convex 2SP problems, we will add a dedicated statement of the assumption in the formulation section. We will further include a diagnostic (residual analysis on validation points near candidate optimal first-stage solutions) and a discussion of the risks if the assumption is violated, including potential degradation of true objective values. These changes will clarify the scope and limitations of the approach. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper proposes embedding ICNN surrogates into the first-stage LP of 2SP problems by exploiting the known convexity-preserving architecture of ICNNs, which permits an exact LP representation without integer variables. This step rests on standard properties of ICNNs established in prior independent literature rather than any self-definition, fitted parameter renamed as prediction, or self-citation chain that reduces the claimed speedup or solution quality to the inputs by construction. Empirical speedups and maintained solution quality are reported from benchmark experiments and are not asserted tautologically; the derivation therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that recourse functions are convex and on the standard training procedure for ICNNs; no new entities are postulated and the only free parameters are the usual neural-network weights and hyperparameters.

free parameters (1)

ICNN weights and biases
Standard trainable parameters of the neural network surrogate, fitted to recourse data during training.

axioms (1)

domain assumption Recourse function is convex in the first-stage decision for the problems considered
Invoked to justify that an ICNN can serve as an exact convex surrogate embeddable in LP.

pith-pipeline@v0.9.0 · 5779 in / 1284 out tokens · 88441 ms · 2026-05-22T16:25:25.451568+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

By architecturally enforcing convexity and enabling exact inference through LP, our approach eliminates the need for integer variables... ICNNs are architecturally constrained to ensure convexity with respect to their inputs, enabling exact inference through LP formulations
IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean costAlphaLog_high_calibrated_iff unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Q(x, ξs) is the pointwise maximum of affine functions... Thus Q(x) is convex in x

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Relaxation-Informed Training of Neural Network Surrogate Models
math.OC 2026-04 conditional novelty 7.0

Regularizers that penalize big-M constants, unstable neurons, and per-sample LP relaxation gaps during neural network training reduce MILP solve times by up to four orders of magnitude while preserving surrogate accuracy.
SOC-ICNN: From Polyhedral to Conic Geometry for Learning Convex Surrogate Functions
cs.LG 2026-04 unverdicted novelty 7.0

SOC-ICNN generalizes ReLU-based ICNNs to SOCP, strictly expanding the class of representable convex functions while preserving similar forward-pass complexity.

Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · cited by 2 Pith papers

[1]

Römisch, S

W. Römisch, S. Vigerske, Recent Progress in Two-stage Mixed-integer Stochastic Programming with Applications to Power Production Planning, in: P. M. Pardalos, S. Rebennack, M. V. F. 23 Pereira, N. A. Iliadis (Eds.), Handbook of Power Systems I, Springer, Berlin, Heidelberg, 2010, pp. 177–208. doi:10.1007/978-3-642-02493-1_8

work page doi:10.1007/978-3-642-02493-1_8 2010
[2]

C. S. Khor, A. Elkamel, K. Ponnambalam, P. L. Douglas, Two-stage stochastic programming with fixed recourse via scenario planning with economic and operational risk management for petroleum refinery planning under uncertainty, Chemical Engineering and Processing: Process Intensification 47 (2008) 1744–1764. doi:10.1016/j.cep.2007.09.016

work page doi:10.1016/j.cep.2007.09.016 2008
[3]

doi:10.1007/BF01580086

A.Shapiro, T.Homem-deMello, Asimulation-basedapproachtotwo-stagestochasticprogram- ming with recourse, Mathematical Programming 81 (1998) 301–325. doi:10.1007/BF01580086

work page doi:10.1007/bf01580086 1998
[4]

Heitsch, W

H. Heitsch, W. Römisch, Scenario Reduction Algorithms in Stochastic Programming, Com- putational Optimization and Applications 24 (2003) 187–206. doi:10.1023/A:1021805924152

work page doi:10.1023/a:1021805924152 2003
[5]

H. D. Sherali, B. M. Fraticelli, A modification of Benders’ decomposition algorithm for discrete subproblems: An approach for stochastic programs with integer recourse, Journal of Global Optimization 22 (2002) 319–342. doi:10.1023/A:1013827731218

work page doi:10.1023/a:1013827731218 2002
[6]

C. C. Carøe, R. Schultz, Dual decomposition in stochastic integer programming, Operations Research Letters 24 (1999) 37–45. doi:10.1016/S0167-6377(98)00050-9

work page doi:10.1016/s0167-6377(98)00050-9 1999
[7]

Küçükyavuz, S

S. Küçükyavuz, S. Sen, An Introduction to Two-Stage Stochastic Mixed-Integer Programming, in: Leading Developments from INFORMS Communities, INFORMS TutORials in Operations Research, INFORMS, 2017, pp. 1–27. doi:10.1287/educ.2017.0171

work page doi:10.1287/educ.2017.0171 2017
[8]

Varga, E

J. Varga, E. Karlsson, G. R. Raidl, E. Rönnberg, F. Lindsten, T. Rodemann, Speeding Up Logic-Based Benders Decomposition by Strengthening Cuts with Graph Neural Networks, in: G. Nicosia, V. Ojha, E. La Malfa, G. La Malfa, P. M. Pardalos, R. Umeton (Eds.), Machine Learning, Optimization, and Data Science, Springer Nature Switzerland, Cham, 2024, pp. 24–38....

work page doi:10.1007/978-3-031-53969-5_3 2024
[9]

H. Dai, Y. Xue, Z. Syed, D. Schuurmans, B. Dai, Neural Stochastic Dual Dynamic Program- ming, 2021. doi:10.48550/arXiv.2112.00874

work page doi:10.48550/arxiv.2112.00874 2021
[10]

H. Bae, J. Lee, W. C. Kim, Y. Lee, Deep Value Function Networks for Large-Scale Multi- stage Stochastic Programs, in: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR, 2023, pp. 11267–11287. URL:https://proceedings.mlr. press/v206/bae23a.html

work page 2023
[11]

Larsen, E

E. Larsen, E. Frejinger, B. Gendron, A. Lodi, Fast Continuous and Integer L-Shaped Heuristics Through Supervised Learning, INFORMS Journal on Computing 36 (2024) 203–223. doi:10. 1287/ijoc.2022.0175. 24

work page arXiv 2024
[12]

Anh-Nguyen, J

T. Anh-Nguyen, J. Huchette, C. Tjandraatmadja, Learning Generalized Linear Programming Value Functions, in: A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tom- czak, C. Zhang (Eds.), Advances in Neural Information Processing Systems, volume 37, Cur- ran Associates, Inc., 2024, pp. 135437–135463. URL:https://dl.acm.org/doi/abs/10.5555/ 3737916.3742221

work page arXiv 2024
[13]

Bengio, E

Y. Bengio, E. Frejinger, A. Lodi, R. Patel, S. Sankaranarayanan, A Learning-Based Al- gorithm to Quickly Compute Good Primal Solutions for Stochastic Integer Programs, in: E. Hebrard, N. Musliu (Eds.), Integration of Constraint Programming, Artificial Intelli- gence, and Operations Research, Springer International Publishing, Cham, 2020, pp. 99–111. doi:1...

work page doi:10.1007/978-3-030-58942-4_7 2020
[14]

Y. Wu, W. Song, Z. Cao, J. Zhang, Learning Scenario Representation for Solving Two-stage Stochastic Integer Programs, in: Proceedings of the 10th International Conference on Learning Representations, 2021. URL:https://openreview.net/forum?id=06Wy2BtxXrz

work page 2021
[15]

Y. Wu, Y. Zhang, Z. Liang, J. Cheng, HGCN2SP: Hierarchical Graph Convolutional Network for Two-Stage Stochastic Programming, in: Proceedings of the 41st International Conference on Machine Learning, 2024, pp. 53999 – 54014. doi:10.5555/3692070.3694285

work page doi:10.5555/3692070.3694285 2024
[16]

C. D. Hubbs, H. D. Perez, O. Sarwar, N. V. Sahinidis, I. E. Grossmann, J. M. Wassick, OR- Gym: A Reinforcement Learning Library for Operations Research Problems, 2020. doi:10. 48550/arXiv.2008.06319

work page arXiv 2020
[17]

Yilmaz, İ

D. Yilmaz, İ. E. Büyüktahtakın, A deep reinforcement learning framework for solving two-stage stochastic programs, Optimization Letters 18 (2024) 1993–2020. doi:10.1007/ s11590-023-02009-5

work page 2024
[18]

Stranieri, E

F. Stranieri, E. Fadda, F. Stella, Combining deep reinforcement learning and multi-stage stochastic programming to address the supply chain inventory management problem, Interna- tionalJournalofProductionEconomics268(2024)109099.doi:10.1016/j.ijpe.2023.109099

work page doi:10.1016/j.ijpe.2023.109099 2024
[19]

Dumouchelle, R

J. Dumouchelle, R. Patel, E. B. Khalil, M. Bodur, Neur2SP: Neural Two-Stage Stochastic Programming, 2022. doi:10.48550/arXiv.2205.12006

work page doi:10.48550/arxiv.2205.12006 2022
[20]

Fischetti, J

M. Fischetti, J. Jo, Deep neural networks and mixed integer linear optimization, Constraints 23 (2018) 296–309. doi:10.1007/s10601-018-9285-6

work page doi:10.1007/s10601-018-9285-6 2018
[21]

Kronqvist, B

J. Kronqvist, B. Li, J. Rolfes, S. Zhao, Alternating Mixed-Integer Programming and Neu- ral Network Training for Approximating Stochastic Two-Stage Problems, in: G. Nicosia, V. Ojha, E. La Malfa, G. La Malfa, P. M. Pardalos, R. Umeton (Eds.), Machine Learn- ing, Optimization, and Data Science, Springer Nature Switzerland, Cham, 2024, pp. 124–139. doi:10.1...

work page doi:10.1007/978-3-031-53966-4_10 2024
[22]

Y. Liu, F. Oliveira, Simulator-based surrogate optimisation employing adaptive uncertainty- aware sampling, Computers & Chemical Engineering 201 (2025) 109243. doi:10.1016/j. compchemeng.2025.109243

work page doi:10.1016/j 2025
[23]

B. Amos, L. Xu, J. Z. Kolter, Input Convex Neural Networks, in: Proceedings of the 34th International Conference on Machine Learning, PMLR, 2017, pp. 146–155. URL:https:// proceedings.mlr.press/v70/amos17b.html

work page 2017
[24]

Duchesne, Q

L. Duchesne, Q. Louveaux, L. Wehenkel, Supervised learning of convex piecewise linear ap- proximations of optimization problems, in: Proceedings of the 29th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2021, pp. 493–498. doi:10.14428/esann/2021.ES2021-74

work page doi:10.14428/esann/2021.es2021-74 2021
[25]

Agrawal, R

S. Agrawal, R. Jia, Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management, in: Proceedings of the 2019 ACM Conference on Economics and Computation, EC ’19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 743–744. doi:10.1145/3328526.3329565

work page doi:10.1145/3328526.3329565 2019
[26]

Zhong, X

H. Zhong, X. Yan, Z. Tan, Real-Time Distributed Economic Dispatch Adapted to General Convex Cost Functions: A Secant Approximation-Based Method, IEEE Transactions on Smart Grid 12 (2021) 2089–2101. doi:10.1109/TSG.2020.3049054

work page doi:10.1109/tsg.2020.3049054 2021
[27]

Luxenberg, S

E. Luxenberg, S. Boyd, Portfolio construction with Gaussian mixture returns and exponential utility via convex optimization, Optimization and Engineering 25 (2024) 555–574. doi:10. 1007/s11081-023-09814-y

work page 2024
[28]

Y. Chen, Y. Shi, B. Zhang, Data-Driven Optimal Voltage Regulation Using Input Convex Neural Networks, Electric Power Systems Research 189 (2020) 106741. doi:10.1016/j.epsr. 2020.106741

work page doi:10.1016/j.epsr 2020
[29]

Rosemberg, M

A. Rosemberg, M. Tanneau, B. Fanzeres, J. Garcia, P. Van Hentenryck, Learning Optimal Power Flow value functions with input-convex neural networks, Electric Power Systems Re- search 235 (2024) 110643. doi:10.1016/j.epsr.2024.110643

work page doi:10.1016/j.epsr.2024.110643 2024
[30]

S. Yang, B. W. Bequette, Optimization-based control using input convex neural networks, Computers & Chemical Engineering 144 (2021) 107143. doi:10.1016/j.compchemeng.2020. 107143

work page doi:10.1016/j.compchemeng.2020 2021
[31]

Bünning, A

F. Bünning, A. Schalbetter, A. Aboudonia, M. H. d. Badyn, P. Heer, J. Lygeros, Input Convex Neural Networks for Building MPC, in: Proceedings of the 3rd Conference on Learning for Dynamics and Control, PMLR, 2021, pp. 251–262. URL:https://proceedings.mlr.press/ v144/bunning21a.html. 26

work page 2021
[32]

Jiang, C

K. Jiang, C. Hu, F. Yan, Path-following control of autonomous ground vehicles based on input convex neural networks, Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering 236 (2022) 2806–2816. doi:10.1177/09544070221114690

work page doi:10.1177/09544070221114690 2022
[33]

W. Wang, H. Zhang, Y. Wang, Y. Tian, Z. Wu, Fast Explicit Machine Learning-Based Model Predictive Control of Nonlinear Processes Using Input Convex Neural Networks, Industrial & Engineering Chemistry Research 63 (2024) 17279–17293. doi:10.1021/acs.iecr.4c02257

work page doi:10.1021/acs.iecr.4c02257 2024
[34]

S. P. Boyd, L. Vandenberghe, Convex optimization, Cambridge university press, 2004. doi:10. 1017/CBO9780511804441

work page 2004
[35]

J. R. Birge, F. Louveaux, Introduction to Stochastic Programming, Springer Series in Op- erations Research and Financial Engineering, Springer, New York, NY, 2011. doi:10.1007/ 978-1-4614-0237-4

work page 2011
[36]

H. J. M. Peters, P. P. Wakker, Convex functions on non-convex domains, Economics Letters 22 (1986) 251–255. doi:10.1016/0165-1765(86)90242-9

work page doi:10.1016/0165-1765(86)90242-9 1986
[37]

Hassanzadeh, T

A. Hassanzadeh, T. K. Ralphs, On the Value Function of a Mixed Integer Linear Optimization Problem and an Algorithm for its Construction, 2014. URL:https://optimization-online. org/?p=12986

work page 2014
[38]

Patel, J

R. Patel, J. Dumouchelle, Khalil-research/Neur2SP, 2022. URL:https://github.com/ khalil-research/Neur2SP

work page 2022
[39]

URL:https://www

Gurobi Optimization, LLC, Gurobi Optimizer Reference Manual, 2024. URL:https://www. gurobi.com

work page 2024
[40]

Paszke, S

A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Te- jani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, S. Chintala, PyTorch: An imperative style, high-performance deep learning library, in: H. Wallach, H. Larochelle, A. Beygelzimer, F. ...

work page arXiv 2019
[41]

Pedregosa, G

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, Édouard Duchesnay, Scikit-learn: Machine learning in python, Journal of Machine Learning Research 12 (2011) 2825–2830. URL:http://jmlr.org/papers/v12/pedregosa11a. html. 27

work page 2011
[42]

Cornuejols, R

G. Cornuejols, R. Sridharan, J. M. Thizy, A comparison of heuristics and relaxations for the capacitated plant location problem, European Journal of Operational Research 50 (1991) 280–297. doi:10.1016/0377-2217(91)90261-S

work page doi:10.1016/0377-2217(91)90261-s 1991
[43]

Ntaimo, S

L. Ntaimo, S. Sen, The Million-Variable “March” for Stochastic Combinatorial Optimization, Journal of Global Optimization 32 (2005) 385–400. doi:10.1007/s10898-004-5910-6

work page doi:10.1007/s10898-004-5910-6 2005
[44]

Schultz, L

R. Schultz, L. Stougie, M. H. van der Vlerk, Solving stochastic programs with integer recourse by enumeration: A framework using Gröbner basis, Mathematical Programming 83 (1998) 229–252. doi:10.1007/BF02680560

work page doi:10.1007/bf02680560 1998
[45]

Kronqvist, R

J. Kronqvist, R. Misener, C. Tsay, P-split formulations: a class of intermediate formulations between big-M and convex hull for disjunctive constraints, Mathematical Programming (2025). doi:10.1007/s10107-025-02232-1

work page doi:10.1007/s10107-025-02232-1 2025
[46]

Ahmed, R

S. Ahmed, R. Garcia, N. Kong, L. Ntaimo, G. Perboli, F. Qiu, S. Sen, SIPLIB: A Stochastic Integer Programming Test Problem Library, 2015. URL:https://www2.isye.gatech.edu/ ~sahmed/siplib/

work page 2015
[47]

URL:https://docs.gurobi.com/projects/optimizer/en/current/reference/ parameters.html

Gurobi Optimization, LLC, Parameter Reference - Gurobi Optimizer Reference Man- ual, 2025. URL:https://docs.gurobi.com/projects/optimizer/en/current/reference/ parameters.html. 28

work page 2025

[1] [1]

Römisch, S

W. Römisch, S. Vigerske, Recent Progress in Two-stage Mixed-integer Stochastic Programming with Applications to Power Production Planning, in: P. M. Pardalos, S. Rebennack, M. V. F. 23 Pereira, N. A. Iliadis (Eds.), Handbook of Power Systems I, Springer, Berlin, Heidelberg, 2010, pp. 177–208. doi:10.1007/978-3-642-02493-1_8

work page doi:10.1007/978-3-642-02493-1_8 2010

[2] [2]

C. S. Khor, A. Elkamel, K. Ponnambalam, P. L. Douglas, Two-stage stochastic programming with fixed recourse via scenario planning with economic and operational risk management for petroleum refinery planning under uncertainty, Chemical Engineering and Processing: Process Intensification 47 (2008) 1744–1764. doi:10.1016/j.cep.2007.09.016

work page doi:10.1016/j.cep.2007.09.016 2008

[3] [3]

doi:10.1007/BF01580086

A.Shapiro, T.Homem-deMello, Asimulation-basedapproachtotwo-stagestochasticprogram- ming with recourse, Mathematical Programming 81 (1998) 301–325. doi:10.1007/BF01580086

work page doi:10.1007/bf01580086 1998

[4] [4]

Heitsch, W

H. Heitsch, W. Römisch, Scenario Reduction Algorithms in Stochastic Programming, Com- putational Optimization and Applications 24 (2003) 187–206. doi:10.1023/A:1021805924152

work page doi:10.1023/a:1021805924152 2003

[5] [5]

H. D. Sherali, B. M. Fraticelli, A modification of Benders’ decomposition algorithm for discrete subproblems: An approach for stochastic programs with integer recourse, Journal of Global Optimization 22 (2002) 319–342. doi:10.1023/A:1013827731218

work page doi:10.1023/a:1013827731218 2002

[6] [6]

C. C. Carøe, R. Schultz, Dual decomposition in stochastic integer programming, Operations Research Letters 24 (1999) 37–45. doi:10.1016/S0167-6377(98)00050-9

work page doi:10.1016/s0167-6377(98)00050-9 1999

[7] [7]

Küçükyavuz, S

S. Küçükyavuz, S. Sen, An Introduction to Two-Stage Stochastic Mixed-Integer Programming, in: Leading Developments from INFORMS Communities, INFORMS TutORials in Operations Research, INFORMS, 2017, pp. 1–27. doi:10.1287/educ.2017.0171

work page doi:10.1287/educ.2017.0171 2017

[8] [8]

Varga, E

J. Varga, E. Karlsson, G. R. Raidl, E. Rönnberg, F. Lindsten, T. Rodemann, Speeding Up Logic-Based Benders Decomposition by Strengthening Cuts with Graph Neural Networks, in: G. Nicosia, V. Ojha, E. La Malfa, G. La Malfa, P. M. Pardalos, R. Umeton (Eds.), Machine Learning, Optimization, and Data Science, Springer Nature Switzerland, Cham, 2024, pp. 24–38....

work page doi:10.1007/978-3-031-53969-5_3 2024

[9] [9]

H. Dai, Y. Xue, Z. Syed, D. Schuurmans, B. Dai, Neural Stochastic Dual Dynamic Program- ming, 2021. doi:10.48550/arXiv.2112.00874

work page doi:10.48550/arxiv.2112.00874 2021

[10] [10]

H. Bae, J. Lee, W. C. Kim, Y. Lee, Deep Value Function Networks for Large-Scale Multi- stage Stochastic Programs, in: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR, 2023, pp. 11267–11287. URL:https://proceedings.mlr. press/v206/bae23a.html

work page 2023

[11] [11]

Larsen, E

E. Larsen, E. Frejinger, B. Gendron, A. Lodi, Fast Continuous and Integer L-Shaped Heuristics Through Supervised Learning, INFORMS Journal on Computing 36 (2024) 203–223. doi:10. 1287/ijoc.2022.0175. 24

work page arXiv 2024

[12] [12]

Anh-Nguyen, J

T. Anh-Nguyen, J. Huchette, C. Tjandraatmadja, Learning Generalized Linear Programming Value Functions, in: A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tom- czak, C. Zhang (Eds.), Advances in Neural Information Processing Systems, volume 37, Cur- ran Associates, Inc., 2024, pp. 135437–135463. URL:https://dl.acm.org/doi/abs/10.5555/ 3737916.3742221

work page arXiv 2024

[13] [13]

Bengio, E

Y. Bengio, E. Frejinger, A. Lodi, R. Patel, S. Sankaranarayanan, A Learning-Based Al- gorithm to Quickly Compute Good Primal Solutions for Stochastic Integer Programs, in: E. Hebrard, N. Musliu (Eds.), Integration of Constraint Programming, Artificial Intelli- gence, and Operations Research, Springer International Publishing, Cham, 2020, pp. 99–111. doi:1...

work page doi:10.1007/978-3-030-58942-4_7 2020

[14] [14]

Y. Wu, W. Song, Z. Cao, J. Zhang, Learning Scenario Representation for Solving Two-stage Stochastic Integer Programs, in: Proceedings of the 10th International Conference on Learning Representations, 2021. URL:https://openreview.net/forum?id=06Wy2BtxXrz

work page 2021

[15] [15]

Y. Wu, Y. Zhang, Z. Liang, J. Cheng, HGCN2SP: Hierarchical Graph Convolutional Network for Two-Stage Stochastic Programming, in: Proceedings of the 41st International Conference on Machine Learning, 2024, pp. 53999 – 54014. doi:10.5555/3692070.3694285

work page doi:10.5555/3692070.3694285 2024

[16] [16]

C. D. Hubbs, H. D. Perez, O. Sarwar, N. V. Sahinidis, I. E. Grossmann, J. M. Wassick, OR- Gym: A Reinforcement Learning Library for Operations Research Problems, 2020. doi:10. 48550/arXiv.2008.06319

work page arXiv 2020

[17] [17]

Yilmaz, İ

D. Yilmaz, İ. E. Büyüktahtakın, A deep reinforcement learning framework for solving two-stage stochastic programs, Optimization Letters 18 (2024) 1993–2020. doi:10.1007/ s11590-023-02009-5

work page 2024

[18] [18]

Stranieri, E

F. Stranieri, E. Fadda, F. Stella, Combining deep reinforcement learning and multi-stage stochastic programming to address the supply chain inventory management problem, Interna- tionalJournalofProductionEconomics268(2024)109099.doi:10.1016/j.ijpe.2023.109099

work page doi:10.1016/j.ijpe.2023.109099 2024

[19] [19]

Dumouchelle, R

J. Dumouchelle, R. Patel, E. B. Khalil, M. Bodur, Neur2SP: Neural Two-Stage Stochastic Programming, 2022. doi:10.48550/arXiv.2205.12006

work page doi:10.48550/arxiv.2205.12006 2022

[20] [20]

Fischetti, J

M. Fischetti, J. Jo, Deep neural networks and mixed integer linear optimization, Constraints 23 (2018) 296–309. doi:10.1007/s10601-018-9285-6

work page doi:10.1007/s10601-018-9285-6 2018

[21] [21]

Kronqvist, B

J. Kronqvist, B. Li, J. Rolfes, S. Zhao, Alternating Mixed-Integer Programming and Neu- ral Network Training for Approximating Stochastic Two-Stage Problems, in: G. Nicosia, V. Ojha, E. La Malfa, G. La Malfa, P. M. Pardalos, R. Umeton (Eds.), Machine Learn- ing, Optimization, and Data Science, Springer Nature Switzerland, Cham, 2024, pp. 124–139. doi:10.1...

work page doi:10.1007/978-3-031-53966-4_10 2024

[22] [22]

Y. Liu, F. Oliveira, Simulator-based surrogate optimisation employing adaptive uncertainty- aware sampling, Computers & Chemical Engineering 201 (2025) 109243. doi:10.1016/j. compchemeng.2025.109243

work page doi:10.1016/j 2025

[23] [23]

B. Amos, L. Xu, J. Z. Kolter, Input Convex Neural Networks, in: Proceedings of the 34th International Conference on Machine Learning, PMLR, 2017, pp. 146–155. URL:https:// proceedings.mlr.press/v70/amos17b.html

work page 2017

[24] [24]

Duchesne, Q

L. Duchesne, Q. Louveaux, L. Wehenkel, Supervised learning of convex piecewise linear ap- proximations of optimization problems, in: Proceedings of the 29th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2021, pp. 493–498. doi:10.14428/esann/2021.ES2021-74

work page doi:10.14428/esann/2021.es2021-74 2021

[25] [25]

Agrawal, R

S. Agrawal, R. Jia, Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management, in: Proceedings of the 2019 ACM Conference on Economics and Computation, EC ’19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 743–744. doi:10.1145/3328526.3329565

work page doi:10.1145/3328526.3329565 2019

[26] [26]

Zhong, X

H. Zhong, X. Yan, Z. Tan, Real-Time Distributed Economic Dispatch Adapted to General Convex Cost Functions: A Secant Approximation-Based Method, IEEE Transactions on Smart Grid 12 (2021) 2089–2101. doi:10.1109/TSG.2020.3049054

work page doi:10.1109/tsg.2020.3049054 2021

[27] [27]

Luxenberg, S

E. Luxenberg, S. Boyd, Portfolio construction with Gaussian mixture returns and exponential utility via convex optimization, Optimization and Engineering 25 (2024) 555–574. doi:10. 1007/s11081-023-09814-y

work page 2024

[28] [28]

Y. Chen, Y. Shi, B. Zhang, Data-Driven Optimal Voltage Regulation Using Input Convex Neural Networks, Electric Power Systems Research 189 (2020) 106741. doi:10.1016/j.epsr. 2020.106741

work page doi:10.1016/j.epsr 2020

[29] [29]

Rosemberg, M

A. Rosemberg, M. Tanneau, B. Fanzeres, J. Garcia, P. Van Hentenryck, Learning Optimal Power Flow value functions with input-convex neural networks, Electric Power Systems Re- search 235 (2024) 110643. doi:10.1016/j.epsr.2024.110643

work page doi:10.1016/j.epsr.2024.110643 2024

[30] [30]

S. Yang, B. W. Bequette, Optimization-based control using input convex neural networks, Computers & Chemical Engineering 144 (2021) 107143. doi:10.1016/j.compchemeng.2020. 107143

work page doi:10.1016/j.compchemeng.2020 2021

[31] [31]

Bünning, A

F. Bünning, A. Schalbetter, A. Aboudonia, M. H. d. Badyn, P. Heer, J. Lygeros, Input Convex Neural Networks for Building MPC, in: Proceedings of the 3rd Conference on Learning for Dynamics and Control, PMLR, 2021, pp. 251–262. URL:https://proceedings.mlr.press/ v144/bunning21a.html. 26

work page 2021

[32] [32]

Jiang, C

K. Jiang, C. Hu, F. Yan, Path-following control of autonomous ground vehicles based on input convex neural networks, Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering 236 (2022) 2806–2816. doi:10.1177/09544070221114690

work page doi:10.1177/09544070221114690 2022

[33] [33]

W. Wang, H. Zhang, Y. Wang, Y. Tian, Z. Wu, Fast Explicit Machine Learning-Based Model Predictive Control of Nonlinear Processes Using Input Convex Neural Networks, Industrial & Engineering Chemistry Research 63 (2024) 17279–17293. doi:10.1021/acs.iecr.4c02257

work page doi:10.1021/acs.iecr.4c02257 2024

[34] [34]

S. P. Boyd, L. Vandenberghe, Convex optimization, Cambridge university press, 2004. doi:10. 1017/CBO9780511804441

work page 2004

[35] [35]

J. R. Birge, F. Louveaux, Introduction to Stochastic Programming, Springer Series in Op- erations Research and Financial Engineering, Springer, New York, NY, 2011. doi:10.1007/ 978-1-4614-0237-4

work page 2011

[36] [36]

H. J. M. Peters, P. P. Wakker, Convex functions on non-convex domains, Economics Letters 22 (1986) 251–255. doi:10.1016/0165-1765(86)90242-9

work page doi:10.1016/0165-1765(86)90242-9 1986

[37] [37]

Hassanzadeh, T

A. Hassanzadeh, T. K. Ralphs, On the Value Function of a Mixed Integer Linear Optimization Problem and an Algorithm for its Construction, 2014. URL:https://optimization-online. org/?p=12986

work page 2014

[38] [38]

Patel, J

R. Patel, J. Dumouchelle, Khalil-research/Neur2SP, 2022. URL:https://github.com/ khalil-research/Neur2SP

work page 2022

[39] [39]

URL:https://www

Gurobi Optimization, LLC, Gurobi Optimizer Reference Manual, 2024. URL:https://www. gurobi.com

work page 2024

[40] [40]

Paszke, S

A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Te- jani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, S. Chintala, PyTorch: An imperative style, high-performance deep learning library, in: H. Wallach, H. Larochelle, A. Beygelzimer, F. ...

work page arXiv 2019

[41] [41]

Pedregosa, G

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, Édouard Duchesnay, Scikit-learn: Machine learning in python, Journal of Machine Learning Research 12 (2011) 2825–2830. URL:http://jmlr.org/papers/v12/pedregosa11a. html. 27

work page 2011

[42] [42]

Cornuejols, R

G. Cornuejols, R. Sridharan, J. M. Thizy, A comparison of heuristics and relaxations for the capacitated plant location problem, European Journal of Operational Research 50 (1991) 280–297. doi:10.1016/0377-2217(91)90261-S

work page doi:10.1016/0377-2217(91)90261-s 1991

[43] [43]

Ntaimo, S

L. Ntaimo, S. Sen, The Million-Variable “March” for Stochastic Combinatorial Optimization, Journal of Global Optimization 32 (2005) 385–400. doi:10.1007/s10898-004-5910-6

work page doi:10.1007/s10898-004-5910-6 2005

[44] [44]

Schultz, L

R. Schultz, L. Stougie, M. H. van der Vlerk, Solving stochastic programs with integer recourse by enumeration: A framework using Gröbner basis, Mathematical Programming 83 (1998) 229–252. doi:10.1007/BF02680560

work page doi:10.1007/bf02680560 1998

[45] [45]

Kronqvist, R

J. Kronqvist, R. Misener, C. Tsay, P-split formulations: a class of intermediate formulations between big-M and convex hull for disjunctive constraints, Mathematical Programming (2025). doi:10.1007/s10107-025-02232-1

work page doi:10.1007/s10107-025-02232-1 2025

[46] [46]

Ahmed, R

S. Ahmed, R. Garcia, N. Kong, L. Ntaimo, G. Perboli, F. Qiu, S. Sen, SIPLIB: A Stochastic Integer Programming Test Problem Library, 2015. URL:https://www2.isye.gatech.edu/ ~sahmed/siplib/

work page 2015

[47] [47]

URL:https://docs.gurobi.com/projects/optimizer/en/current/reference/ parameters.html

Gurobi Optimization, LLC, Parameter Reference - Gurobi Optimizer Reference Man- ual, 2025. URL:https://docs.gurobi.com/projects/optimizer/en/current/reference/ parameters.html. 28

work page 2025