arXiv: 2604.11507 · v1 · submitted 2026-04-13 · 🧮 math.OC · cs.AI · cs.LG · cs.SY · eess.SY · stat.ML


Deep Learning for Sequential Decision Making under Uncertainty: Foundations, Frameworks, and Frontiers

I. Esra Buyuktahtakin


Pith reviewed 2026-05-10 15:36 UTC · model grok-4.3


keywords deep learning · sequential decision making · optimization under uncertainty · operations research · reinforcement learning · hybrid models · neural architectures · decision support


The pith

Deep learning complements optimization for sequential decisions under uncertainty rather than replacing it.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that deep learning and operations research methods work best when combined for problems involving sequential choices under uncertainty. Deep learning supplies flexible approximations that scale with data, while optimization supplies the constraints, recourse options, and uncertainty modeling that keep solutions feasible and reliable. It reviews core decision foundations, links them to architectures such as LSTMs, transformers, and deep reinforcement learning, and surveys practical ways to fuse the two. A reader would care because the approach points toward AI systems that can plan and act in changing environments instead of only forecasting. This framing also positions operations research as essential for building the next generation of decision-capable systems.

Core claim

Deep learning is valuable not as a replacement for optimization, but as a complement to it. Deep learning brings adaptability and scalable approximation, whereas OR/MS provides the structural rigor needed to represent constraints, recourse, and uncertainty. The tutorial reviews key decision-making foundations, connects them to major neural architectures, and discusses leading approaches to integrating learning and optimization, while highlighting applications in supply chains, healthcare, agriculture, energy, and autonomous operations as part of a shift from predictive to decision-capable AI.

What carries the argument

Hybrid integration approaches that pair neural architectures for approximation with optimization models that enforce constraints and model uncertainty in sequential decisions.
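As a rough illustration of this pairing (not a method taken from the paper), the sketch below lets a stand-in "learned" scorer rank items for a small knapsack instance and then uses a simple feasibility step to enforce the capacity constraint. The function names `predict_scores` and `repair_to_feasible`, the value-density scoring rule, and the instance data are all hypothetical; a real hybrid system would use a trained network and a proper optimization model in place of the greedy step.

```python
from typing import List


def predict_scores(values: List[float], weights: List[float]) -> List[float]:
    """Hypothetical stand-in for a learned model: score items by value density."""
    return [v / w for v, w in zip(values, weights)]


def repair_to_feasible(scores: List[float], weights: List[float],
                       capacity: float) -> List[int]:
    """Constraint-enforcing step: accept items in score order only while
    the capacity constraint stays satisfied (a greedy stand-in for a solver)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0.0
    for i in order:
        if used + weights[i] <= capacity:
            chosen.append(i)
            used += weights[i]
    return sorted(chosen)


# Toy instance (assumed data, for illustration only).
values = [10.0, 7.0, 4.0, 3.0]
weights = [5.0, 4.0, 3.0, 1.0]
capacity = 8.0

scores = predict_scores(values, weights)
selection = repair_to_feasible(scores, weights, capacity)
total_weight = sum(weights[i] for i in selection)
print(selection, total_weight)
```

The design point is the division of labor: the scorer may be arbitrarily inaccurate, yet the returned selection never violates the capacity constraint because feasibility is handled outside the learned component.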

If this is right

Hybrid systems scale to large decision problems while respecting domain constraints and uncertainty structures.
Concrete applications improve in supply chains, healthcare and epidemic response, agriculture, energy, and autonomous operations.
Operations research helps shape integrated learning-optimization systems during the move from predictive to decision-capable AI.
Neural architectures such as transformers and deep reinforcement learning become practical tools inside optimization frameworks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Hybrid methods may require new training procedures that embed optimization constraints directly into neural loss functions.
Domain-specific uncertainty models from operations research could be used to generate training scenarios for neural networks.
The perspective suggests testing whether certain neural architectures align better with particular classes of stochastic programs.
Educational curricula in operations research and machine learning may converge around shared hybrid case studies.
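The first extension above, embedding constraints in the loss, can be pictured with a minimal penalty-method sketch. This is an assumption-laden illustration rather than anything the paper specifies: the function `penalized_loss`, the single capacity constraint `sum(pred) <= capacity`, and the default `penalty_weight` are hypothetical choices, and a quadratic penalty only discourages violations without guaranteeing feasibility.

```python
from typing import Sequence


def penalized_loss(pred: Sequence[float], target: Sequence[float],
                   capacity: float, penalty_weight: float = 10.0) -> float:
    """Mean squared error plus a quadratic penalty on capacity violation.

    The penalty term is zero whenever sum(pred) <= capacity, so feasible
    predictions are judged on accuracy alone.
    """
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    violation = max(0.0, sum(pred) - capacity)
    return mse + penalty_weight * violation ** 2


# A feasible, exact prediction incurs no loss.
feasible = penalized_loss([2.0, 3.0], [2.0, 3.0], capacity=6.0)

# An infeasible prediction pays for both the error and the violation.
infeasible = penalized_loss([4.0, 3.0], [2.0, 3.0], capacity=6.0)
print(feasible, infeasible)
```

In a training loop the same expression would be written with a differentiable tensor library so that gradients flow through the penalty term; the pure-Python version here only shows the shape of the objective.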

Load-bearing premise

That deep learning and optimization can be integrated effectively at scale while retaining the structural benefits of optimization models for constraints and uncertainty.

What would settle it

A head-to-head empirical comparison on standard benchmark problems showing that either pure deep learning or pure optimization methods achieve equal or better performance and scalability than the reviewed hybrid approaches.

Figures

Figures reproduced from arXiv: 2604.11507 by I. Esra Buyuktahtakin.

**Figure 2.** Comparison of FNN, GNN, RNN/LSTM, and transformer architectures for learning-based decision...

**Figure 3.** Illustration of the self-attention mechanism. The layer-...

**Figure 4.** Comparison of learning–optimization paradigms. (a) Predict-then-optimize separates prediction...

**Figure 5.** High-level learning-to-optimize pipeline for direct solution generation. Solved optimization...

**Figure 6.** Bidirectional LSTM architecture for sequential decision learning in multi-stage optimization.

**Figure 7.** Compact view of the expandable PredOpt architecture. The learned predictor generates partial...

**Figure 8.** Normalized objective-value trajectories for Gurobi and ScenPredOpt across representative in...

**Figure 9.** Two complementary views of simulation-integrated deep reinforcement learning. Panel (a) presents...

**Figure 10.** Two-stage DRL training overview. Agent 2 learns recourse actions after scenario realization,...

**Figure 11.** Structure-aware DRL pipeline for the multi-dimensional knapsack problem. Training instances...

Original abstract

Artificial intelligence (AI) is moving increasingly beyond prediction to support decisions in complex, uncertain, and dynamic environments. This shift creates a natural intersection with operations research and management sciences (OR/MS), which have long offered conceptual and methodological foundations for sequential decision-making under uncertainty. At the same time, recent advances in deep learning, including feedforward neural networks, LSTMs, transformers, and deep reinforcement learning, have expanded the scope of data-driven modeling and opened new possibilities for large-scale decision systems. This tutorial presents an OR/MS-centered perspective on deep learning for sequential decision-making under uncertainty. Its central premise is that deep learning is valuable not as a replacement for optimization, but as a complement to it. Deep learning brings adaptability and scalable approximation, whereas OR/MS provides the structural rigor needed to represent constraints, recourse, and uncertainty. The tutorial reviews key decision-making foundations, connects them to the major neural architectures in modern AI, and discusses leading approaches to integrating learning and optimization. It also highlights emerging impact in domains such as supply chains, healthcare and epidemic response, agriculture, energy, and autonomous operations. More broadly, it frames these developments as part of a wider transition from predictive AI toward decision-capable AI and highlights the role of OR/MS in shaping the next generation of integrated learning-optimization systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript is a tutorial presenting an OR/MS-centered perspective on deep learning for sequential decision-making under uncertainty. Its central premise is that deep learning complements rather than replaces optimization: DL provides adaptability and scalable approximation while OR/MS supplies structural rigor for constraints, recourse, and uncertainty. It reviews decision-making foundations, connects them to neural architectures including feedforward networks, LSTMs, transformers, and deep reinforcement learning, discusses leading integration approaches, and highlights applications in supply chains, healthcare, agriculture, energy, and autonomous operations, framing a broader shift from predictive to decision-capable AI.

Significance. If the described integrations prove effective, the tutorial could advance interdisciplinary research by offering a balanced conceptual framework that bridges the AI and OR/MS communities. Its review of leading neural architectures and application domains is credited as a foundation for hybrid learning-optimization systems, which may guide the design of scalable decision systems while preserving the structural benefits of OR/MS.

minor comments (2)

[Abstract] The abstract and introduction could more explicitly list the specific integration frameworks reviewed (e.g., end-to-end learning, hybrid models) to help readers navigate the tutorial's structure.
[Applications] Application sections would benefit from brief pointers to key references or case studies for each domain (supply chains, healthcare, etc.) to strengthen the illustrative value without altering the tutorial format.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. The referee accurately captures the tutorial's OR/MS-centered framing of deep learning as a complement to optimization for sequential decision-making under uncertainty, including its coverage of neural architectures, integration approaches, and applications. No major comments were raised, so we will incorporate any minor editorial suggestions from the editor.

Circularity Check

0 steps flagged

No significant circularity: review paper with no derivations


This is a tutorial/review manuscript that frames deep learning as a complement to OR/MS for sequential decision-making under uncertainty. It contains no original mathematical derivations, equations, fitted parameters, predictions, or uniqueness theorems. All content is descriptive synthesis of prior literature, with the central premise stated at a high level without any reduction to self-referential inputs or self-citation chains. No load-bearing steps exist that could be circular by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This tutorial paper introduces no new free parameters, axioms, or invented entities; it reviews existing concepts from OR/MS and deep learning without new derivations.

pith-pipeline@v0.9.0 · 5548 in / 1063 out tokens · 48991 ms · 2026-05-10T15:36:35.561282+00:00 · methodology

