Decision-Focused Learning: When and Why Traditional Prediction Models Fail

Mo Liu

arxiv: 2606.21773 · v1 · pith:57VUW5SAnew · submitted 2026-06-19 · 💻 cs.LG · stat.ME

Pith reviewed 2026-06-26 14:22 UTC · model grok-4.3

classification 💻 cs.LG stat.ME

keywords decision-focused learning · predict-then-optimize · stochastic linear programming · decision quality · predictive accuracy · operations research · machine learning for optimization

The pith

Improved predictive accuracy does not generally translate into better decisions when predictions feed into optimization problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the predict-then-optimize approach, long standard in decision-making under uncertainty, often produces suboptimal decisions even with highly accurate predictions of unknown parameters. This disconnect motivates decision-focused learning, which trains models to optimize downstream decision quality rather than standalone prediction error. A sympathetic reader would care because it questions reliance on conventional statistical learning tools for operational decisions, especially in stochastic linear programming. The tutorial shows why data collection driven purely by predictive uncertainty and measures like the Wasserstein distance require rethinking in decision-focused contexts.

Core claim

The central claim is that improved predictive accuracy does not, in general, translate into improved decision quality, which has motivated decision-focused learning as a distinct paradigm that must rethink standard statistical tools including uncertainty-driven data collection and distributional distances such as the Wasserstein distance, with particular attention to stochastic linear programming as the downstream problem.

What carries the argument

The predict-then-optimize paradigm, in which predictions of unknown parameters are plugged directly into a downstream optimization problem, contrasted with decision-focused learning that aligns training directly to decision quality.
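To make the plug-in pipeline concrete, here is a minimal Python sketch on a toy problem that is not taken from the paper: a two-item linear program over the unit simplex, whose optimal solution simply picks the item with the lowest cost. The cost vectors and the helper names (`plug_in_decision`, `mse`, `regret`) are illustrative assumptions.

```python
def plug_in_decision(costs):
    """Solve min_w c^T w over the unit simplex: pick the cheapest item."""
    return min(range(len(costs)), key=lambda i: costs[i])


def mse(pred, true):
    """Mean squared prediction error between two cost vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


def regret(pred, true):
    """Extra true cost paid by acting on the prediction instead of the truth."""
    chosen = plug_in_decision(pred)
    return true[chosen] - min(true)


# Hypothetical numbers chosen for illustration only.
true_cost = [1.0, 1.2]
pred_accurate = [1.3, 1.25]  # close to the truth, but ranks the items wrongly
pred_rough = [0.2, 2.0]      # far from the truth, but ranks the items correctly

for name, pred in [("accurate", pred_accurate), ("rough", pred_rough)]:
    print(name, "mse:", mse(pred, true_cost), "regret:", regret(pred, true_cost))
```

In this toy case the prediction with the smaller mean squared error flips the ranking of the two items and incurs positive regret, while the cruder prediction preserves the ranking and incurs none.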

If this is right

Data collection strategies should incorporate the downstream decision problem rather than optimizing solely for predictive uncertainty reduction.
Distributional distance measures such as the Wasserstein distance are not guaranteed to align with improvements in decision quality.
New training objectives and evaluation metrics must be developed that directly target decision performance instead of prediction error.
Properties that distinguish decision-focused learning from conventional predictive modeling guide the design of specialized algorithms.
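The point about distributional distances can be illustrated with a small Python sketch. It is a toy construction, not an example from the paper: item 2 has a known cost, item 1 has a random cost, and because the objective is linear the decision depends only on the expected cost of item 1. The sample values and the names `wasserstein_1d` and `decide` are assumptions made for illustration; the distance formula used is the standard one for equal-size, equally weighted samples on the real line.

```python
def wasserstein_1d(xs, ys):
    """W1 distance between two equal-size, equally weighted 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


def decide(item1_cost_samples, item2_cost):
    """Return 0 to pick item 1, or 1 to pick item 2, by lower expected cost."""
    expected = sum(item1_cost_samples) / len(item1_cost_samples)
    return 0 if expected <= item2_cost else 1


# Hypothetical numbers chosen for illustration only.
ITEM2_COST = 1.0
true_samples = [0.9, 1.0]  # true cost distribution of item 1
close_model = [1.0, 1.1]   # small Wasserstein distance to the truth
far_model = [0.0, 0.5]     # large Wasserstein distance to the truth

for name, model in [("close", close_model), ("far", far_model)]:
    print(name,
          "W1:", wasserstein_1d(true_samples, model),
          "same decision as truth:",
          decide(model, ITEM2_COST) == decide(true_samples, ITEM2_COST))
```

The model that is closer in Wasserstein distance pushes the expected cost of item 1 across the decision boundary and changes the decision, whereas the more distant model leaves the decision unchanged.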

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

In applied settings, a model with modestly lower accuracy on standard metrics could still be preferred if it yields higher-quality decisions under the same optimization constraints.
The same logic may extend beyond linear programs to other classes of stochastic optimization, though the paper confines its detailed treatment to that case.
Empirical comparisons that jointly report both prediction error and decision regret would provide clearer guidance for practitioners than accuracy alone.
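A joint report of this kind could be organized as in the following Python sketch. It assumes a hypothetical evaluation set of cost vectors and two hypothetical models, with the downstream problem again reduced to picking the cheapest item; none of the numbers come from the paper.

```python
from statistics import mean


def pick_cheapest(costs):
    """Toy downstream problem: choose the index of the lowest cost."""
    return min(range(len(costs)), key=lambda i: costs[i])


def report(predictions, true_costs):
    """Return (mean squared error, mean decision regret) over paired instances."""
    sq_errors, regrets = [], []
    for pred, true in zip(predictions, true_costs):
        sq_errors.append(mean([(p - t) ** 2 for p, t in zip(pred, true)]))
        regrets.append(true[pick_cheapest(pred)] - min(true))
    return mean(sq_errors), mean(regrets)


# Hypothetical evaluation set and model outputs, for illustration only.
true_costs = [[1.0, 1.2], [2.0, 1.5], [0.8, 0.9]]
models = {
    "model_a": [[1.3, 1.25], [1.7, 1.75], [0.95, 0.9]],
    "model_b": [[0.2, 2.0], [3.0, 0.5], [0.0, 1.8]],
}

results = {name: report(preds, true_costs) for name, preds in models.items()}
for name, (err, reg) in results.items():
    print(f"{name}: mse={err:.4f} mean_regret={reg:.4f}")
```

Reporting the two columns side by side makes it visible when the ranking of models by prediction error disagrees with their ranking by regret, as it does for these made-up inputs.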

Load-bearing premise

The disconnect between higher predictive accuracy and better decision quality is general enough across decision problems to require rethinking conventional statistical tools.

What would settle it

A controlled experiment on stochastic linear programs in which models with measurably higher predictive accuracy consistently produce decisions with higher expected value would falsify the central claim.

Original abstract

Plugging predictions of unknown parameters into downstream optimization problems, often referred to as the "predict-then-optimize" paradigm, has long been a standard approach in decision-making under uncertainty. However, improved predictive accuracy does not, in general, translate into improved decision quality. This disconnect has motivated growing interest in decision-focused learning (DFL) within the operations research community. This tutorial reviews recent developments in DFL and highlights key methodological insights, with a particular focus on stochastic linear programming as the downstream decision-making problem. We discuss why several widely used tools in traditional statistical learning are not directly suited to decision-focused settings and must be rethought, including (i) data collection strategies driven purely by predictive uncertainty and (ii) distributional distance measures such as the Wasserstein distance. We summarize properties of DFL that distinguish it from conventional predictive modeling and provide insights into the development of new decision-focused tools.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This is a clear tutorial on why predict-then-optimize falls short for stochastic LPs, but it adds no new results and its 'in general' claim stays within that narrow setting.

The paper collects the standard observation that higher prediction accuracy need not improve downstream decisions and walks through why common statistical tools break in decision-focused settings. It focuses on stochastic linear programming and explains the issues with uncertainty-driven data collection and Wasserstein distances when the goal is decision quality rather than pointwise error. That part is useful for someone who wants a compact summary of the existing arguments without chasing the original papers.

Nothing in the work is new. It is explicitly a tutorial that restates limitations already in the cited literature and does not derive fresh properties, run new experiments, or extend the framework past linear programs. The abstract itself flags the LP focus, so the broader phrasing that improved accuracy 'does not, in general, translate' rests on an extrapolation the paper does not secure.

The soft spot is therefore scope. Without counter-examples or arguments for nonlinear, integer, or dynamic problems, the tutorial cannot claim the disconnect is universal. That does not make the LP-specific points wrong, only limited.

This is for operations-research readers who are new to decision-focused learning and want an organized entry point. It is not for people looking for original methods or broad theoretical results. A journal that publishes tutorials could reasonably send it to referees; the writing is straightforward and the citations appear on target. I would not cite it for a new claim, but I would point a student to it for background.

Referee Report

1 major / 0 minor

Summary. The manuscript is a tutorial reviewing decision-focused learning (DFL) in the predict-then-optimize framework for decision-making under uncertainty. It asserts that improved predictive accuracy does not, in general, translate into improved decision quality, motivating DFL. With a focus on stochastic linear programming, it explains why traditional statistical tools such as uncertainty-driven data collection and Wasserstein distance measures require rethinking in decision-focused settings, and summarizes distinguishing properties of DFL.

Significance. If the reviewed insights hold, the tutorial offers a valuable synthesis of recent DFL developments and methodological guidance for creating decision-aware tools, helping to bridge statistical learning and optimization in operations research.

major comments (1)

[Abstract] The claim that improved predictive accuracy 'does not, in general, translate into improved decision quality' is stated broadly. However, the tutorial's scope is restricted to stochastic linear programming as the downstream problem, without extensions, arguments, or counterexamples for other classes such as nonlinear programs, integer programs, or dynamic settings. This limits the support for the 'in general' qualifier, so the claim should be either qualified or backed by broader arguments.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and constructive comment on the abstract. We agree that the broad phrasing of the claim should be qualified to match the tutorial's explicit scope.

Referee: [Abstract] The claim that improved predictive accuracy 'does not, in general, translate into improved decision quality' is stated broadly. However, the tutorial's scope is restricted to stochastic linear programming as the downstream problem, without extensions, arguments, or counterexamples for other classes such as nonlinear programs, integer programs, or dynamic settings. This limits the support for the 'in general' qualifier, so the claim should be either qualified or backed by broader arguments.

Authors: We agree with the observation. The tutorial is explicitly scoped to stochastic linear programming (as stated in the abstract and throughout the manuscript), and the 'in general' phrasing in the opening sentence is not supported by arguments or examples outside this class. We will revise the abstract to replace the broad claim with a more precise statement that improved predictive accuracy does not necessarily translate into improved decision quality in stochastic linear programs, and we will add a brief note that analogous issues motivate DFL in other settings but lie outside the tutorial's scope.

Revision: yes

Circularity Check

0 steps flagged

Review paper draws on external literature; no internal derivation reduces to fitted inputs or self-citations

This is a tutorial/review summarizing developments in decision-focused learning from the operations research literature. The abstract and provided text state a focus on stochastic linear programming and discuss limitations of standard statistical tools, but make no original derivations, predictions, or first-principles results whose validity depends on equations or parameters defined within the paper itself. All load-bearing claims are attributed to cited external work rather than self-contained reductions. No self-citation chains, fitted-input-as-prediction patterns, or ansatz smuggling are present. The paper is therefore self-contained against external benchmarks, with a circularity score of 0.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Judging from the abstract, this is a review paper, so the work itself introduces no new free parameters, axioms, or invented entities.

Reference graph

Works this paper leans on

52 extracted references · 12 canonical work pages · 2 internal anchors

[1] Agrawal A, Amos B, Barratt S, Boyd S, Diamond S, Kolter JZ (2019) Differentiable convex optimization layers. Advances in Neural Information Processing Systems, 9558–9570

[2] Amos B, Kolter JZ (2017) OptNet: Differentiable optimization as a layer in neural networks. International Conference on Machine Learning, 136–145

[3] Ban GY, Rudin C (2019) The big data newsvendor: Practical insights from machine learning. Operations Research 67(1):90–108

[4] Bennouna O, Bennouna A, Amin S, Ozdaglar A (2025) What data enables optimal decisions? An exact characterization for linear optimization. arXiv preprint arXiv:2505.21692

[5] Berden S, Mahmutoğulları Aİ, Tsouros D, Guns T (2025) Solver-free decision-focused learning for linear optimization problems. arXiv preprint arXiv:2505.22224

[6] Bertsimas D, Kallus N (2020) From predictive to prescriptive analytics. Management Science 66(3):1025–1044

[7] Bertsimas D, Mundru N (2023) Optimization-based scenario reduction for data-driven two-stage stochastic optimization. Operations Research 71(4):1343–1361

[8] Bucarey V, Calderón S, Muñoz G, Semet F (2024) Decision-focused predictions via pessimistic bilevel optimization: a computational study. International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research, 127–135 (Springer)

[9] Capitaine A, Haddouche M, Moulines E, Jordan MI, Boursier E, Durmus A (2026) Online decision-focused learning. Proceedings of the International Conference on Learning Representations, URL http://dx.doi.org/10.48550/arXiv.2505.13564, arXiv:2505.13564. [Page header: Mo Liu: Tutorial for Decision-Focused Learning, 36, TutORials in Operations Research]

[10] Cheng X, Zhu Y, Xie Y (2026) Generative models for decision-making under distributional shift. URL http://dx.doi.org/10.48550/arXiv.2604.04342

[11] Chernozhukov V, Hansen C, Kallus N, Spindler M, Syrgkanis V (2024) Applied causal inference powered by ML and AI. arXiv preprint arXiv:2403.02467

[12] Chu LY, Shanthikumar JG, Shen ZJM (2008) Solving operational statistics via a Bayesian analysis. Operations Research Letters 36(1):110–116

[13] Chung ATH, Abdulai J, Bayoh P, Sandi L, Smart F, Bastani H, Bastani O (2026) Improving access to essential medicines via decision-aware machine learning. Nature 1–6

[14] Cristian R, Harsha P, Perakis G, Quanz B (2025) Efficient end-to-end learning for decision-making: A meta-optimization approach. arXiv preprint arXiv:2505.11360, URL http://dx.doi.org/10.48550/arXiv.2505.11360

[15] Donti PL, Kolter JZ, Amos B (2017) Task-based end-to-end model learning in stochastic optimization. Advances in Neural Information Processing Systems, 5484–5494

[16] El Balghiti O, Elmachtoub AN, Grigas P, Tewari A (2023) Generalization bounds in the predict-then-optimize framework. Mathematics of Operations Research 48(4):2043–2065, URL http://dx.doi.org/10.1287/moor.2022.1330

[17] Elmachtoub AN, Grigas P (2022) Smart “predict, then optimize”. Management Science 68(1):9–26, URL http://dx.doi.org/10.1287/mnsc.2020.3922

[18] Elmachtoub AN, Lam H, Lan H, Zhang H (2025) Dissecting the impact of model misspecification in data-driven optimization. arXiv preprint arXiv:2503.00626

[19] Elmachtoub AN, Lam H, Zhang H, Zhao Y (2023) Estimate-then-optimize versus integrated-estimation-optimization versus sample average approximation: A stochastic dominance perspective. arXiv preprint arXiv:2304.06833, URL http://dx.doi.org/10.48550/arXiv.2304.06833

[20] Er C, Liu M (2025) Decision-focused bias correction for fluid approximation. arXiv preprint arXiv:2512.15726, URL http://dx.doi.org/10.48550/arXiv.2512.15726

[21] Feng Q, Shanthikumar JG, Wu J (2025) Contextual data-integrated newsvendor solution with operational data analytics (ODA). Management Science 71(11):9384–9403

[22] Gupta V (2026) End-to-end learning and optimization: Course reader. Course reader, USC Marshall School of Business, URL https://www.dropbox.com/scl/fo/hku67dioasy5rz0pxgs08/AOl182gWOvBUvDujNd8m8f4?dl=0&e=1&rlkey=v0wumq8vbwqe0dia5dk6acw37, spring 2026. Accessed May 31, 2026

[23] Ho-Nguyen N, Kılınç-Karzan F (2022) Risk guarantees for end-to-end prediction and optimization processes. Management Science 68(12):8680–8698, URL http://dx.doi.org/10.1287/mnsc.2022.4321

[24] Homem-de-Mello T, Valencia J, Lagos F, Lagos G (2024) Forecasting outside the box: Application-driven optimal pointwise forecasts for stochastic optimization. arXiv preprint arXiv:2411.03520

[25] Hu X, Lee J, Lee J (2023) Two-stage predict+optimize for MILPs with unknown parameters in constraints. Advances in Neural Information Processing Systems 36:14247–14272

[26] Hu Y, Kallus N, Mao X (2022) Fast rates for contextual linear optimization. Management Science 68(6):4236–4245, URL http://dx.doi.org/10.1287/mnsc.2022.4383

[27] Hu Y, Kallus N, Mao X, Wu Y (2025) Contextual linear optimization under partial feedback. Available at SSRN 5724783

[28] Huang M, Gupta V (2024) Decision-focused learning with directional gradients. Advances in Neural Information Processing Systems 37:79194–79220

[29] Im H, Benslimane W, Grigas P (2025) Smart surrogate losses for contextual stochastic linear optimization with robust constraints. arXiv preprint arXiv:2505.22881

[30] Kotary J, Di Vito V, Christopher J, Van Hentenryck P, Fioretto F (2023) Predict-then-optimize by proxy: Learning joint models of prediction and optimization. arXiv preprint arXiv:2311.13087

[31] Lan H, Liao L, Elmachtoub AN, Kroer C, Lam H, Zhang H (2025) The bias-variance tradeoff in data-driven optimization: A local misspecification perspective. arXiv preprint arXiv:2510.18215

[32] Lee J, Jin S, Lee Y (2026) Decision-focused learning via tangent-space projection of prediction error. Proceedings of the International Conference on Machine Learning, URL https://arxiv.org/abs/2605.01361

[33] Liu H, Grigas P (2022) Online contextual decision-making with a smart predict-then-optimize method. arXiv preprint arXiv:2206.07316

[34] Liu M, Bai Y, Qi M, Shen ZJM (2026) Inventory management with transformer: Automated decision making for order timing and quantity. Service Science 0(0), URL http://dx.doi.org/10.1287/serv.2024.0236, published online April 7, 2026

[35] Liu M, Cao J, Shen ZJM (2023) Value of one data point: Active label acquisition in assortment optimization. Available at SSRN 4487888

[36] Liu M, Grigas P, Liu H, Shen ZJM (2023) Active learning in the predict-then-optimize framework: A margin-based approach. arXiv preprint arXiv:2305.06584, URL http://dx.doi.org/10.48550/arXiv.2305.06584

[37] Liu S, Liu M (2026) Decision-focused optimal transport. arXiv preprint arXiv:2602.02800

[38] Mandi J, Kotary J, Berden S, Mulamba M, Bucarey V, Guns T, Fioretto F (2024) Decision-focused learning: Foundations, state of the art, benchmark and future opportunities. Journal of Artificial Intelligence Research 81:1623–1701, URL http://dx.doi.org/10.1613/jair.1.15320

[39] Qi M, Grigas P, Shen ZJ (2025) Integrated conditional estimation-optimization. Operations Research

[40] Qi M, Shi Y, Qi Y, Ma C, Yuan R, Wu D, Shen ZJ (2023) A practical end-to-end inventory management model with deep learning. Management Science 69(2):759–773

[41] Rodriguez-Diaz P, Kong L, Wang K, Alvarez-Melis D, Tambe M (2024) What is the right notion of distance between predict-then-optimize tasks? arXiv preprint arXiv:2409.06997

[42] Sadana U, Chenreddy A, Delage E, Forel A, Frejinger E, Vidal T (2025) A survey of contextual optimization methods for decision-making under uncertainty. European Journal of Operational Research 320(2):271–289

[43] Schneider PJ, Kuhn D (2026) Soft-radial projection for constrained end-to-end learning. URL http://dx.doi.org/10.48550/arXiv.2602.03461

[44] Shah S, Wang K, Wilder B, Perrault A, Tambe M (2022) Decision-focused learning without decision-making: Learning locally optimized decision losses. Advances in Neural Information Processing Systems 35:1320–1332

[45] Shah S, Wilder B, Perrault A, Tambe M (2024) Leaving the nest: Going beyond local loss functions for predict-then-optimize. Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, 14902–14909

[46] Tang B, Khalil EB (2022) PyEPO: A PyTorch-based end-to-end predict-then-optimize library with linear objective function. OPT 2022: Optimization for Machine Learning (NeurIPS 2022 Workshop)

[47] Wan B, Liu M (2026) A solver-free training method for predict-then-optimize. Proceedings of the International Conference on Machine Learning, URL https://arxiv.org/abs/2606.19587

[48] Wan B, Liu M, Grigas P, Shen ZJM (2026) Decision-focused sequential experimental design: A directional uncertainty-guided approach. arXiv preprint arXiv:2602.05340, URL http://dx.doi.org/10.48550/arXiv.2602.05340

[49] Wang K, Wilder B, Perrault A, Tambe M (2020) Automatically learning compact quality-aware surrogates for optimization problems. Advances in Neural Information Processing Systems

[50] Wilder B, Ewing E, Dilkina B, Tambe M (2019) End to end learning and optimization on graphs. Advances in Neural Information Processing Systems 32

[51] Yeh C, Christianson N, Wierman A, Yue Y (2025) Conformal risk training: End-to-end optimization of conformal risk control. arXiv preprint arXiv:2510.08748

[52] Zhao J (2024) Experimental design for causal inference through an optimization lens. Tutorials in Operations Research: Smarter Decisions for a Better World, 146–188 (INFORMS)
