Decision-Focused Learning via Tangent-Space Projection of Prediction Error

Junhyeong Lee; Sangjin Jin; Yongjae Lee

arxiv: 2605.01361 · v2 · pith:R7BCWP37new · submitted 2026-05-02 · 💻 cs.LG

Decision-Focused Learning via Tangent-Space Projection of Prediction Error

Junhyeong Lee , Sangjin Jin , Yongjae Lee This is my paper

Pith reviewed 2026-05-20 23:34 UTC · model grok-4.3

classification 💻 cs.LG

keywords decision-focused learningregret gradienttangent space projectionactive constraintslinear programmingquadratic programmingend-to-end optimization

0 comments

The pith

The regret gradient equals the prediction error projected onto the tangent space of active constraints and scaled by local curvature.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that regret gradients in decision-focused learning have a closed-form geometric expression under standard regularity conditions with locally stable active constraints. This expression is the prediction error projected onto the tangent space of those active constraints and then scaled by local curvature. A reader would care because it means the true regret signal can be recovered by filtering out decision-irrelevant components from a simple mean-squared-error gradient, without differentiating through an optimization solver or introducing surrogate losses. The resulting method computes the gradient from a reduced linear system defined only over the active constraints.

Core claim

Under standard regularity with locally stable active constraints, the regret gradient admits a closed-form geometric characterization equivalent to the prediction error projected onto the tangent space of active constraints scaled by local curvature. This shows that regret gradients arise by filtering decision-irrelevant components from the MSE gradient. The authors therefore introduce a procedure that solves a reduced linear system over the active constraints to obtain the gradient directly.

What carries the argument

Tangent-space projection of the prediction error onto active constraints, which filters decision-irrelevant directions and scales the remainder by local curvature to recover the exact regret gradient.

If this is right

Regret gradients can be obtained without differentiating through solver iterations or running extra optimization solves.
A single reduced linear system over the active constraints is sufficient to produce the gradient.
The resulting predictor achieves higher downstream decision quality than existing baselines on linear and quadratic programs.
Performance advantages remain even when the underlying constraint set shifts after training.
Filtering decision-irrelevant components from the mean-squared-error gradient produces the true regret signal.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same projection idea could be applied to other bilevel problems where only a subset of constraints is active at optimality.
If active sets change rapidly across samples, the cost of identifying the active set might offset some of the computational savings.
The approach suggests a general recipe for turning any differentiable prediction loss into a decision-aware loss by an inexpensive geometric filter.

Load-bearing premise

Active constraints must remain locally stable and the optimization problem must satisfy standard regularity conditions so that the projection exactly recovers the regret gradient.

What would settle it

On a small linear program with known active set, compute the PEAR gradient from the reduced system and check whether it matches the regret gradient obtained by finite-difference perturbation of the prediction or by full automatic differentiation through the solver.

Figures

Figures reproduced from arXiv: 2605.01361 by Junhyeong Lee, Sangjin Jin, Yongjae Lee.

**Figure 1.** Figure 1: The PEAR framework. PEAR provides a direct pipeline for transforming prediction error into a regret gradient via a projection onto the local decision geometry. After identifying the active constraints, PEAR rescales the error by local curvature and projects it onto the feasible tangent space, filtering out decision-irrelevant components. 3. Regret Gradient via Tangent-Space Projection This section establis… view at source ↗

**Figure 2.** Figure 2: Robustness under constraint shifts. We train on an original train set and evaluate under shifted constraints. Our method maintains the lowest regret among DFL methods across all three tasks. Additional results are in Appendix F. malized regret, cumulative and annualized returns, Sharpe ratio, and maximum drawdown. Batched computation. In our implementation, the forward pass calls OSQP on each sample to ob… view at source ↗

**Figure 3.** Figure 3: Portfolio cumulative returns. Portfolio value over the test period. PEAR achieves the highest value among all methods. 80 100 120 140 160 180 Trading Days 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 Portfolio Value MSE (2-stage) PEAR (Ours) QPTH CVXPYLayers (a) Drawdown Period I (days 80–180) 560 580 600 620 640 Trading Days 1.5 2.0 2.5 3.0 3.5 4.0 4.5 Portfolio Value MSE (2-stage) PEAR (Ours) QPTH CVXPYLayers (b) Dra… view at source ↗

**Figure 4.** Figure 4: Drawdown analysis. Portfolio value during the two most severe market downturns. PEAR demonstrates the most stable behavior, indicating robust risk management. 18 view at source ↗

read the original abstract

Decision-Focused Learning (DFL) trains predictors to improve downstream decision quality, but computing regret gradients typically requires differentiating through solvers or relying on surrogate losses, which can be computationally expensive or deviate from the true objective. We show that, under standard regularity with locally stable active constraints, the regret gradient admits a closed-form geometric characterization, equivalent to the prediction error projected onto the tangent space of active constraints, scaled by local curvature. This reveals that regret gradients can be obtained by filtering decision-irrelevant components from the MSE gradient, providing a simpler and more direct alternative to existing approaches. Based on this, we propose PEAR (Projected Error As Regret-gradient), which computes regret gradients via a reduced linear system over active constraints, avoiding differentiation through solver iterations or additional optimization solves. Experiments on LP benchmarks and a real-world QP task show that PEAR achieves the best decision quality among all baselines while being the most computationally efficient, with gains that persist under constraint shifts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a closed-form geometric view of regret gradients as a tangent-space projection of prediction error, leading to an efficient PEAR method that works well in the reported experiments.

read the letter

The main takeaway is that the authors derive a geometric characterization of regret gradients in decision-focused learning. Under standard regularity and locally stable active constraints, the gradient reduces to the prediction error projected onto the tangent space of the active set, scaled by local curvature. They turn this into PEAR, which solves a smaller linear system over active constraints instead of differentiating through the solver or running extra optimizations.

Referee Report

1 major / 2 minor

Summary. The manuscript claims that, under standard regularity assumptions with locally stable active constraints, the regret gradient in decision-focused learning admits a closed-form geometric characterization: the prediction error projected onto the tangent space of the active constraints and scaled by local curvature. This yields the PEAR method, which obtains regret gradients from a reduced linear system over active constraints without differentiating through solvers or additional optimizations. Experiments on LP benchmarks and a real-world QP task report that PEAR achieves the highest decision quality among baselines while being the most computationally efficient, with gains persisting under constraint shifts.

Significance. If the derivation is correct and the local-stability assumption holds in the reported regimes, the work supplies a direct, low-overhead route to true regret gradients that filters decision-irrelevant components from the MSE gradient. The geometric insight could simplify implementation and analysis in optimization-based learning pipelines.

major comments (1)

[Experiments] The equivalence between the tangent-space projection and the true regret gradient is load-bearing on the assumption of locally stable active constraints (KKT system insensitive to small parameter perturbations). The Experiments section reports no diagnostic—e.g., frequency of active-set changes during training or under observed prediction errors—for the LP or QP benchmarks; without such verification the claimed closed-form characterization may not equal the regret gradient when the active set flips.

minor comments (2)

[Abstract] The abstract and introduction should explicitly list the regularity conditions (e.g., LICQ, strict complementarity, second-order sufficient conditions) rather than referring only to “standard regularity.”
[Method] Notation for the reduced linear system (PEAR) should be cross-referenced to the KKT stationarity conditions so readers can trace the projection and curvature scaling without ambiguity.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their constructive feedback and for highlighting the importance of verifying the locally stable active constraints assumption. We address the major comment below and commit to revisions that will strengthen the empirical support for our claims.

read point-by-point responses

Referee: [Experiments] The equivalence between the tangent-space projection and the true regret gradient is load-bearing on the assumption of locally stable active constraints (KKT system insensitive to small parameter perturbations). The Experiments section reports no diagnostic—e.g., frequency of active-set changes during training or under observed prediction errors—for the LP or QP benchmarks; without such verification the claimed closed-form characterization may not equal the regret gradient when the active set flips.

Authors: We agree that explicit verification of active-set stability is valuable to confirm that the closed-form characterization equals the true regret gradient in the reported regimes. The theoretical derivation relies on the standard regularity assumption of locally stable active constraints, but the current Experiments section does not include diagnostics such as the frequency of active-set changes. In the revised manuscript we will add these diagnostics for both the LP benchmarks and the real-world QP task, reporting the rate of active-set flips during training and under the observed prediction errors. This will provide direct evidence that the assumption holds and that PEAR computes regret gradients without active-set changes affecting the results. revision: yes

Circularity Check

0 steps flagged

Derivation of regret gradient via tangent-space projection is self-contained first-principles result

full rationale

The paper derives the closed-form geometric characterization of the regret gradient directly from KKT conditions under the stated regularity assumptions with locally stable active constraints, showing equivalence to the prediction error projected onto the tangent space scaled by local curvature. This is obtained by filtering decision-irrelevant components from the MSE gradient and solving a reduced linear system, without any fitted parameters, self-citations for load-bearing uniqueness theorems, or renaming of known empirical patterns. The PEAR method follows as a direct computational consequence rather than a tautology or input-renamed output. No steps reduce by construction to the paper's own data or prior fitted results; the derivation remains independent and falsifiable via the provided LP/QP benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the assumption of standard regularity and locally stable active constraints for the projection to be valid; no free parameters or new entities are introduced in the abstract.

axioms (1)

domain assumption standard regularity with locally stable active constraints
Invoked to guarantee the closed-form geometric characterization of the regret gradient.

pith-pipeline@v0.9.0 · 5696 in / 1034 out tokens · 28073 ms · 2026-05-20T23:34:07.325111+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

the regret gradient admits a closed-form geometric characterization, equivalent to the prediction error projected onto the tangent space of active constraints, scaled by local curvature
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Under Assumptions 3.1–3.3, the regret gradient admits the closed form ∇_ĉR = P_H e

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · 1 internal anchor

[1]

Agrawal, A., Amos, B., Barratt, S., Boyd, S., Diamond, S., and Kolter, J. Z. Differentiable convex optimization lay- ers.Advances in neural information processing systems, 32, 2019a. Agrawal, A., Barratt, S., Boyd, S., Busseti, E., and Moursi, W. M. Differentiating through a cone program.arXiv preprint arXiv:1904.09043, 2019b. Amos, B. and Kolter, J. Z. O...

work page arXiv 1904
[2]

Leveraging augmented-lagrangian techniques for differ- entiating over infeasible quadratic programs in machine learning

Bambade, A., Schramm, F., Taylor, A., and Carpentier, J. Leveraging augmented-lagrangian techniques for differ- entiating over infeasible quadratic programs in machine learning. InICLR 2024-The Twelfth International Con- ference on Learning Representations,

work page 2024
[3]

Decision- informed neural networks with large language model integration for portfolio optimization.arXiv preprint arXiv:2502.00828,

Hwang, Y ., Kong, Y ., Zohren, S., and Lee, Y . Decision- informed neural networks with large language model integration for portfolio optimization.arXiv preprint arXiv:2502.00828,

work page arXiv
[4]

˙I., Berden, S., and Guns, T

Mandi, J., Mahmuto ˘gulları, A. ˙I., Berden, S., and Guns, T. Minimizing surrogate losses for decision-focused learning using differentiable optimization.arXiv preprint arXiv:2508.11365,

work page arXiv
[5]

Contrastive losses and so- lution caching for predict-and-optimize.arXiv preprint arXiv:2011.05354,

Mulamba, M., Mandi, J., Diligenti, M., Lombardi, M., Bucarey, V ., and Guns, T. Contrastive losses and so- lution caching for predict-and-optimize.arXiv preprint arXiv:2011.05354,

work page arXiv 2011
[6]

PyEPO: A PyTorch-based End-to-End Predict-then-Optimize Library for Linear and Integer Programming

Tang, B. and Khalil, E. B. Pyepo: A pytorch-based end-to- end predict-then-optimize library for linear and integer programming.arXiv preprint arXiv:2206.14234,

work page internal anchor Pith review Pith/arXiv arXiv
[7]

Restless multi-armed bandits for maternal and child health: Results from decision- focused learning

Verma, S., Mate, A., Wang, K., Madhiwalla, N., Hegde, A., Taneja, A., and Tambe, M. Restless multi-armed bandits for maternal and child health: Results from decision- focused learning. InProceedings of the 2023 Interna- tional Conference on Autonomous Agents and Multiagent Systems, pp. 1312–1320,

work page 2023
[8]

Additional Theory A.1

10 Decision-Focused Learning via Tangent-Space Projection of Prediction Error A. Additional Theory A.1. Regularity and Local Differentiability This appendix states standard conditions under which the solution mapping ˆc7→z ∗(ˆc)is locally single-valued and differentiable and the active set remains locally stable. Under Assumptions 3.1–3.3, classical sensi...

work page 1983
[9]

Under this assumption, the same Jacobian J and projection operator PH apply

Proof of Equation (15).Assume that the active set remains unchanged when replacing ˆcby ˆc+αδN for any scalar α. Under this assumption, the same Jacobian J and projection operator PH apply. By Theorem 3.5, the regret gradient is given by ∇ˆcR(ˆc;c) =PH(ˆc−c). 12 Decision-Focused Learning via Tangent-Space Projection of Prediction Error Hence, ∇ˆcR(ˆc+αδN;...

work page 2022
[10]

For evaluation, portfolios are rebalanced every 21 trading days with a transaction cost of 0.1% applied to turnover

based on validation regret. For evaluation, portfolios are rebalanced every 21 trading days with a transaction cost of 0.1% applied to turnover. We report cumulative returnQT t=1(1 +r t)−1 , annualized return and volatility (scaled by √ 252), Sharpe ratio assuming zero risk-free rate, maximum drawdown, and average turnover at each rebalancing. F. Addition...

work page 2021
[11]

Large β (0.5) degrades performance on Knapsack, worsening monotonically with the polynomial degree

and β= 0.1 best at Deg 4, whereas the pure gradient is the weakest of the small-β settings. Large β (0.5) degrades performance on Knapsack, worsening monotonically with the polynomial degree. PEAR is otherwise stable for β∈[0.05,0.2], and we useβ= 0.1as the default. 18 Decision-Focused Learning via Tangent-Space Projection of Prediction Error Table 8.Norm...

work page 2020
[12]

Table 10.Real-world Knapsack (building investment).Normalized regret (%,↓) over 5 seeds

The budget is set to B= 0.4 P i wi, all other settings follow Appendix E.1, and we report normalized regret over 5 seeds. Table 10.Real-world Knapsack (building investment).Normalized regret (%,↓) over 5 seeds. Best inbold, second best underlined . MethodMSE SPO+ PFYL DBB LA V A PEAR Regret (%) 1.22±0.890.72±0.570.70±0.80 12.20±10.052.12±0.790.51±0.57 Tab...

work page 2006

[1] [1]

Agrawal, A., Amos, B., Barratt, S., Boyd, S., Diamond, S., and Kolter, J. Z. Differentiable convex optimization lay- ers.Advances in neural information processing systems, 32, 2019a. Agrawal, A., Barratt, S., Boyd, S., Busseti, E., and Moursi, W. M. Differentiating through a cone program.arXiv preprint arXiv:1904.09043, 2019b. Amos, B. and Kolter, J. Z. O...

work page arXiv 1904

[2] [2]

Leveraging augmented-lagrangian techniques for differ- entiating over infeasible quadratic programs in machine learning

Bambade, A., Schramm, F., Taylor, A., and Carpentier, J. Leveraging augmented-lagrangian techniques for differ- entiating over infeasible quadratic programs in machine learning. InICLR 2024-The Twelfth International Con- ference on Learning Representations,

work page 2024

[3] [3]

Decision- informed neural networks with large language model integration for portfolio optimization.arXiv preprint arXiv:2502.00828,

Hwang, Y ., Kong, Y ., Zohren, S., and Lee, Y . Decision- informed neural networks with large language model integration for portfolio optimization.arXiv preprint arXiv:2502.00828,

work page arXiv

[4] [4]

˙I., Berden, S., and Guns, T

Mandi, J., Mahmuto ˘gulları, A. ˙I., Berden, S., and Guns, T. Minimizing surrogate losses for decision-focused learning using differentiable optimization.arXiv preprint arXiv:2508.11365,

work page arXiv

[5] [5]

Contrastive losses and so- lution caching for predict-and-optimize.arXiv preprint arXiv:2011.05354,

Mulamba, M., Mandi, J., Diligenti, M., Lombardi, M., Bucarey, V ., and Guns, T. Contrastive losses and so- lution caching for predict-and-optimize.arXiv preprint arXiv:2011.05354,

work page arXiv 2011

[6] [6]

PyEPO: A PyTorch-based End-to-End Predict-then-Optimize Library for Linear and Integer Programming

Tang, B. and Khalil, E. B. Pyepo: A pytorch-based end-to- end predict-then-optimize library for linear and integer programming.arXiv preprint arXiv:2206.14234,

work page internal anchor Pith review Pith/arXiv arXiv

[7] [7]

Restless multi-armed bandits for maternal and child health: Results from decision- focused learning

Verma, S., Mate, A., Wang, K., Madhiwalla, N., Hegde, A., Taneja, A., and Tambe, M. Restless multi-armed bandits for maternal and child health: Results from decision- focused learning. InProceedings of the 2023 Interna- tional Conference on Autonomous Agents and Multiagent Systems, pp. 1312–1320,

work page 2023

[8] [8]

Additional Theory A.1

10 Decision-Focused Learning via Tangent-Space Projection of Prediction Error A. Additional Theory A.1. Regularity and Local Differentiability This appendix states standard conditions under which the solution mapping ˆc7→z ∗(ˆc)is locally single-valued and differentiable and the active set remains locally stable. Under Assumptions 3.1–3.3, classical sensi...

work page 1983

[9] [9]

Under this assumption, the same Jacobian J and projection operator PH apply

Proof of Equation (15).Assume that the active set remains unchanged when replacing ˆcby ˆc+αδN for any scalar α. Under this assumption, the same Jacobian J and projection operator PH apply. By Theorem 3.5, the regret gradient is given by ∇ˆcR(ˆc;c) =PH(ˆc−c). 12 Decision-Focused Learning via Tangent-Space Projection of Prediction Error Hence, ∇ˆcR(ˆc+αδN;...

work page 2022

[10] [10]

For evaluation, portfolios are rebalanced every 21 trading days with a transaction cost of 0.1% applied to turnover

based on validation regret. For evaluation, portfolios are rebalanced every 21 trading days with a transaction cost of 0.1% applied to turnover. We report cumulative returnQT t=1(1 +r t)−1 , annualized return and volatility (scaled by √ 252), Sharpe ratio assuming zero risk-free rate, maximum drawdown, and average turnover at each rebalancing. F. Addition...

work page 2021

[11] [11]

Large β (0.5) degrades performance on Knapsack, worsening monotonically with the polynomial degree

and β= 0.1 best at Deg 4, whereas the pure gradient is the weakest of the small-β settings. Large β (0.5) degrades performance on Knapsack, worsening monotonically with the polynomial degree. PEAR is otherwise stable for β∈[0.05,0.2], and we useβ= 0.1as the default. 18 Decision-Focused Learning via Tangent-Space Projection of Prediction Error Table 8.Norm...

work page 2020

[12] [12]

Table 10.Real-world Knapsack (building investment).Normalized regret (%,↓) over 5 seeds

The budget is set to B= 0.4 P i wi, all other settings follow Appendix E.1, and we report normalized regret over 5 seeds. Table 10.Real-world Knapsack (building investment).Normalized regret (%,↓) over 5 seeds. Best inbold, second best underlined . MethodMSE SPO+ PFYL DBB LA V A PEAR Regret (%) 1.22±0.890.72±0.570.70±0.80 12.20±10.052.12±0.790.51±0.57 Tab...

work page 2006