Mathematical Programming Computation
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- Convex Optimization for Alignment and Preference Learning on a Single GPU
  COALA applies convex optimization reformulations of neural networks to direct preference optimization, claiming single-GPU training with ~18% of DPO's TFLOPs and competitive performance on multiple datasets and models up to 8B parameters.
- Optimizing Trajectory-Trees in Belief Space: An Application from Model Predictive Control to Task and Motion Planning
  Optimizing trajectory-trees in belief space improves performance in partially observable robotic planning by capturing observation-dependent contingencies, shown via PO-MPC with D-AuLa optimization and PO-LGP extending LGP.
- Decision-Focused Learning via Tangent-Space Projection of Prediction Error
  PEAR computes regret gradients via tangent-space projection of prediction error, delivering top decision quality and efficiency on LP and QP tasks without solver differentiation.