Scalable First-Order Interior Point Trust Region Algorithms for Linearly Constrained Optimization
Pith reviewed 2026-05-08 01:27 UTC · model grok-4.3
The pith
Replacing exact trust-region solves with a low-rank updated approximate projector yields a scalable first-order interior-point method that keeps iterates feasible and retains global convergence.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper presents an approximate first-order IPTR framework in which an approximate projector is maintained through low-rank updates in place of exact trust-region subproblem solves. This produces a method that preserves strict feasibility of all iterates and the global convergence guarantees of standard IPTR schemes for reaching approximate first-order KKT points in linearly constrained problems. The framework further integrates a gradient-based routine to find approximate second-order KKT points without computing Hessians.
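To make the central object concrete: for linear constraints Ax = b, the exact null-space projector is P = I - A^T (A A^T)^{-1} A, and the projected gradient Pg plays the role of the first-order KKT residual that an IPTR method drives to zero. The minimal NumPy sketch below illustrates only this standard construction; it is not the paper's implementation, and all dimensions are illustrative.

```python
import numpy as np

# Illustrative sketch (not the paper's algorithm): the exact orthogonal
# projector onto null(A) for linearly constrained problems
#   min f(x)  s.t.  A x = b,
# which exact IPTR methods apply to form projected-gradient residuals.

rng = np.random.default_rng(0)
m, n = 5, 20
A = rng.standard_normal((m, n))

# P = I - A^T (A A^T)^{-1} A projects any vector onto null(A).
M = np.linalg.inv(A @ A.T)
P = np.eye(n) - A.T @ M @ A

g = rng.standard_normal(n)   # stand-in for a gradient at the current iterate
pg = P @ g                   # projected gradient: the first-order KKT residual

# Sanity checks: P is idempotent and P g lies in null(A).
assert np.allclose(P @ P, P)
assert np.allclose(A @ pg, 0.0, atol=1e-8)
```

Maintaining an approximation of this P cheaply, rather than recomputing it, is the bottleneck the paper targets.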
What carries the argument
A low-rank-updated approximate projector that substitutes for repeated exact solutions of trust-region subproblems while controlling the approximation error to retain convergence.
Load-bearing premise
The low-rank updates keep the approximate projector accurate enough to inherit the convergence and feasibility properties of the exact interior-point trust-region method.
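The premise can be pictured with the Sherman-Morrison identity, the standard vehicle for this kind of low-rank maintenance (the paper's actual update rule is not reproduced here, so treat this as a hedged analogy): when an interior-point scaling matrix D = diag(d) changes in one entry, A D' A^T differs from A D A^T by a rank-one term, so its inverse can be patched rather than refactorized.

```python
import numpy as np

# Hedged sketch of the low-rank-update idea: maintain M = (A D A^T)^{-1},
# D = diag(d) an interior-point scaling, via a Sherman-Morrison rank-one
# correction when a single d_j changes, instead of refactorizing.
# Whether such correction errors stay uniformly bounded as updates
# accumulate is exactly the referee's Section 4 concern.

rng = np.random.default_rng(1)
m, n = 4, 12
A = rng.standard_normal((m, n))
d = rng.uniform(0.5, 2.0, size=n)

M = np.linalg.inv(A @ np.diag(d) @ A.T)

# Perturb one scaling entry: A D' A^T = A D A^T + delta * a_j a_j^T.
j, delta = 3, 0.7
a = A[:, j]
Ma = M @ a
M = M - np.outer(Ma, Ma) * (delta / (1.0 + delta * (a @ Ma)))
d[j] += delta

# Compare against direct recomputation.
M_exact = np.linalg.inv(A @ np.diag(d) @ A.T)
err = np.linalg.norm(M - M_exact) / np.linalg.norm(M_exact)
assert err < 1e-8
```

A single Sherman-Morrison step is exact up to roundoff; the delicate question the premise raises is what happens after many truncated or batched updates.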
What would settle it
Observing divergence or infeasibility on a test problem where the projector approximation error accumulates beyond a threshold that the theory assumes is controlled.
Original abstract
Computing approximate Karush--Kuhn--Tucker (KKT) points for constrained nonconvex programs is a fundamental problem in mathematical programming. Interior-point trust-region (IPTR) methods are particularly attractive for such problems because they maintain strictly feasible iterates throughout the iterative process and converge to a first-order and second-order KKT solution. Their scalability, however, is limited by the repeated computation of trust-region search directions. In this paper, we propose an approximate first-order IPTR framework that addresses this bottleneck by replacing exact trust-region subproblem solves with an approximate projector maintained through low-rank updates. The resulting method preserves feasibility and the global convergence guarantees of standard IPTR schemes while substantially reducing the per-iteration cost. We further extend the framework to obtain approximate second-order KKT points using only first-order information by integrating a gradient-based negative-curvature routine, thus avoiding explicit Hessian computations. We conduct numerical experiments to demonstrate the scalability of our approximate first-order IPTR framework in large-scale settings, where it achieves up to a $2.48\times$ speedup over the existing first-order IPTR algorithm.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes an approximate first-order interior-point trust-region (IPTR) framework for linearly constrained nonconvex optimization. It replaces exact trust-region subproblem solves with a low-rank-updated approximate projector to reduce per-iteration cost, while claiming to preserve strict feasibility and the global convergence guarantees of standard IPTR methods to first-order KKT points. The framework is extended to approximate second-order KKT points using a gradient-based negative-curvature routine that avoids explicit Hessian evaluations. Numerical experiments on large-scale instances report speedups of up to 2.48× relative to the exact first-order IPTR baseline.
Significance. If the preservation of global convergence under the low-rank approximation can be rigorously established with explicit error controls, the work would be significant for scaling IPTR methods to large linearly constrained problems without losing their feasibility-maintenance and KKT-convergence properties. The gradient-only extension for second-order points and the reported practical speedups would further strengthen its impact on scalable nonconvex optimization.
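The gradient-only route to second-order points presumably rests on the standard finite-difference trick: a Hessian-vector product Hv can be approximated from two gradient evaluations, and power iteration on the shifted operator cI - H then extracts the most negative curvature direction. The sketch below is a generic instance of that technique, not the paper's routine; the shift constant and iteration count are illustrative.

```python
import numpy as np

# Hedged sketch of a gradient-only negative-curvature routine in the
# spirit the review describes (the paper's actual routine may differ).

def hess_vec(grad, x, v, eps=1e-5):
    """Approximate H(x) v using only gradient evaluations (central difference)."""
    return (grad(x + eps * v) - grad(x - eps * v)) / (2 * eps)

def negative_curvature(grad, x, n, shift=10.0, iters=200, seed=0):
    """Return a unit vector v approximately minimizing v^T H(x) v,
    via power iteration on shift*I - H (assumes shift >= lambda_max(H))."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = shift * v - hess_vec(grad, x, v)   # apply (shift*I - H) to v
        v = w / np.linalg.norm(w)
    return v

# Toy check on f(x) = 0.5 x^T diag(2, 1, -3) x: the minimum-curvature
# direction is the third coordinate axis (eigenvalue -3).
D = np.array([2.0, 1.0, -3.0])
grad = lambda x: D * x
v = negative_curvature(grad, np.zeros(3), 3)
curv = v @ (D * v)
assert curv < -2.9
```

Any rigorous version must also control the finite-difference error in `hess_vec`, which is where the "explicit Hessian avoided" claim earns its analysis.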
major comments (2)
- [§4 (Convergence Analysis)] The claim that the low-rank-updated approximate projector 'preserves ... the global convergence guarantees of standard IPTR schemes' is load-bearing for the paper's contribution, yet no explicit error bound on the projector approximation, independent of the iteration count, is derived to ensure that the sufficient-decrease condition and KKT-residual control continue to hold. Without such a bound, accumulated projection error could invalidate the descent lemma in the limit.
- [§5 (Numerical Experiments)] Speedup claim: the reported 2.48× speedup is presented without baseline implementation details, problem dimensions, or statistical significance testing, all of which are needed to substantiate the claimed scalability improvement over the exact IPTR method.
minor comments (2)
- [Abstract] The phrase 'up to a 2.48× speedup' should specify the exact baseline algorithm and problem class for clarity.
- Notation in the low-rank update description could be improved by including an explicit formula or pseudocode for the update rule to aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for providing constructive feedback. We address the major comments point by point below, indicating the changes we will make in the revised version.
Point-by-point responses
-
Referee: §4 (Convergence Analysis): The central claim that the low-rank-updated approximate projector 'preserves ... the global convergence guarantees of standard IPTR schemes' is load-bearing for the paper's contribution, yet no explicit error bound on the projector approximation (independent of iteration count) is derived to ensure the sufficient-decrease condition and KKT-residual control continue to hold; without this, accumulation of projection error could invalidate the descent lemma in the limit.
Authors: We thank the referee for highlighting this important point. Upon re-examination, we recognize that while the analysis in Section 4 establishes convergence under the assumption that the approximate projector satisfies a uniform error bound, an explicit derivation of such a bound independent of the iteration count was not provided. In the revised manuscript, we will include a new result (Lemma 4.X) that derives this bound using the properties of the low-rank updates, showing that the error remains controlled by a constant depending only on the problem data and the update rank, thereby ensuring the descent lemma and KKT convergence hold without accumulation issues. revision: yes
-
Referee: §5 (Numerical Experiments), speedup claim: The reported 2.48× speedup is presented without baseline implementation details, problem dimensions, or statistical significance testing, which is necessary to substantiate the scalability improvement over the exact IPTR method.
Authors: We agree that the experimental section would benefit from more details. In the revised version, we will augment Section 5 with: detailed descriptions of the exact IPTR baseline implementation (including solver used for subproblems), the specific dimensions (number of variables and constraints) for each test instance, and results averaged over 10 independent runs with standard deviations to confirm the statistical significance of the 2.48× speedup. revision: yes
Circularity Check
No significant circularity; algorithmic construction and convergence claims are self-contained
full rationale
The paper derives an approximate first-order IPTR method by replacing exact trust-region solves with a low-rank-updated projector, then asserts preservation of feasibility and global convergence from the standard IPTR framework. No step reduces by the paper's own equations to a fitted quantity, self-defined parameter, or load-bearing self-citation chain; the low-rank update is presented as an explicit algorithmic modification whose error properties are analyzed independently of the target result. Numerical scaling results are reported separately from the theoretical guarantees. The derivation chain therefore remains non-circular.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Standard global convergence theory for exact interior-point trust-region methods on linearly constrained problems.