A Non-Monotone Preconditioned Trust-Region Method for Neural Network Training

Andrea Angino; Bindi Çapriqi; Ken Trotti; Rolf Krause; Shega Likaj

arxiv: 2605.14860 · v1 · pith:HCOQHVM5new · submitted 2026-05-14 · 🧮 math.OC · cs.LG



keywords: APTS, non-monotone, trust-region, global, network, neural, parallel, preconditioned


Training deep neural networks at scale can benefit from domain decomposition, where the network is split into subdomains trained in parallel and coupled by a global trust-region mechanism. Building on the Additively Preconditioned Trust-Region Strategy (APTS), we propose a non-monotone variant with a nonlinear additive Schwarz preconditioner that combines parallel subdomain corrections with global coarse-space directions. A windowed acceptance criterion allows controlled objective increases, avoiding needless rejection of effective coarse steps. The resulting non-monotone APTS (NAPTS) preserves accuracy while reducing CPU time by 30% and cutting rejected steps to one third of those in APTS.
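The abstract does not spell out the windowed acceptance criterion, so the following is only a minimal sketch of one classical non-monotone trust-region test, not necessarily the rule used in NAPTS. It assumes the reference value is the maximum objective over the last few accepted iterates. The function name `nonmonotone_accept`, the threshold `eta`, and the window length are hypothetical choices made for illustration.

```python
from collections import deque


def nonmonotone_accept(history, f_trial, predicted_reduction, eta=0.1):
    """Windowed (non-monotone) trust-region acceptance test.

    history: objective values of the most recent accepted iterates
             (the window); assumed non-empty.
    f_trial: objective value at the trial point.
    predicted_reduction: decrease predicted by the model; must be > 0.
    eta: acceptance threshold (hypothetical default).
    Returns (accepted, rho).
    """
    if predicted_reduction <= 0:
        # A model that predicts no decrease cannot justify the step.
        return False, float("-inf")
    # Assumption: compare against the worst value in the window
    # instead of the current value, as in classical non-monotone schemes.
    f_ref = max(history)
    rho = (f_ref - f_trial) / predicted_reduction
    return rho >= eta, rho


# Window of the three most recent accepted objective values.
history = deque([1.0, 0.8, 0.9], maxlen=3)

# The trial value 0.95 is worse than the current value 0.9,
# but better than the window maximum 1.0.
accepted, rho = nonmonotone_accept(history, f_trial=0.95, predicted_reduction=0.1)

# A monotone test is the special case of a window of length one.
monotone_accepted, _ = nonmonotone_accept([history[-1]], 0.95, 0.1)

if accepted:
    history.append(0.95)  # the oldest value drops out of the window
```

In this toy setting the windowed test accepts a step that a strictly monotone test would reject, which is the kind of controlled objective increase the abstract refers to.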
