Low synchronization GMRES algorithms

Julien Langou; Kasia Swirydowicz; Shreyas Ananthan; Stephen Thomas; Ulrike Yang

arxiv: 1809.05805 · v1 · pith:3W7GFO2Wnew · submitted 2018-09-16 · 💻 cs.NA


classification 💻 cs.NA

keywords synchronization · algorithms · gmres · global · iterated · operations · performance · pipelined


Communication-avoiding and pipelined variants of Krylov solvers are critical for the scalability of linear system solvers on future exascale architectures. We present low synchronization variants of iterated classical (CGS) and modified Gram-Schmidt (MGS) algorithms that require one and two global reduction communication steps. Derivations of low synchronization iterated CGS algorithms are based on previous work by Ruhe. Our main contribution is to introduce a backward normalization lag into the compact $WY$ form of MGS resulting in a $\mathcal{O}(\varepsilon)\kappa(A)$ stable GMRES algorithm that requires only one global synchronization per iteration. The reduction operations are overlapped with computations and pipelined to optimize performance. Further improvements in performance are achieved by accelerating GMRES BLAS-2 operations on GPUs.
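The compact $WY$ form of MGS mentioned in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation and does not include the normalization lag or any parallel communication. It only checks the algebraic identity that, for unit-norm columns $Q$, the sequential MGS projection equals $I - Q T Q^T$ with $T = (I + L)^{-1}$, where $L$ is the strictly lower triangular part of $Q^T Q$. The function names `mgs_project` and `compact_wy_project` and the test data are assumptions chosen for illustration.

```python
import numpy as np

def mgs_project(Q, a):
    """Sequential MGS: one inner product per column.

    In a distributed setting each inner product would be a separate
    global reduction (an assumption about the setting, not shown here).
    """
    w = a.copy()
    for j in range(Q.shape[1]):
        w -= (Q[:, j] @ w) * Q[:, j]
    return w

def compact_wy_project(Q, a):
    """Same projection written as a - Q T Q^T a with T = (I + L)^{-1}.

    L is the strictly lower triangular part of Q^T Q. The products
    Q^T Q and Q^T a are the only steps that would need reductions.
    """
    k = Q.shape[1]
    L = np.tril(Q.T @ Q, -1)
    # Solve (I + L) r = Q^T a instead of forming T explicitly.
    r = np.linalg.solve(np.eye(k) + L, Q.T @ a)
    return a - Q @ r

# Illustrative data: unit-norm but deliberately non-orthogonal columns,
# so that the correction matrix T differs from the identity.
rng = np.random.default_rng(0)
Q = rng.standard_normal((50, 5))
Q /= np.linalg.norm(Q, axis=0)
a = rng.standard_normal(50)

w_mgs = mgs_project(Q, a)
w_wy = compact_wy_project(Q, a)
agree = np.allclose(w_mgs, w_wy)
```

Here `agree` records whether the two projections match to floating-point tolerance. Grouping the inner products into the matrix products `Q.T @ Q` and `Q.T @ a` is what makes it possible, in principle, to batch them into fewer reductions than the column-by-column loop.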
