pith. machine review for the scientific record. sign in

arxiv: 1509.07919 · v1 · submitted 2015-09-25 · 💻 cs.DC · cs.MS· cs.NA

Recognition: unknown

Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards

Authors on Pith no claims yet
classification 💻 cs.DC cs.MScs.NA
keywords approachsystemsbandeddensediagonaldirectefficiencylinear
0
0 comments X
read the original abstract

We discuss an approach for solving sparse or dense banded linear systems ${\bf A} {\bf x} = {\bf b}$ on a Graphics Processing Unit (GPU) card. The matrix ${\bf A} \in {\mathbb{R}}^{N \times N}$ is possibly nonsymmetric and moderately large; i.e., $10000 \leq N \leq 500000$. The ${\it split\ and\ parallelize}$ (${\tt SaP}$) approach seeks to partition the matrix ${\bf A}$ into diagonal sub-blocks ${\bf A}_i$, $i=1,\ldots,P$, which are independently factored in parallel. The solution may choose to consider or to ignore the matrices that couple the diagonal sub-blocks ${\bf A}_i$. This approach, along with the Krylov subspace-based iterative method that it preconditions, are implemented in a solver called ${\tt SaP::GPU}$, which is compared in terms of efficiency with three commonly used sparse direct solvers: ${\tt PARDISO}$, ${\tt SuperLU}$, and ${\tt MUMPS}$. ${\tt SaP::GPU}$, which runs entirely on the GPU except several stages involved in preliminary row-column permutations, is robust and compares well in terms of efficiency with the aforementioned direct solvers. In a comparison against Intel's ${\tt MKL}$, ${\tt SaP::GPU}$ also fares well when used to solve dense banded systems that are close to being diagonally dominant. ${\tt SaP::GPU}$ is publicly available and distributed as open source under a permissive BSD3 license.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.