Parallel implementation of fast randomized algorithms for the decomposition of low rank matrices

Andrew Lucas; John Feo; Mark Stalzer

arxiv: 1205.3830 · v2 · pith:5BTRCUGSnew · submitted 2012-05-16 · 💻 cs.DC

classification 💻 cs.DC

keywords matrices, parallel, algorithms, decomposition, performance, randomized, rank, almost

abstract

We analyze the parallel performance of randomized interpolative decomposition by decomposing low rank complex-valued Gaussian random matrices up to 64 GB. We chose a Cray XMT supercomputer as it provides an almost ideal PRAM model permitting quick investigation of parallel algorithms without obfuscation from hardware idiosyncrasies. We obtain that on non-square matrices performance becomes very good, with overall runtime over 70 times faster on 128 processors. We also verify that numerically discovered error bounds still hold on matrices nearly two orders of magnitude larger than those previously tested.
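
The abstract names the technique but does not spell it out. As a rough illustration only, the following is a minimal serial NumPy sketch of a randomized interpolative decomposition applied to a small low rank complex-valued Gaussian matrix. It is not the authors' Cray XMT implementation: the function name `randomized_id`, the `oversample` parameter, the pivoted Gram-Schmidt column selection, and the test sizes are all assumptions made for this example.

```python
import numpy as np

def randomized_id(A, k, oversample=8, seed=None):
    """Sketch of a rank-k interpolative decomposition A ~= A[:, cols] @ P.

    Hypothetical helper for illustration; returns the k selected column
    indices `cols` and the k x n interpolation matrix `P`.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = min(k + oversample, m)

    # Compress the rows of A with a complex Gaussian test matrix.
    G = rng.standard_normal((l, m)) + 1j * rng.standard_normal((l, m))
    Y = G @ A  # l x n sketch with (almost surely) the same row space as A

    # Pick k pivot columns of Y by Gram-Schmidt with column pivoting.
    R = Y.copy()
    perm = np.arange(n)
    for j in range(k):
        norms = np.linalg.norm(R[:, j:], axis=0)
        p = j + int(np.argmax(norms))
        R[:, [j, p]] = R[:, [p, j]]
        perm[[j, p]] = perm[[p, j]]
        q = R[:, j] / np.linalg.norm(R[:, j])
        # Remove the component along q from the remaining columns.
        R[:, j + 1:] -= np.outer(q, q.conj() @ R[:, j + 1:])

    cols, rest = perm[:k], perm[k:]

    # Express the remaining columns of Y in terms of the selected ones.
    T = np.linalg.lstsq(Y[:, cols], Y[:, rest], rcond=None)[0]

    P = np.zeros((k, n), dtype=complex)
    P[:, cols] = np.eye(k)
    P[:, rest] = T
    return cols, P

def low_rank_gaussian(m, n, k, seed=None):
    """Build an m x n complex Gaussian matrix of rank k (illustrative sizes)."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((m, k)) + 1j * rng.standard_normal((m, k))
    C = rng.standard_normal((k, n)) + 1j * rng.standard_normal((k, n))
    return B @ C

if __name__ == "__main__":
    A = low_rank_gaussian(200, 120, 10, seed=0)
    cols, P = randomized_id(A, k=10, seed=1)
    rel_err = np.linalg.norm(A - A[:, cols] @ P) / np.linalg.norm(A)
    print("relative error:", rel_err)
```

In this sketch the dominant costs are the matrix product `G @ A` and the per-column updates inside the pivoting loop, both of which are dense linear algebra operations; how the paper distributes this work across processors is not stated in the abstract.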
