pith. sign in

arxiv: 1109.1264 · v1 · pith:YKSO6RFInew · submitted 2011-09-06 · 💻 cs.MS

A New Vectorization Technique for Expression Templates in C++

classification 💻 cs.MS
keywords templatesexpressionlibraryperformanceblastemplatealgebraapproach
0
0 comments X
read the original abstract

Vector operations play an important role in high performance computing and are typically provided by highly optimized libraries that implement the BLAS (Basic Linear Algebra Subprograms) interface. In C++ templates and operator overloading allow the implementation of these vector operations as expression templates which construct custom loops at compile time and providing a more abstract interface. Unfortunately existing expression template libraries lack the performance of fast BLAS(Basic Linear Algebra Subprograms) implementations. This paper presents a new approach - Statically Accelerated Loop Templates (SALT) - to close this performance gap by combining expression templates with an aggressive loop unrolling technique. Benchmarks were conducted using the Intel C++ compiler and GNU Compiler Collection to assess the performance of our library relative to Intel's Math Kernel Library as well as the Eigen template library. The results show that the approach is able to provide optimization comparable to the fastest available BLAS implementations, while retaining the convenience and flexibility of a template library.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.