Development of Lattice QCD Tool Kit on Cell Broadband Engine Processor
classification
hep-lat
keywords
gflops, cell, speed, lattice, matrix, multiplication, peak, tool
abstract
We report an implementation of SU(3) matrix multiplication on the Cell/B.E., which is part of our project, Lattice Tool Kit on Cell/B.E. On a QS20, the speed of the matrix multiplication on the SPEs in single precision is 227 GFLOPS. Including data transfer from main memory by DMA, it becomes 20 GFLOPS (this value was remeasured and corrected), which is 4.6% of the hardware peak speed (460 GFLOPS) and 7.4% of the theoretical peak speed of this calculation (268.77 GFLOPS). We briefly describe our tuning procedure.
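For orientation, the kernel being timed is a product of two 3x3 complex matrices. The sketch below is a plain Python illustration, not the authors' SPE code: the function name `su3_multiply`, the nested-list layout, and the operation-counting convention (6 real operations per complex multiply, 2 per complex add) are assumptions made here. It shows where the usual count of 198 floating-point operations per SU(3) product comes from.

```python
import cmath


def su3_multiply(a, b):
    """Return (c, flops) for c = a * b, with a and b 3x3 complex nested lists.

    The flop count uses an assumed convention: a complex multiply costs
    6 real operations (4 multiplies + 2 adds), a complex add costs 2.
    """
    c = [[0j] * 3 for _ in range(3)]
    flops = 0
    for i in range(3):
        for j in range(3):
            acc = a[i][0] * b[0][j]  # first term: one complex multiply
            flops += 6
            for k in range(1, 3):
                acc += a[i][k] * b[k][j]  # complex multiply plus complex add
                flops += 6 + 2
            c[i][j] = acc
    return c, flops


# Hypothetical test input: a diagonal SU(3) element (unit determinant).
theta = 0.3
u = [
    [cmath.exp(1j * theta), 0j, 0j],
    [0j, cmath.exp(-1j * theta), 0j],
    [0j, 0j, 1 + 0j],
]
identity = [[1 + 0j if i == j else 0j for j in range(3)] for i in range(3)]

product, flops = su3_multiply(identity, u)

# Single precision: 9 complex entries * 2 floats * 4 bytes = 72 bytes per matrix.
# Assuming two full matrices are loaded and one is stored per product:
bytes_moved = 3 * 72
```

Under these assumptions each product costs 198 operations and moves 216 bytes, less than one operation per byte, which is consistent with the large drop in speed once the transfer from main memory is included.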