Methods for compressible fluid simulation on GPUs using high-order finite differences


arXiv: 1707.08900 · v1 · pith:7Z6JHQM5new · submitted 2017-07-27 · physics.comp-ph · astro-ph.IM · cs.DC · physics.flu-dyn


Johannes Pekkilä (1), Miikka S. Väisälä (2), Maarit J. Käpylä (3, 1), Petri J. Käpylä (4, 1, 3), Omer Anjum (5, 1)

((1) ReSoLVE Center of Excellence, Aalto; (2) Department of Physics, University of Helsinki; (3) Max-Planck-Institut für Sonnensystemforschung; (4) AIP; (5) Nokia Solutions Networks, Finland)


classification: physics.comp-ph, astro-ph.IM, cs.DC, physics.flu-dyn

keywords: compressible, bandwidth-bound, cache, fluid, fluids, high-order, implementation, latency-bound


We focus on implementing and optimizing a sixth-order finite-difference solver for simulating compressible fluids on a GPU using third-order Runge-Kutta integration. Graphics processing units perform well in data-parallel tasks, which makes them an attractive platform for fluid simulation. However, high-order stencil computation is memory-intensive with respect to both main memory and the caches of the GPU. We present two approaches for simulating compressible fluids using 55-point and 19-point stencils. We seek to reduce the requirements for memory bandwidth and cache size in our methods by using cache blocking and by decomposing a latency-bound kernel into several bandwidth-bound kernels. Our fastest implementation is bandwidth-bound and integrates 343 million grid points per second on a Tesla K40t GPU, achieving a 3.6× speedup over a comparable hydrodynamics solver benchmarked on two Intel Xeon E5-2690v3 processors. Our alternative GPU implementation is latency-bound and achieves a rate of 168 million updates per second.
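To make "sixth-order finite difference" concrete, the following is a minimal sketch of the standard 7-point central stencil for a first derivative, written in plain Python on a 1D periodic grid. It is not the authors' GPU code: the function names `ddx_sixth_order` and `max_error`, the periodic boundary treatment, and the sine test function are all assumptions chosen for illustration.

```python
import math

# Standard sixth-order central-difference weights for the first derivative,
# listed for offsets -3..3 (the sum is divided by the grid spacing dx).
COEFFS = (-1 / 60, 3 / 20, -3 / 4, 0.0, 3 / 4, -3 / 20, 1 / 60)


def ddx_sixth_order(f, dx):
    """Approximate df/dx on a periodic 1D grid with a 7-point stencil.

    Periodic wrap-around is an assumption of this sketch; the abstract
    does not state which boundary conditions the solver uses.
    """
    n = len(f)
    out = []
    for i in range(n):
        acc = 0.0
        for k, c in zip(range(-3, 4), COEFFS):
            acc += c * f[(i + k) % n]  # neighbour at offset k, wrapped
        out.append(acc / dx)
    return out


def max_error(n):
    """Maximum error of the stencil applied to sin(x) on [0, 2*pi)."""
    dx = 2.0 * math.pi / n
    xs = [i * dx for i in range(n)]
    approx = ddx_sixth_order([math.sin(x) for x in xs], dx)
    return max(abs(a - math.cos(x)) for a, x in zip(approx, xs))


if __name__ == "__main__":
    # Halving dx should shrink the error by roughly 2**6 for a
    # sixth-order scheme.
    print(max_error(32), max_error(64))
```

Each output point reads three neighbours on either side, which is why such stencils put pressure on memory bandwidth and cache. Applying the same radius-3 pattern along all three axes of a 3D grid touches 1 + 3·6 = 19 points, which is consistent with the smaller stencil size quoted in the abstract, although the abstract itself does not describe the stencil shapes.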
