Performance characteristics of a parallel treecode
I describe here the performance of a parallel treecode with individual particle timesteps. The code is based on the Barnes-Hut algorithm and runs cosmological N-body simulations on distributed-memory parallel machines using the MPI message-passing library. For a configuration with a constant number of particles per processor, the scalability of the code has been tested up to $P=32$ processors. The average CPU time per processor needed to solve the gravitational interactions is within $\sim 10\%$ of that expected from the ideal scaling relation. The load-balancing efficiency is high ($\gtrsim 90\%$) if the processor domains are determined every large timestep according to a weighting scheme that takes into account the total particle computational load within the timestep.
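The idea of work-weighted domains can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes particles are already ordered along a single coordinate, that each particle carries a weight equal to the number of force evaluations it needs within one large timestep, and that load-balancing efficiency is measured as mean load divided by maximum load. The names `split_by_work`, `domain_loads`, and `balance_efficiency` are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def split_by_work(weights, n_proc):
    """Return n_proc + 1 cut indices that split `weights` into contiguous
    chunks of roughly equal total work (assumed 1D ordering of particles)."""
    cum = list(accumulate(weights))
    total = cum[-1]
    cuts = [0]
    for k in range(1, n_proc):
        target = total * k / n_proc
        # First particle at which the cumulative work reaches the target;
        # the cut is placed just after that particle.
        idx = bisect_left(cum, target) + 1
        cuts.append(max(idx, cuts[-1]))
    cuts.append(len(weights))
    return cuts


def domain_loads(weights, cuts):
    """Total work assigned to each domain defined by consecutive cuts."""
    return [sum(weights[a:b]) for a, b in zip(cuts, cuts[1:])]


def balance_efficiency(loads):
    """Assumed definition: mean load over maximum load (1.0 is perfect)."""
    return (sum(loads) / len(loads)) / max(loads)


# Hypothetical workload: two particles on short timesteps need 4 force
# evaluations per large timestep, the other eight need 1 each.
weights = [4, 4, 1, 1, 1, 1, 1, 1, 1, 1]

weighted_cuts = split_by_work(weights, 2)   # cuts by equal work
equal_count_cuts = [0, 5, 10]               # cuts by equal particle number

weighted_eff = balance_efficiency(domain_loads(weights, weighted_cuts))
equal_count_eff = balance_efficiency(domain_loads(weights, equal_count_cuts))
```

In this toy case the work-weighted split gives both processors a load of 8, whereas splitting by particle count gives loads of 11 and 5, so the processor with the short-timestep particles becomes the bottleneck.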