A massively parallel adaptive fast-multipole method on heterogeneous architectures

Lashuk, Ilya, Chandramowlishwaran, Aparna, Langston, Harper, Nguyen, Tuan-Anh, Sampath, Rahul, Shringarpure, Aashay · 2009 · arXiv 4059.165411

2 Pith papers cite this work. Polarity classification is still indexing.



representative citing papers

A performance portable fast Ewald summation for Stokes flow

math.NA · 2026-06-17 · unverdicted · novelty 6.0

Portable Ewald summation algorithms for Stokes flow achieve ~8M particles/sec on H200 GPU with a novel P2G kernel providing 16x speedup and good multi-GPU scaling.

Efficient and Portable Support for Overdecomposition on Distributed Memory GPGPU Platforms

cs.DC · 2026-05-12 · unverdicted · novelty 4.0

Charm++ techniques enable efficient overdecomposition on multi-vendor GPGPU distributed systems.

citing papers explorer


A performance portable fast Ewald summation for Stokes flow

math.NA · 2026-06-17 · unverdicted · none · ref 3

Portable Ewald summation algorithms for Stokes flow achieve ~8M particles/sec on H200 GPU with a novel P2G kernel providing 16x speedup and good multi-GPU scaling.
