Accelerating the nonuniform fast Fourier transform
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Representative citing papers
Portable Ewald summation algorithms for Stokes flow achieve ~8M particles/sec on H200 GPU with a novel P2G kernel providing 16x speedup and good multi-GPU scaling.
FLaG is a frequency-domain module using FFT, latent queries, and gating that improves token aggregation and shows gains on ESM2 AMP prediction and CIFAR-100 image classification while staying competitive on text tasks.
Citing papers explorer
- Fast summation on rectangular cuboids with arbitrary periodicity in the DMK framework
DMK extended to rectangular cuboids with arbitrary periodicity via localized octree evaluations on cubical tilings and Fourier-space root-level summation with truncated kernels for reduced periodicity.
- A performance portable fast Ewald summation for Stokes flow
Portable Ewald summation algorithms for Stokes flow achieve ~8M particles/sec on H200 GPU with a novel P2G kernel providing 16x speedup and good multi-GPU scaling.