Fast and scalable FFT-based GPU-accelerated algorithms for block-triangular Toeplitz matrices with application to linear inverse problems governed by autonomous dynamical systems
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.DC (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: method (1)
polarities: use method (1)
representative citing papers
A reformulation of Bayesian OED as dense matrix subset selection plus a pipelined Schur-complement greedy algorithm on hundreds of GPUs enables optimization of 175-sensor networks for billion-degree-of-freedom tsunami models with near-perfect scaling.
citing papers explorer
- Accelerating High-Order Finite Element Simulations at Extreme Scale with FP64 Tensor Cores
  FP64 tensor cores accelerate high-order finite-element kernels in MFEM by up to 2x with 83% energy gains and near-perfect weak scaling on exascale hardware.
- Sensor Placement for Tsunami Early Warning via Large-Scale Bayesian Optimal Experimental Design
  A reformulation of Bayesian OED as dense matrix subset selection plus a pipelined Schur-complement greedy algorithm on hundreds of GPUs enables optimization of 175-sensor networks for billion-degree-of-freedom tsunami models with near-perfect scaling.