CuPy: A NumPy-compatible library for NVIDIA GPU calculations

Ryosuke Okuta, Yuya Unno, Daisuke Nishino, Shohei Hido, Crissman Loomis · 2017

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Matrix-Free 3D SIMP Topology Optimization with Fused Gather-GEMM-Scatter Kernels

cs.CE · 2026-04-20 · unverdicted · novelty 6.0

A fused gather-GEMM-scatter CUDA kernel achieves 4.6-7.3x end-to-end speedup and 3.2-4.9x lower energy for matrix-free 3D SIMP topology optimization on RTX 4090 compared to three-stage baselines.

citing papers explorer

Showing 1 of 1 citing paper.

Matrix-Free 3D SIMP Topology Optimization with Fused Gather-GEMM-Scatter Kernels cs.CE · 2026-04-20 · unverdicted · none · ref 13
A fused gather-GEMM-scatter CUDA kernel achieves 4.6-7.3x end-to-end speedup and 3.2-4.9x lower energy for matrix-free 3D SIMP topology optimization on RTX 4090 compared to three-stage baselines.

CuPy: A NumPy-compatible library for NVIDIA GPU calculations

fields

years

verdicts

representative citing papers

citing papers explorer