Title resolution pending

Garcia-Hernando G, Yuan S, Baek S, et al. · 2018 · arXiv 2018.00050

8 Pith papers cite this work. Polarity classification is still indexing.

8 Pith papers citing it

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Mass Matrix Assembly on Tensor Cores for Implicit Particle-In-Cell Methods

cs.CE · 2026-04-21 · unverdicted · novelty 7.0

Mass matrix assembly for implicit PIC methods can be exactly reformulated cell-by-cell as tensor-core matrix products, delivering up to 3x kernel speedup and 15% end-to-end runtime reduction in ECSIM simulations.

Accelerating Locality-Driven Integration in Quantum Chemistry with Block-Structured Matrix Multiplication

physics.comp-ph · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

KerneLDI accelerates exchange-correlation integration in Kohn-Sham DFT by up to 10x through block-structured matrix multiplication that exploits spatial locality on GPUs while preserving accuracy.

Matrix-Free 3D SIMP Topology Optimization with Fused Gather-GEMM-Scatter Kernels

cs.CE · 2026-04-20 · unverdicted · novelty 6.0

A fused gather-GEMM-scatter CUDA kernel achieves 4.6-7.3x end-to-end speedup and 3.2-4.9x lower energy for matrix-free 3D SIMP topology optimization on RTX 4090 compared to three-stage baselines.

Iterative Refinement for Diagonalizable Non-Hermitian Eigendecompositions

math.NA · 2026-04-03 · unverdicted · novelty 6.0

The paper introduces matrix-multiplication-based iterative refinement for diagonalizable non-Hermitian eigendecompositions that achieves quadratic residual reduction for simple eigenvalues and includes cluster stabilization.

Mixed-precision iterative refinement for low-rank Lyapunov equations

math.NA · 2025-10-02 · unverdicted · novelty 6.0

Develops mixed-precision iterative refinement for low-rank Lyapunov equations with rounding error analysis enabling reduced precision for moderately conditioned problems.

Efficient Page Migration in Hybrid Memory Systems

cs.AR · 2026-04-21 · unverdicted · novelty 5.0

Duon eliminates TLB shootdown and cache invalidation costs during page migration in flat-address hybrid memory systems by updating mappings in-place, delivering 3.87% IPC gains over prior methods.

Sustaining Exascale Performance: Lessons from HPL and HPL-MxP on Aurora

cs.DC · 2026-04-10 · unverdicted · novelty 4.0

Aurora reached 1.01 EF/s FP64 HPL and 11.64 EF/s HPL-MxP through locality-aware mapping, CPU-GPU pipelining, mixed-precision orchestration, and hybrid resilience on a large Intel GPU-based system.

Visual Hand Gesture Recognition with Deep Learning: A Comprehensive Review of Methods, Datasets, Challenges and Future Research Directions

cs.CV · 2025-07-06 · unverdicted · novelty 2.0

A literature review that categorizes deep learning approaches for visual hand gesture recognition, summarizes state-of-the-art methods across tasks, reviews datasets and metrics, and identifies challenges and future directions.

citing papers explorer

Showing 8 of 8 citing papers.

Mass Matrix Assembly on Tensor Cores for Implicit Particle-In-Cell Methods

cs.CE · 2026-04-21 · unverdicted · none · ref 6

Accelerating Locality-Driven Integration in Quantum Chemistry with Block-Structured Matrix Multiplication

physics.comp-ph · 2026-05-11 · unverdicted · none · ref 36 · 2 links

Matrix-Free 3D SIMP Topology Optimization with Fused Gather-GEMM-Scatter Kernels

cs.CE · 2026-04-20 · unverdicted · none · ref 20

Iterative Refinement for Diagonalizable Non-Hermitian Eigendecompositions

math.NA · 2026-04-03 · unverdicted · none · ref 25

Mixed-precision iterative refinement for low-rank Lyapunov equations

math.NA · 2025-10-02 · unverdicted · none · ref 15

Efficient Page Migration in Hybrid Memory Systems

cs.AR · 2026-04-21 · unverdicted · none · ref 20

Sustaining Exascale Performance: Lessons from HPL and HPL-MxP on Aurora

cs.DC · 2026-04-10 · unverdicted · none · ref 19

Visual Hand Gesture Recognition with Deep Learning: A Comprehensive Review of Methods, Datasets, Challenges and Future Research Directions

cs.CV · 2025-07-06 · unverdicted · none · ref 50
