A new GPU-oriented batch SVD solver based on the one-sided Jacobi method delivers significant speedups over vendor libraries and prior open-source implementations across precisions and matrix shapes.
The International Journal of High Performance Computing Applications 38(5), 468–490 (2024)
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 4
roles: background 2
polarities: background 2
representative citing papers
Error analysis and cost estimator for recasting floating-point matrix multiplication as accumulated integer products on mixed-precision hardware.
Heterogeneous SYCL-based CG and Cholesky solvers deliver up to 32% and 29% faster runtimes than GPU-only versions for large matrices across multiple GPU vendors.
Review chapter summarizing advances in parallel sparse direct solvers along communication reduction and data-sparse compression axes.
citing papers explorer
- An Efficient Batch Solver for the Singular Value Decomposition on GPUs
  A new GPU-oriented batch SVD solver based on the one-sided Jacobi method delivers significant speedups over vendor libraries and prior open-source implementations across precisions and matrix shapes.
- Analysis of Floating-Point Matrix Multiplication Computed via Integer Arithmetic
  Error analysis and cost estimator for recasting floating-point matrix multiplication as accumulated integer products on mixed-precision hardware.
- Comparing the Performance of Heterogeneous Conjugate Gradient and Cholesky Solvers on Various Hardware Using SYCL
  Heterogeneous SYCL-based CG and Cholesky solvers deliver up to 32% and 29% faster runtimes than GPU-only versions for large matrices across multiple GPU vendors.
- Parallel Sparse and Data-Sparse Factorization-based Linear Solvers
  Review chapter summarizing advances in parallel sparse direct solvers along communication reduction and data-sparse compression axes.