Davis and Yifan Hu

doi:10 · 2011 · arXiv 9662.204966

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset 2 · background 1 · baseline 1

citation-polarity summary

use dataset 2 · background 1 · baseline 1

representative citing papers

Hybrid Sketching Methods for Dynamic Connectivity on Sparse Graphs

cs.DS · 2026-05-14 · unverdicted · novelty 7.0

Hybrid sketching saves up to 97% space on dense graphs and 15% on sparse ones by sketching dense cores and storing sparse parts exactly, with new BalloonSketch reducing sketch sizes up to 8x.

A refined CJ--SS--RR method with a reliable removal approach of spurious Ritz values for the Hermitian eigenvalue problem

math.NA · 2026-05-13 · unverdicted · novelty 7.0

Refined SS-RRR methods with a reliable tune-free removal of spurious Ritz values improve accuracy and efficiency for computing eigenpairs of large Hermitian matrices in a target region.

AsyncSparse: Accelerating Sparse Matrix-Matrix Multiplication on Asynchronous GPU Architectures

cs.DC · 2026-04-20 · unverdicted · novelty 7.0

AsyncSparse presents BCSR and WCSR kernels that use TMA and warp specialization to accelerate SpMM, outperforming prior libraries by 1.47-6.24x on SuiteSparse and achieving 2.66x end-to-end speedup on Qwen2.5-7B at 90% block sparsity.

Partitioning Unstructured Sparse Tensor Algebra for Load-Balanced Parallel Execution

cs.PL · 2026-04-19 · unverdicted · novelty 7.0

A new partitioning algorithm that provably load-balances arbitrary sparse tensor algebra expressions by generalizing parallel merging to multi-operand, multi-dimensional hierarchical structures, implemented in a compiler framework.

PackSELL: A Sparse Matrix Format for Precision-Agnostic High-Performance SpMV

cs.DC · 2026-04-15 · unverdicted · novelty 7.0

PackSELL packs delta-encoded indices and values into single words with tunable bit allocation, delivering up to 1.63x faster FP16 SpMV and FP32-accurate performance exceeding FP16 cuSPARSE while reducing memory traffic.

Cache Blocking of Distributed-Memory Parallel Matrix Power Kernels

cs.DC · 2024-05-21 · unverdicted · novelty 7.0

Introduces Distributed Level-Blocked MPK combining RACE cache blocking with MPI, reporting substantial speedups up to 4x on 832 cores for matrix power kernels across scientific sparse matrices.

Evaluation of a Flow-Based Hypergraph Bipartitioning Algorithm

cs.DS · 2019-07-03 · unverdicted · novelty 6.0

ReBaHFC refines PaToH outputs with the new HyperFlowCutter flow algorithm to deliver hypergraph bipartition quality close to KaHyPar and hMETIS while running an order of magnitude faster.

Geometric Crossing-Minimization -- A Scalable Randomized Approach

cs.CG · 2019-07-02 · unverdicted · novelty 6.0

Presents a scalable randomized algorithm for geometric crossing minimization, including a theoretical approximation guarantee for vertex repositioning and experimental results on graphs with up to 13,000 edges.

Bridging the Gap between Sparse Matrix Reordering and Factorization: A Deep Learning Framework for Fill-in Reduction

cs.LG · 2026-05-17 · unverdicted · novelty 5.0

A GNN framework learns spectral embeddings of sparse matrices to minimize a fill-in surrogate and produces competitive reorderings versus classical graph algorithms.

Algebraic Temporal Blocking for Sparse Iterative Solvers on Multi-Core CPUs

math.NA · 2023-09-05 · unverdicted · novelty 5.0

Integrating RACE into Trilinos applies algebraic temporal blocking to MPK in s-step GMRES, polynomial preconditioners, and AMG, yielding up to 3x speedups on multi-core CPUs for MPK-dominated algorithms.

citing papers explorer

Showing 10 of 10 citing papers.

Hybrid Sketching Methods for Dynamic Connectivity on Sparse Graphs cs.DS · 2026-05-14 · unverdicted · none · ref 22
A refined CJ--SS--RR method with a reliable removal approach of spurious Ritz values for the Hermitian eigenvalue problem math.NA · 2026-05-13 · unverdicted · none · ref 2
AsyncSparse: Accelerating Sparse Matrix-Matrix Multiplication on Asynchronous GPU Architectures cs.DC · 2026-04-20 · unverdicted · none · ref 48
Partitioning Unstructured Sparse Tensor Algebra for Load-Balanced Parallel Execution cs.PL · 2026-04-19 · unverdicted · none · ref 18
PackSELL: A Sparse Matrix Format for Precision-Agnostic High-Performance SpMV cs.DC · 2026-04-15 · unverdicted · none · ref 11
Cache Blocking of Distributed-Memory Parallel Matrix Power Kernels cs.DC · 2024-05-21 · unverdicted · none · ref 8
Evaluation of a Flow-Based Hypergraph Bipartitioning Algorithm cs.DS · 2019-07-03 · unverdicted · none · ref 11
Geometric Crossing-Minimization -- A Scalable Randomized Approach cs.CG · 2019-07-02 · unverdicted · none · ref 4
Bridging the Gap between Sparse Matrix Reordering and Factorization: A Deep Learning Framework for Fill-in Reduction cs.LG · 2026-05-17 · unverdicted · none · ref 9
Algebraic Temporal Blocking for Sparse Iterative Solvers on Multi-Core CPUs math.NA · 2023-09-05 · unverdicted · none · ref 18
