SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures

2022 · DOI 10.1145/3508041

3 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

PRISM: Processing-In-Memory Sparse MTTKRP for Tensor Decomposition Acceleration

cs.DC · 2026-05-28 · unverdicted · novelty 6.0

PRISM is the first PIM-based approach for sparse MTTKRP, reporting up to 2.64x speedup over CPU baselines on UPMEM hardware with heterogeneous CPU collaboration.

TokenStack: A Heterogeneous HBM-PIM Architecture and Runtime for Efficient LLM Inference

cs.AR · 2026-05-07 · unverdicted · novelty 6.0

TokenStack's heterogeneous HBM-PIM design with base-die control and topology-aware KV placement delivers 1.62x higher geometric-mean token throughput and 1.70x SLO-compliant serving capacity than AttAcc while cutting per-token energy by 30-47%.

PIM-CACHE: High-Efficiency Content-Aware Copy for Processing-In-Memory

cs.ET · 2026-03-24 · unverdicted · novelty 5.0

PIM-CACHE reduces mandatory coarse-grained transfers in UPMEM-style PIM by dynamically staging only non-redundant data via content-aware copy that exploits workload similarity.

citing papers explorer


PRISM: Processing-In-Memory Sparse MTTKRP for Tensor Decomposition Acceleration cs.DC · 2026-05-28 · unverdicted · none · ref 8
TokenStack: A Heterogeneous HBM-PIM Architecture and Runtime for Efficient LLM Inference cs.AR · 2026-05-07 · unverdicted · none · ref 9
PIM-CACHE: High-Efficiency Content-Aware Copy for Processing-In-Memory cs.ET · 2026-03-24 · unverdicted · none · ref 25
