In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (Vancouver, BC, Canada) (ASPLOS 2023), Tor M

· 2023 · arXiv 2016.358206

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2 · method 1

citation-polarity summary

background 2 · use method 1

representative citing papers

SegFold: Accelerating Sparse GEMM with a Fine-Grained Dynamic Dataflow

cs.AR · 2026-06-25 · unverdicted · novelty 7.0

SegFold achieves 1.95× geometric-mean speedup over prior SpGEMM accelerators via fine-grained dynamic scheduling and remapping in its Segment dataflow.

Partitioning Unstructured Sparse Tensor Algebra for Load-Balanced Parallel Execution

cs.PL · 2026-04-19 · unverdicted · novelty 7.0

Introduces a new partitioning algorithm that provably load-balances arbitrary sparse tensor algebra expressions by generalizing parallel merging to multi-operand, multi-dimensional hierarchical structures, and implements it in a compiler framework.

The Kernel's Write: Application Read-Only Memory

cs.AR · 2026-06-18 · unverdicted · novelty 6.0

Proposes AROM to shift LtRAM management to the OS by making pages read-only to applications, using CoW faults for writes to simplify DIMM hardware.

CXL-ClusterSim: Modeling CXL-based Disaggregated Memory Cluster for Pooling and Sharing using gem5 and SST

cs.AR · 2026-05-26 · unverdicted · novelty 6.0

CXL-ClusterSim is a full-system simulation framework combining gem5 and SST to model CXL disaggregated memory for pooling and sharing.

Proxics: an efficient programming model for far memory accelerators

cs.OS · 2026-04-20 · conditional · novelty 6.0

Proxics introduces lightweight virtual processors and low-latency communication channels as portable OS abstractions for programming near-data processing accelerators, demonstrated on real hardware for memory-intensive workloads.

Mambalaya: Einsum-Based Fusion Optimizations on State-Space Models

cs.AR · 2026-04-04 · unverdicted · novelty 6.0

Mambalaya delivers 4.9x prefill and 1.9x generation speedups on Mamba layers over prior accelerators by systematically fusing inter-Einsum operations.

CCCL: Node-Spanning GPU Collectives with CXL Memory Pooling

cs.DC · 2026-02-25 · unverdicted · novelty 6.0

CCCL delivers 1.34-1.94x faster cross-node GPU collectives via CXL memory pooling than 200 Gbps InfiniBand RDMA, with 1.11x LLM training speedup and 2.75x hardware cost reduction.

Equilibria: Fair Multi-Tenant CXL Memory Tiering At Scale

cs.OS · 2026-02-09 · conditional · novelty 6.0

Equilibria delivers per-container fairness controls and observability for CXL memory tiering, improving production workload performance by up to 52% over Linux TPP while suppressing noisy-neighbor interference.

PRISM: Probabilistic Runtime Insights and Scalable Performance Modeling for Large-Scale Distributed Training

cs.DC · 2025-10-17 · unverdicted · novelty 5.0

PRISM introduces a probabilistic performance modeling framework that quantifies guarantees on training time for large-scale distributed systems under runtime variability.

The EDGE Language: Extended General Einsums for Graph Algorithms

cs.DS · 2024-04-17 · 2 refs

citing papers explorer

The EDGE Language: Extended General Einsums for Graph Algorithms

cs.DS · 2024-04-17 · unreviewed · ref 63 · 2 links
