arXiv preprint arXiv:2505.11970

An Zou, Yuankai Xu, Yinchen Ni, Jintao Chen, Yehan Ma, Jing Li, Christopher Gill, Xuan Zhang, Yier Jin · 2025 · arXiv 2505.11970

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Enabling Mixed criticality applications for the Versal AI-Engines

cs.AR · 2026-04-22 · unverdicted · novelty 7.0

A dynamic task dispatcher enables runtime assignment of mixed-criticality tasks to Versal AIE tiles, cutting idle time by 65.5% with under 0.002% overhead in an autonomous driving workload.

ShadowNPU: System and Algorithm Co-design for NPU-Centric On-Device LLM Inference

cs.PF · 2025-08-22 · unverdicted · novelty 5.0

ShadowNPU presents shadowAttn, a co-designed sparse attention system that uses NPU pilot compute and techniques like graph bucketing and per-head sparsity to minimize CPU/GPU fallback during on-device LLM inference while maintaining accuracy.

citing papers explorer

Showing 2 of 2 citing papers.

Enabling Mixed criticality applications for the Versal AI-Engines · cs.AR · 2026-04-22 · unverdicted · none · ref 8
ShadowNPU: System and Algorithm Co-design for NPU-Centric On-Device LLM Inference · cs.PF · 2025-08-22 · unverdicted · none · ref 86
