A dynamic task dispatcher enables runtime assignment of mixed-criticality tasks to Versal AIE tiles, cutting idle time by 65.5% with under 0.002% overhead in an autonomous driving workload.
arXiv preprint arXiv:2505.11970
2 Pith papers cite this work.
citing papers explorer
- Enabling Mixed criticality applications for the Versal AI-Engines
  A dynamic task dispatcher enables runtime assignment of mixed-criticality tasks to Versal AIE tiles, cutting idle time by 65.5% with under 0.002% overhead in an autonomous driving workload.
- ShadowNPU: System and Algorithm Co-design for NPU-Centric On-Device LLM Inference
  ShadowNPU presents shadowAttn, a co-designed sparse attention system that uses NPU pilot compute and techniques like graph bucketing and per-head sparsity to minimize CPU/GPU fallback during on-device LLM inference while maintaining accuracy.