TileFuse introduces a fused kernel library enabling AWQ W4A16/W8A16 quantized LLM inference on AMD NPUs, reporting up to 2.0x lower prefilling latency and 64.6% lower energy on Ryzen AI laptops.
In: Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (2024)
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: method 1
citation-polarity summary
verdicts: UNVERDICTED 2
roles: method 1
polarities: use method 1
representative citing papers
A literature survey on abstract concept recognition in videos that catalogs prior tasks and datasets while advocating for foundation models and reuse of decades of community experience.