Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

Tang, Yihong, Wang, Zhaokai, Qu, Ao, Yan, Yihao, Wu, Zhaofeng, Zhuang, Dingyi · 2024 · DOI 10.18653/v1/2024.emnlp-industry.104

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

CacheWeaver: Cache-Aware Evidence Ordering for Efficient Grounded RAG Inference

cs.CL · 2026-06-18 · unverdicted · novelty 4.0

CacheWeaver is a lightweight scheduling layer that orders evidence to exploit prefix caching, reducing median TTFT by 20-33% across vLLM setups while preserving answer quality.

citing papers explorer

CacheWeaver: Cache-Aware Evidence Ordering for Efficient Grounded RAG Inference

cs.CL · 2026-06-18 · unverdicted · none · ref 10
