Title resolution pending

Xupeng Miao, Chunan Shi, Jiangfei Duan, Xiaoli Xi, Dahua Lin, Bin Cui, Zhihao Jia · 2024 · arXiv 0665.36404

1 Pith paper cites this work. Polarity classification is still being indexed.


Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

KernelFlume: Elastic Core-Attention Scaling for Agentic Long-Context Decoding

cs.DC · 2026-06-28 · unverdicted · novelty 5.0

KernelFlume presents a disaggregated decode architecture that separates core attention from projection/FFN paths to enable elastic scaling of attention nodes, reporting up to 61% lower cost per million tokens versus full-instance scaling on H100 hardware for Llama-3.1-8B under dynamic long-context w

citing papers explorer


KernelFlume: Elastic Core-Attention Scaling for Agentic Long-Context Decoding cs.DC · 2026-06-28 · unverdicted · none · ref 20
