Graham Lopez, Matthew B

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.DC 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

KernelFlume: Elastic Core-Attention Scaling for Agentic Long-Context Decoding

cs.DC · 2026-06-28 · unverdicted · novelty 5.0

KernelFlume presents a disaggregated decode architecture that separates core attention from projection/FFN paths to enable elastic scaling of attention nodes, reporting up to 61% lower cost per million tokens versus full-instance scaling on H100 hardware for Llama-3.1-8B under dynamic long-context w

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • KernelFlume: Elastic Core-Attention Scaling for Agentic Long-Context Decoding · cs.DC · 2026-06-28 · unverdicted · none · ref 29
