pith:OZLX7HA3
Stream2LLM: Overlap Context Streaming and Prefill for Reduced Time-to-First-Token (TTFT)
Stream2LLM overlaps incremental context retrieval with LLM prefill to cut time-to-first-token by up to 11x while matching non-streaming throughput.
arxiv:2604.16395 v3 · 2026-03-29 · cs.DB · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{OZLX7HA3YXIBF5GH76HSDTRYMI}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Our evaluation demonstrates that streaming architecture delivers up to 11x TTFT improvements, with cost-aware scheduling providing critical benefits under memory pressure, all while maintaining throughput parity with non-streaming baselines.
The two collected real-world workloads (web crawling and approximate nearest neighbor search) are representative of production streaming patterns, and the adaptive preemption strategies incur acceptable overheads that do not erode the reported TTFT gains in practice.
Stream2LLM delivers up to 11x lower TTFT by combining streamed context retrieval, adaptive preemption for append and update modes, and longest-common-prefix reuse in disaggregated LLM deployments, while preserving throughput.
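As a rough illustration of what longest-common-prefix reuse can mean for a streamed context, here is a minimal sketch. It is not taken from the paper: the function names are hypothetical, and it assumes that prompts are lists of integer token IDs and that cached prefill state can be reused exactly up to the first token that differs.

```python
from typing import List, Sequence, Tuple


def common_prefix_len(cached: Sequence[int], updated: Sequence[int]) -> int:
    """Count the leading tokens that the two sequences share."""
    n = 0
    for a, b in zip(cached, updated):
        if a != b:
            break
        n += 1
    return n


def plan_prefill(cached: Sequence[int], updated: Sequence[int]) -> Tuple[int, List[int]]:
    """Return (tokens whose cached state is reusable, tokens still to prefill).

    Assumption: everything after the first differing token must be recomputed.
    """
    keep = common_prefix_len(cached, updated)
    return keep, list(updated[keep:])


cached_tokens = [1, 2, 3, 4, 5]
appended = [1, 2, 3, 4, 5, 6, 7]   # new context appended to the end
rewritten = [1, 2, 9, 4, 5]        # context changed in the middle

print(plan_prefill(cached_tokens, appended))   # (5, [6, 7])
print(plan_prefill(cached_tokens, rewritten))  # (2, [9, 4, 5])
```

Under these assumptions, an append keeps the whole cached prefix and only prefills the new tail, while an in-place change invalidates everything from the first changed token onward.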
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-20T00:04:31.857493Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) |
| Schema | pith-number/v1.0 |
Canonical hash
76577f9c1bc5d012f4c7ff8f21ce38621db5bd06b819327452f90da7b015b9b3
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/OZLX7HA3YXIBF5GH76HSDTRYMI \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 76577f9c1bc5d012f4c7ff8f21ce38621db5bd06b819327452f90da7b015b9b3
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "018b068a404b395d18a282dcfae45ad08bbeb8b5f35a74b5176d4b4f954a889b",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.DB",
    "submitted_at": "2026-03-29T06:49:12Z",
    "title_canon_sha256": "202db2d44f5a870a62c62d6b1f77396d4a4b6774a2324eb2d9b11247275e198d"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.16395",
    "kind": "arxiv",
    "version": 3
  }
}
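The verification one-liner can also be written as a short offline script. The sketch below applies the same canonicalization as the shell command (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes) to the record as it is displayed on this page. The function name `canonical_sha256` and the variable names are illustrative, and the script assumes the displayed record is exactly what the API returns under `canonical_record`; if it is not, the digest will differ.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Record copied from the "Canonical record JSON" section of this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "018b068a404b395d18a282dcfae45ad08bbeb8b5f35a74b5176d4b4f954a889b",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.DB",
        "submitted_at": "2026-03-29T06:49:12Z",
        "title_canon_sha256": "202db2d44f5a870a62c62d6b1f77396d4a4b6774a2324eb2d9b11247275e198d",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.16395", "kind": "arxiv", "version": 3},
}

# Hash listed under "Canonical hash" on this page.
expected = "76577f9c1bc5d012f4c7ff8f21ce38621db5bd06b819327452f90da7b015b9b3"

digest = canonical_sha256(record)
print(digest)
print("matches listed hash:", digest == expected)
```

Because keys are sorted before hashing, the order in which fields appear in the copied record does not affect the digest; only the keys, values, and their types do.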