pith:LWVTT7QD
StreamingVLM: Real-Time Understanding for Infinite Video Streams
A vision-language model achieves stable real-time understanding of arbitrarily long video streams through a streaming attention cache aligned with training on short clips.
arxiv:2510.09608 v1 · 2025-10-10 · cs.CV · cs.AI · cs.CL
Claims
On Inf-Streams-Eval, StreamingVLM achieves a 66.18% win rate against GPT-4o mini and maintains stable, real-time performance at up to 8 FPS on a single NVIDIA H100.
The approach assumes that supervised fine-tuning with full attention on short, overlapped video chunks will produce stable coherence and performance when the same model is later run with the streaming KV cache on arbitrarily long, non-overlapped video streams.
StreamingVLM enables stable real-time understanding of infinite video streams at up to 8 FPS using a streaming KV cache and aligned SFT on overlapped chunks, with a 66.18% win rate over GPT-4o mini on a new two-hour video benchmark.
Receipt and verification
| First computed | 2026-05-17T23:38:14.195787Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
5dab39fe0367e9a66558f7c456b95a738cc15ff70419b293c2e5ec8f7245c54c
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/LWVTT7QDM7U2MZKY67CFNOK2OO \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 5dab39fe0367e9a66558f7c456b95a738cc15ff70419b293c2e5ec8f7245c54c
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "87a6d0d45f33b663733e1d3ccab4840f56fea1a808eabf19e20b21ee3d318aa3",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2025-10-10T17:59:58Z",
    "title_canon_sha256": "b46c56aa5c0f7276e4ddc1686d851d012fd82e8fe2abf4e0a3fb109d49914448"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2510.09608",
    "kind": "arxiv",
    "version": 1
  }
}
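The verification one-liner above can be unpacked into a small offline script. This is a minimal sketch, assuming the canonical record shown here is exactly what the API returns under `canonical_record`; the helper name `canonical_sha256` is hypothetical and not part of any Pith tooling. It applies the same serialization as the one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding) before hashing.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the same way the verification one-liner does.

    Hypothetical helper: sorted keys, compact separators, no ASCII
    escaping, UTF-8 bytes, then SHA-256.
    """
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Canonical record copied from this page (assumed to match the API response).
record = {
    "metadata": {
        "abstract_canon_sha256": "87a6d0d45f33b663733e1d3ccab4840f56fea1a808eabf19e20b21ee3d318aa3",
        "cross_cats_sorted": ["cs.AI", "cs.CL"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2025-10-10T17:59:58Z",
        "title_canon_sha256": "b46c56aa5c0f7276e4ddc1686d851d012fd82e8fe2abf4e0a3fb109d49914448",
    },
    "schema_version": "1.0",
    "source": {"id": "2510.09608", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare by eye against the "Canonical hash" value on this record
```

Because the keys are sorted before serialization, the digest does not depend on the order in which fields appear in the JSON, nor on its indentation. If the printed digest differs from the canonical hash listed on the record, the local copy of the record does not match what was signed.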