Pith Number

pith:ADJSTXNE

pith:2025:ADJSTXNEHHQEQFZIW2RD2VL445
not attested · not anchored · not stored · refs resolved

LongLive: Real-time Interactive Long Video Generation

Enze Xie, Muyang Li, Ruihang Chu, Shuai Yang, Song Han, Wei Huang, Xianbang Wang, Yao Lu, Yicheng Xiao, Yingcong Chen, Yukang Chen, Yuyang Zhao

LongLive turns a short-clip autoregressive model into a real-time system that generates up to 240-second videos at 20.7 FPS while accepting streaming prompt changes.

arxiv:2509.22622 v2 · 2025-09-26 · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{ADJSTXNEHHQEQFZIW2RD2VL445}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

With these key designs, LongLive fine-tunes a 1.3B-parameter short-clip model to minute-long generation in just 32 GPU-days. At inference, LongLive sustains 20.7 FPS on a single NVIDIA H100 and achieves strong performance on VBench for both short and long videos. It supports videos of up to 240 seconds on a single H100 GPU.

C2 · weakest assumption

KV-recache combined with short-window attention and a frame sink is assumed to maintain visual consistency and semantic adherence across prompt transitions and long sequences without introducing cumulative artifacts or drift. The paper presents this as sufficient on the basis of the described training alignment.

C3 · one-line summary

LongLive is a causal autoregressive video generator that produces up to 240-second interactive videos at 20.7 FPS on one H100 GPU after 32 GPU-days of fine-tuning from a 1.3B short-clip model.

References

108 extracted · 108 resolved · 9 Pith anchors

[1] Diffusion Forcing: Next-token prediction meets full-sequence diffusion · 2024
[2] SkyReels-V2: Infinite-length film generative model · 2025 · arXiv:2504.13074
[3] SANA-Video: Efficient video generation with block linear diffusion transformer · 2025
[4] SEINE: Short-to-long video diffusion model for generative transition and prediction · 2024
[5] LongLoRA: Efficient fine-tuning of long-context large language models · 2024

Formal links

2 machine-checked theorem links

Cited by

45 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:53.619263Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

00d329dda439e0481728b6a23d557ce7605d4efdcd66dadac49cb2680477fd3f

Aliases

arxiv: 2509.22622 · arxiv_version: 2509.22622v2 · doi: 10.48550/arxiv.2509.22622 · pith_short_12: ADJSTXNEHHQE · pith_short_16: ADJSTXNEHHQEQFZI · pith_short_8: ADJSTXNE
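In the alias list above, each `pith_short_N` value reads as the first N characters of the full 26-character Pith Number. The sketch below checks that observation for this record. Prefix truncation is inferred from the values shown here, not from a documented rule, and `short_alias` and `check_aliases` are hypothetical helper names.

```python
# Full Pith Number and short aliases, copied from this record.
PITH_NUMBER = "ADJSTXNEHHQEQFZIW2RD2VL445"
ALIASES = {
    "pith_short_8": "ADJSTXNE",
    "pith_short_12": "ADJSTXNEHHQE",
    "pith_short_16": "ADJSTXNEHHQEQFZI",
}


def short_alias(pith_number: str, length: int) -> str:
    """Hypothetical helper: return the first `length` characters.

    Assumes short aliases are plain prefixes, as observed in this record.
    """
    if not 0 < length <= len(pith_number):
        raise ValueError(f"length must be in 1..{len(pith_number)}")
    return pith_number[:length]


def check_aliases(pith_number: str, aliases: dict) -> bool:
    """Check every alias against the prefix of the length named in its key."""
    return all(
        short_alias(pith_number, int(name.rsplit("_", 1)[1])) == value
        for name, value in aliases.items()
    )


print(check_aliases(PITH_NUMBER, ALIASES))  # True for the values listed above
```

A shorter prefix is easier to quote but more likely to collide with another record, so the full number remains the safer choice wherever an unambiguous identifier matters.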
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/ADJSTXNEHHQEQFZIW2RD2VL445 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 00d329dda439e0481728b6a23d557ce7605d4efdcd66dadac49cb2680477fd3f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "c515fbbf9e9c41cccec4793b0ed0a083e20133de3c46427cba0e8bf131f96a66",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2025-09-26T17:48:24Z",
    "title_canon_sha256": "638303bcc893c10be0132226122bda9d4b5bc7db58ce39c77753e57801d03740"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2509.22622",
    "kind": "arxiv",
    "version": 2
  }
}
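The shell one-liner above can also be run offline against the record as printed here. The following is a minimal sketch that applies the same canonicalization (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding) and hashes the result with SHA-256. It assumes the JSON shown on this page is byte-for-byte what the API returns as `canonical_record`, so the comparison is printed rather than asserted. The names `canonical_bytes` and `canonical_hash` are hypothetical helpers.

```python
import hashlib
import json

# Canonical hash published on this page.
EXPECTED = "00d329dda439e0481728b6a23d557ce7605d4efdcd66dadac49cb2680477fd3f"

# Canonical record JSON, copied from this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "c515fbbf9e9c41cccec4793b0ed0a083e20133de3c46427cba0e8bf131f96a66",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2025-09-26T17:48:24Z",
    "title_canon_sha256": "638303bcc893c10be0132226122bda9d4b5bc7db58ce39c77753e57801d03740"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2509.22622",
    "kind": "arxiv",
    "version": 2
  }
}
"""


def canonical_bytes(record) -> bytes:
    """Serialize with the same json.dumps options as the one-liner above."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_hash(record)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the key order or whitespace of the input JSON, which is what lets a mirror recompute the same value from its own copy.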