pith:U52T37GU
Latent Video Diffusion Models for High-Fidelity Long Video Generation
Video diffusion models shift to a low-dimensional 3D latent space to generate realistic clips longer than 1000 frames with modest compute.
arxiv:2211.13221 v2 · 2022-11-23 · cs.CV · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{U52T37GUSXOPBDROGA4KV2MDWC}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
we introduce lightweight video diffusion models by leveraging a low-dimensional 3D latent space, significantly outperforming previous pixel-space video diffusion models under a limited computational budget... hierarchical diffusion in the latent space such that longer videos with more than one thousand frames can be produced... conditional latent perturbation and unconditional guidance that effectively mitigate the accumulated errors during the extension of video length.
The low-dimensional 3D latent space preserves sufficient spatial-temporal detail for high-fidelity generation, and the added perturbation and guidance steps prevent error accumulation without introducing new artifacts or inconsistencies.
Latent-space hierarchical diffusion models with targeted error-correction techniques generate realistic videos exceeding 1000 frames while using less compute than prior pixel-space approaches.
Receipt and verification
| First computed | 2026-05-17T23:38:53.534898Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
a7753dfcd495dcf08e2e3038aae983b0816e34f4ab1fadbd1e3ba9fe6640db33
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/U52T37GUSXOPBDROGA4KV2MDWC \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: a7753dfcd495dcf08e2e3038aae983b0816e34f4ab1fadbd1e3ba9fe6640db33
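The same check can be sketched in pure Python, which may be easier to reuse than the shell pipeline. This is a minimal sketch that mirrors the serialization used in the one-liner above (sorted keys, compact separators, non-ASCII characters kept as-is, UTF-8 bytes). The function names `canonical_bytes`, `canonical_sha256`, and `matches` are illustrative and not part of any Pith tooling.

```python
import hashlib
import json


def canonical_bytes(record):
    """Serialize a record as the shell one-liner does:
    sorted keys, no whitespace between tokens, UTF-8 encoded."""
    text = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return text.encode("utf-8")


def canonical_sha256(record):
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


def matches(record, expected_hex):
    """Compare the computed digest with a published hash (case-insensitive)."""
    return canonical_sha256(record) == expected_hex.lower()
```

To use it, load the `canonical_record` field of the fetched document with `json.loads` and pass the result, together with the canonical hash published on this page, to `matches`. Because keys are sorted before hashing, the indentation and key order of the fetched JSON do not affect the digest.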
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "6dbcccf4bb7c02fbfe9928bc7b713e502af96cf6459bfa583f91f5be25e07262",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2022-11-23T18:58:39Z",
    "title_canon_sha256": "c6faa78873c360f2d65fa170a921710b4b4f23535ada711afe37d86bc2dc53c3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2211.13221",
    "kind": "arxiv",
    "version": 2
  }
}
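As a small illustration of how the `source` block of the record maps back to the paper, the sketch below rebuilds the versioned arXiv identifier and its abstract-page URL. The helper names `arxiv_versioned_id` and `arxiv_abs_url` are hypothetical, and the sketch assumes a record shaped like the one above, with `source.kind` equal to `"arxiv"`.

```python
def arxiv_versioned_id(record):
    """Build an identifier such as '2211.13221v2' from the record's source block."""
    source = record["source"]
    if source["kind"] != "arxiv":
        raise ValueError("unsupported source kind: " + str(source["kind"]))
    return "{}v{}".format(source["id"], source["version"])


def arxiv_abs_url(record):
    """Return the arXiv abstract-page URL for the versioned identifier."""
    return "https://arxiv.org/abs/" + arxiv_versioned_id(record)


# Only the fields this sketch reads are reproduced from the record above.
example_record = {
    "schema_version": "1.0",
    "source": {"id": "2211.13221", "kind": "arxiv", "version": 2},
}
```

Calling `arxiv_abs_url(example_record)` yields a link to version 2 of the paper, which matches the `arxiv:2211.13221 v2` line at the top of this page.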