Pith Number

pith:U52T37GU

pith:2022:U52T37GUSXOPBDROGA4KV2MDWC

not attested · not anchored · not stored · refs resolved

Latent Video Diffusion Models for High-Fidelity Long Video Generation

Qifeng Chen, Tianyu Yang, Yingqing He, Ying Shan, Yong Zhang

Video diffusion models shift to a low-dimensional 3D latent space to generate realistic clips longer than 1000 frames with modest compute.

arxiv:2211.13221 v2 · 2022-11-23 · cs.CV · cs.AI


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{U52T37GUSXOPBDROGA4KV2MDWC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

we introduce lightweight video diffusion models by leveraging a low-dimensional 3D latent space, significantly outperforming previous pixel-space video diffusion models under a limited computational budget... hierarchical diffusion in the latent space such that longer videos with more than one thousand frames can be produced... conditional latent perturbation and unconditional guidance that effectively mitigate the accumulated errors during the extension of video length.

C2 · weakest assumption

The low-dimensional 3D latent space preserves sufficient spatial-temporal detail for high-fidelity generation, and the added perturbation and guidance steps prevent error accumulation without introducing new artifacts or inconsistencies.

C3 · one-line summary

Latent-space hierarchical diffusion models with targeted error-correction techniques generate realistic videos exceeding 1000 frames while using less compute than prior pixel-space approaches.

References

48 extracted · 48 resolved · 15 Pith anchors

[1] Large scale GAN training for high fidelity natural image synthesis 2019

[2] Generating long videos of dynamic scenes 2022

[3] Hierarchical video generation for complex data 2021

[4] Diffusion models beat GANs on image synthesis 2021

[5] Taming transformers for high-resolution image synthesis 2021

Formal links

2 machine-checked theorem links

Cited by

41 papers in Pith

Scene-Action Prompt Fusion for Coherent Text-to-Video Storytelling

We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback

DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment

Character-Centered Dialogue Generation from Scene-Level Prompts

Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation

Receipt and verification

First computed	2026-05-17T23:38:53.534898Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

a7753dfcd495dcf08e2e3038aae983b0816e34f4ab1fadbd1e3ba9fe6640db33

Aliases

arxiv: 2211.13221 · arxiv_version: 2211.13221v2 · doi: 10.48550/arxiv.2211.13221 · pith_short_12: U52T37GUSXOP · pith_short_16: U52T37GUSXOPBDRO · pith_short_8: U52T37GU
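
The identifiers on this page appear to be related by simple string operations. The 26-character Pith Number matches the first 26 characters of the RFC 4648 base32 encoding of the canonical SHA-256 hash, and each `pith_short_N` alias is the first N characters of that number. This relationship is inferred from the values shown here, not from a published specification, and the helper names `pith_number_from_hash` and `short_aliases` are hypothetical. A minimal Python sketch under those assumptions:

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "a7753dfcd495dcf08e2e3038aae983b0816e34f4ab1fadbd1e3ba9fe6640db33"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Hypothetical helper: base32-encode the raw digest and keep a prefix.

    The 26-character length is an assumption taken from the identifier
    shown on this page, not from a documented rule.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str, sizes=(8, 12, 16)) -> dict:
    """Hypothetical helper: build the pith_short_N aliases as plain prefixes."""
    return {f"pith_short_{n}": pith_number[:n] for n in sizes}


pith = pith_number_from_hash(CANONICAL_HASH)
print(pith)
print(short_aliases(pith))
```

If the assumption holds for other records, a client could check that a short alias and a canonical hash belong together without contacting the resolver.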

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

# Fetch the JSON-LD record, extract its canonical record, re-serialize it with
# sorted keys and compact separators, then print the SHA-256 of the UTF-8 bytes.
curl -sH 'Accept: application/ld+json' https://pith.science/pith/U52T37GUSXOPBDROGA4KV2MDWC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: a7753dfcd495dcf08e2e3038aae983b0816e34f4ab1fadbd1e3ba9fe6640db33

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "6dbcccf4bb7c02fbfe9928bc7b713e502af96cf6459bfa583f91f5be25e07262",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2022-11-23T18:58:39Z",
    "title_canon_sha256": "c6faa78873c360f2d65fa170a921710b4b4f23535ada711afe37d86bc2dc53c3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2211.13221",
    "kind": "arxiv",
    "version": 2
  }
}
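
The shell check under "Verify this Pith Number yourself" packs the canonicalization into a single `python3 -c` call. Unpacked, the same steps (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding, SHA-256) might look like the sketch below. The function name `canonical_sha256` is hypothetical. The digest will match the published canonical hash only if the input is exactly the record served by the resolver, so no expected value is asserted here.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hypothetical helper mirroring the page's one-liner.

    Serializes with sorted keys and compact separators, then hashes
    the UTF-8 bytes with SHA-256.
    """
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "6dbcccf4bb7c02fbfe9928bc7b713e502af96cf6459bfa583f91f5be25e07262",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2022-11-23T18:58:39Z",
        "title_canon_sha256": "c6faa78873c360f2d65fa170a921710b4b4f23535ada711afe37d86bc2dc53c3",
    },
    "schema_version": "1.0",
    "source": {"id": "2211.13221", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)  # compare by hand against the canonical hash listed on the page
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the fields.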