Pith Number

pith:F5ML3ZOQ

pith:2025:F5ML3ZOQZMUC5AV5HCUUZNQI6T
not attested · not anchored · not stored · refs resolved

VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness

Dian Zheng, Fan Zhang, Hongbo Liu, Jingwen He, Kai Zou, Lulu Gu, Wei-Shi Zheng, Yinan He, Yuanhan Zhang, Yu Qiao, Ziqi Huang, Ziwei Liu

VBench-2.0 introduces a benchmark that tests video generation models for intrinsic faithfulness to physical laws, human anatomy, and commonsense.

arxiv:2503.21755 v2 · 2025-03-27 · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{F5ML3ZOQZMUC5AV5HCUUZNQI6T}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

To bridge this gap, we introduce VBench-2.0, a next-generation benchmark designed to automatically evaluate video generative models for their intrinsic faithfulness. VBench-2.0 assesses five key dimensions: Human Fidelity, Controllability, Creativity, Physics, and Commonsense, each further broken down into fine-grained capabilities.

C2 weakest assumption

That the integration of SOTA VLMs, LLMs, and anomaly detection methods, validated by human annotations, will reliably measure intrinsic faithfulness without introducing new biases or missing subtle violations of physical and commonsense rules.

C3 one line summary

VBench-2.0 is a benchmark suite that uses VLMs, LLMs, and anomaly detection methods to automatically evaluate video generative models on five dimensions of intrinsic faithfulness: Human Fidelity, Controllability, Creativity, Physics, and Commonsense.

References

96 extracted · 96 resolved · 22 Pith anchors

[1] Magicedit: High-fidelity and temporally coherent video editing 2023
[2] Stable video diffusion: A novel approach to image-to-video generation 2023 · arXiv:2308.09592
[3] TokenFlow: Consistent Diffusion Features for Consistent Video Editing 2023 · arXiv:2307.10373
[4] Inve: Interactive neural video editing 2023
[5] Videdit: Zero-shot and spatially aware text-driven video editing 2023

Formal links

3 machine-checked theorem links

Cited by

46 papers in Pith

Receipt and verification
First computed 2026-05-17T23:39:22.165900Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

2f58bde5d0cb282e82bd38a94cb608f4ef63deee0c73b7a33b59f7092a56a960

Aliases

arxiv: 2503.21755 · arxiv_version: 2503.21755v2 · doi: 10.48550/arxiv.2503.21755 · pith_short_12: F5ML3ZOQZMUC · pith_short_16: F5ML3ZOQZMUC5AV5 · pith_short_8: F5ML3ZOQ
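The short aliases above are plain prefixes of the full Pith Number. For this record, the full number also equals the first 26 characters of the standard (RFC 4648) base32 encoding of the canonical hash. The sketch below reproduces both relationships. They are inferred from the values shown on this page, not taken from the pith-number/v1.0 schema, and the function names are hypothetical.

```python
import base64

# Canonical hash as printed on this page.
CANONICAL_HASH = "2f58bde5d0cb282e82bd38a94cb608f4ef63deee0c73b7a33b59f7092a56a960"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: the 26-character truncation is inferred from this one record.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded.rstrip("=")[:length]


def short_aliases(pith_number: str) -> dict:
    """Build the pith_short_N aliases as prefixes of the full number."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


full = pith_number_from_hash(CANONICAL_HASH)
aliases = short_aliases(full)
print(full)  # F5ML3ZOQZMUC5AV5HCUUZNQI6T
print(aliases)
```

If the derivation holds for other records, a client could check that a short alias and a canonical hash refer to the same record without a network request.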
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/F5ML3ZOQZMUC5AV5HCUUZNQI6T \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 2f58bde5d0cb282e82bd38a94cb608f4ef63deee0c73b7a33b59f7092a56a960
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "c68e17eaaa7dd1d5795a73093939a039493a00a463003e3cc3084962130eca80",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2025-03-27T17:57:01Z",
    "title_canon_sha256": "4d572520ce77819b8c4ae41359fddabd170c5d9a37cb2cfec0b4517dcc41b34a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2503.21755",
    "kind": "arxiv",
    "version": 2
  }
}
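The verification one-liner above can also be run offline against the record as printed here. This minimal sketch applies the same serialization settings (sorted keys, compact separators, UTF-8 bytes, SHA-256) and compares the result with the canonical hash listed on this page. It assumes the JSON shown is the complete canonical record. The names `canonical_hash`, `RECORD_JSON`, and `EXPECTED` are illustrative only.

```python
import hashlib
import json

# Canonical record JSON copied from this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "c68e17eaaa7dd1d5795a73093939a039493a00a463003e3cc3084962130eca80",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2025-03-27T17:57:01Z",
    "title_canon_sha256": "4d572520ce77819b8c4ae41359fddabd170c5d9a37cb2cfec0b4517dcc41b34a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2503.21755",
    "kind": "arxiv",
    "version": 2
  }
}
"""

# Canonical hash as printed on this page.
EXPECTED = "2f58bde5d0cb282e82bd38a94cb608f4ef63deee0c73b7a33b59f7092a56a960"


def canonical_hash(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_hash(record)
print(digest)
print("matches page:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the key order or whitespace of the JSON as received.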