Pith Number

pith:CRW75HZR

pith:2026:CRW75HZRZFE262S4FAMM6E4K4Y
not attested · not anchored · not stored · refs resolved

EntityBench: Towards Entity-Consistent Long-Range Multi-Shot Video Generation

Meng Wei, Ruozhen He, Vicente Ordonez, Ziyan Yang

Explicit per-entity memory maintains character consistency across long gaps in multi-shot video generation where existing methods fail.

arxiv:2605.15199 v1 · 2026-05-14 · cs.CV · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{CRW75HZRZFE262S4FAMM6E4K4Y}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
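
The page does not specify the merge algorithm, so the following is only a minimal sketch of what "deterministic" can mean here: two mirrors holding overlapping event sets arrive at the same state regardless of the order in which they combine them. The event fields (`at`, `field`, `value`), the last-writer-wins rule, and the function names are all hypothetical and are not taken from the Pith schema.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Content hash of an event (hypothetical identifier, not the Pith format)."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def merge_state(*event_sets) -> dict:
    """Fold events into a state dict in an order that ignores arrival order."""
    # Union by content hash, so an event present in several mirrors counts once.
    unique = {event_id(e): e for events in event_sets for e in events}
    # Total order: timestamp first, content hash as tie-breaker.
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["at"], kv[0]))
    state = {}
    for _, event in ordered:
        state[event["field"]] = event["value"]  # assumed rule: last writer wins
    return state


# Invented events, for illustration only.
e1 = {"at": "2026-05-17T21:40:24Z", "field": "refs", "value": "resolved"}
e2 = {"at": "2026-05-17T21:57:18Z", "field": "anchored", "value": False}
e3 = {"at": "2026-05-18T00:00:00Z", "field": "anchored", "value": True}

mirror_a = [e1, e2]
mirror_b = [e3, e2]

state_ab = merge_state(mirror_a, mirror_b)
state_ba = merge_state(mirror_b, mirror_a)
```

Both `state_ab` and `state_ba` come out identical because the fold order depends only on the events themselves, which is the property a mirror needs to recompute the same current state.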

Claims

C1 · strongest claim

Experiments show that cross-shot entity consistency degrades sharply with recurrence distance in existing methods, and that explicit per-entity memory yields the highest character fidelity (Cohen's d = +2.33) and presence among methods evaluated.

C2 · weakest assumption

The entity schedules extracted from real narrative media and the fidelity gate used for cross-shot scoring accurately reflect the consistency challenges faced by current video generation models.

C3 · one line summary

EntityBench is a new benchmark with detailed per-shot entity schedules from real media, and the EntityMem baseline using persistent per-entity memory achieves the highest character fidelity with Cohen's d of +2.33.
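
For readers unfamiliar with the effect size quoted in C1 and C3, Cohen's d is the difference between two group means divided by a pooled standard deviation. The sketch below uses the common pooled-sample-variance form; the record does not say which variant the paper uses, and the scores are invented for illustration, not taken from the paper.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    # statistics.variance is the unbiased sample variance (divides by n - 1).
    pooled_var = ((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical fidelity scores on an arbitrary scale (not data from the paper).
with_memory = [8.0, 9.0, 10.0]
without_memory = [6.0, 7.0, 8.0]

d = cohens_d(with_memory, without_memory)  # 2.0 for these made-up numbers
```

A positive d means the first group scores higher, so a value of +2.33 indicates a gap of more than two pooled standard deviations in favor of the method being reported.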

References

34 extracted · 34 resolved · 9 Pith anchors

[1] Mixture of contexts for long video generation
[2] SkyReels-V2: Infinite-length Film Generative Model · arXiv:2504.13074
[3] Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities · arXiv:2507.06261
[4] Narrlv: Towards a comprehensive narrative-centric evaluation for long video generation · arXiv:2507.11245
[5] Longvie: Multimodal-guided controllable ultra-long video generation

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-17T21:40:24.971979Z
Last reissued 2026-05-17T21:57:18.363405Z
Builder pith-number-builder-2026-05-17-v1
Signature unsigned_v0
Schema pith-number/v1.0

Canonical hash

146dfe9f31c949af6a5c2818cf138ae635c727e7a65ae7d5e16baa7453d0e54f

Aliases

arxiv: 2605.15199 · arxiv_version: 2605.15199v1 · pith_short_12: CRW75HZRZFE2 · pith_short_16: CRW75HZRZFE262S4 · pith_short_8: CRW75HZR
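
The identifiers on this page are consistent with a simple pattern: the 26-character Pith Number matches the unpadded Base32 encoding of the first 16 bytes of the canonical hash, and each `pith_short_N` alias is its first N characters. This is an observation from the values shown here, not a documented specification, and the function names below are hypothetical.

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "146dfe9f31c949af6a5c2818cf138ae635c727e7a65ae7d5e16baa7453d0e54f"


def pith_number_from_hash(hex_digest: str) -> str:
    """Base32-encode the first 16 digest bytes and strip '=' padding (observed pattern)."""
    prefix = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(prefix).decode("ascii").rstrip("=")


def short_aliases(pith_number: str) -> dict:
    """Short aliases as plain prefixes of the full identifier."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


full_id = pith_number_from_hash(CANONICAL_HASH)
aliases = short_aliases(full_id)
```

For this record, `full_id` reproduces `CRW75HZRZFE262S4FAMM6E4K4Y` and the three prefixes reproduce the aliases listed above.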
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/CRW75HZRZFE262S4FAMM6E4K4Y \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 146dfe9f31c949af6a5c2818cf138ae635c727e7a65ae7d5e16baa7453d0e54f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "4c96a559abd3ffc6454d3041f0dfedd22d0bdd44731caf3124aa9ab26cba1c57",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by-sa/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-14T17:59:55Z",
    "title_canon_sha256": "11d992f42dc4675222b71a4aea55c171ea04bf8e28a4ea7ab891a585ae23f18a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15199",
    "kind": "arxiv",
    "version": 1
  }
}
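
The hashing step of the shell pipeline above can also be run offline on a saved copy of the record. This is a minimal sketch that uses the same `json.dumps` settings as the one-liner (sorted keys, compact separators, no ASCII escaping); the function name `canonical_sha256` is an assumption, and the dictionary is transcribed from the JSON shown on this page.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Record transcribed from the page; key order here is deliberately irrelevant.
record = {
    "schema_version": "1.0",
    "source": {"id": "2605.15199", "kind": "arxiv", "version": 1},
    "metadata": {
        "abstract_canon_sha256": "4c96a559abd3ffc6454d3041f0dfedd22d0bdd44731caf3124aa9ab26cba1c57",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by-sa/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-14T17:59:55Z",
        "title_canon_sha256": "11d992f42dc4675222b71a4aea55c171ea04bf8e28a4ea7ab891a585ae23f18a",
    },
}

digest = canonical_sha256(record)
```

Compare `digest` with the canonical hash listed under Receipt and verification. Because keys are sorted before hashing, the order in which a mirror stores the fields does not affect the result.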