Pith Number

pith:L47BPRRK

pith:2026:L47BPRRKTA4UYTHJVTKSHITPJG
not attested · not anchored · not stored · refs resolved

PanoWorld: Geometry-Consistent Panoramic Video World Modeling

Bishoy Galoaa, Caleb James Lee, Edmund Yeh, Jennifer Dy, Le Jiang, Sarah Ostadabbas, Shayda Moezzi, Tooba Imtiaz, Xiangyu Bai, Yanzhi Wang

PanoWorld improves geometric consistency in panoramic videos by enforcing depth and trajectory constraints.

arxiv:2605.15391 v1 · 2026-05-14 · cs.CV · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{L47BPRRKTA4UYTHJVTKSHITPJG}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

By framing panoramic video generation as geometry- and dynamics-consistent latent state modeling, and by introducing a depth consistency loss against pseudo ground-truth panoramic depth together with a trajectory consistency loss on 3D world-frame positions, PanoWorld improves geometric consistency over prior panoramic generation methods while maintaining competitive visual realism.
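This page does not give the paper's loss formulations, so the following is only an illustrative sketch of what a depth consistency term and a trajectory consistency term could look like. Everything in it is an assumption: the function names, the L1 depth error, the cos(latitude) row weighting for equirectangular frames, and the mean Euclidean distance on world-frame positions are hypothetical choices, not PanoWorld's actual method.

```python
import math
from typing import Sequence

Depth = Sequence[Sequence[float]]  # one equirectangular depth map, rows x cols
Point = Sequence[float]            # (x, y, z) position in the world frame


def depth_consistency_loss(pred: Depth, pseudo_gt: Depth) -> float:
    """Mean absolute depth error, weighted per row by cos(latitude).

    ASSUMPTION: the cos(latitude) weight is added here to offset the
    oversampling of polar rows in an equirectangular image; the paper
    may weight pixels differently or not at all.
    """
    rows = len(pred)
    weighted_error = 0.0
    weight_sum = 0.0
    for r in range(rows):
        # Latitude of the row centre, in (-pi/2, pi/2).
        lat = math.pi * ((r + 0.5) / rows - 0.5)
        w = math.cos(lat)
        for p, g in zip(pred[r], pseudo_gt[r]):
            weighted_error += w * abs(p - g)
            weight_sum += w
    return weighted_error / weight_sum


def trajectory_consistency_loss(pred: Sequence[Point], ref: Sequence[Point]) -> float:
    """Mean Euclidean distance between predicted and reference 3D positions.

    ASSUMPTION: one position per time step, already expressed in a shared
    world frame.
    """
    dists = [math.dist(p, q) for p, q in zip(pred, ref)]
    return sum(dists) / len(dists)


# Toy inputs: a 4x2 depth map offset by 1 unit, and a two-step trajectory.
pred_depth = [[2.0, 2.0], [3.0, 3.0], [3.0, 3.0], [2.0, 2.0]]
gt_depth = [[1.0, 1.0], [2.0, 2.0], [2.0, 2.0], [1.0, 1.0]]
depth_term = depth_consistency_loss(pred_depth, gt_depth)
traj_term = trajectory_consistency_loss(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)],
)
print(depth_term, traj_term)
```

In a training setup of this kind, the two terms would typically be added to the base video model's objective with tunable weights; those weights are likewise not stated on this page.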

C2 · weakest assumption

The pseudo ground-truth panoramic depth maps used for the depth consistency loss are accurate enough to enforce genuine 3D consistency without introducing systematic errors or artifacts into the generated video.

C3 · one line summary

PanoWorld adds depth consistency and trajectory consistency losses and spherical adaptations to a pre-trained video model, and introduces the PanoGeo dataset, to produce geometry-consistent 360° video.

References

30 extracted · 30 resolved · 12 Pith anchors

[1] Cosmos World Foundation Model Platform for Physical AI · arXiv:2501.03575
[2] Videophy: Evaluating physical commonsense for video generation
[3] Lumiere: A space-time diffusion model for video generation · 2024
[4] Revisiting Feature Prediction for Learning Visual Representations from Video · arXiv:2404.08471
[5] Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets · arXiv:2311.15127

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:00:56.200573Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

5f3e17c62a98394c4ce9acd523a26f4985c1fdf43294e29b3e0f521290df8a64

Aliases

arxiv: 2605.15391 · arxiv_version: 2605.15391v1 · doi: 10.48550/arxiv.2605.15391 · pith_short_12: L47BPRRKTA4U · pith_short_16: L47BPRRKTA4UYTHJ · pith_short_8: L47BPRRK
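The three short aliases listed above appear to be the first 8, 12, and 16 characters of the full Pith Number. A minimal sketch of that relationship, assuming plain prefix truncation (the helper name `short_alias` is hypothetical and not part of any Pith API):

```python
FULL_PITH_NUMBER = "L47BPRRKTA4UYTHJVTKSHITPJG"


def short_alias(pith_number: str, length: int) -> str:
    """Return the first `length` characters of a Pith Number.

    ASSUMPTION: short aliases are simple prefixes, as the values listed
    on this page suggest.
    """
    return pith_number[:length]


aliases = {f"pith_short_{n}": short_alias(FULL_PITH_NUMBER, n) for n in (8, 12, 16)}
for name, value in aliases.items():
    print(name, value)
```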
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/L47BPRRKTA4UYTHJVTKSHITPJG \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 5f3e17c62a98394c4ce9acd523a26f4985c1fdf43294e29b3e0f521290df8a64
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "31c4172212f18d76e2cbc67b1dc53f5409693b7f9cfeba4305fc4f2f4d568712",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-14T20:24:23Z",
    "title_canon_sha256": "dade878fb12c476abe0a05e0820f658609943324088b39a72acfdd57efbae087"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15391",
    "kind": "arxiv",
    "version": 1
  }
}
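The record above can also be hashed offline, without the curl and jq steps. The sketch below applies the same canonicalization as the shell pipeline (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes, SHA-256). The function name `canonical_sha256` is a hypothetical helper, and the script only reports whether the digest matches the canonical hash listed on this page; it does not assume that it will.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """SHA-256 of the record serialized with sorted keys and compact separators."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "31c4172212f18d76e2cbc67b1dc53f5409693b7f9cfeba4305fc4f2f4d568712",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-14T20:24:23Z",
        "title_canon_sha256": "dade878fb12c476abe0a05e0820f658609943324088b39a72acfdd57efbae087",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15391", "kind": "arxiv", "version": 1},
}

# Canonical hash as listed on this page.
EXPECTED = "5f3e17c62a98394c4ce9acd523a26f4985c1fdf43294e29b3e0f521290df8a64"

digest = canonical_sha256(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the digest does not depend on the order in which the JSON keys were written or parsed.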