Pith Number

pith:GNQPJI4S

pith:2025:GNQPJI4STITBXFP7ZUCQA7D4F7
not attested · not anchored · not stored · refs resolved

Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation

Donglin Yang, Guanghui Ren, Jianlan Luo, Jingbin Cai, Liliang Chen, Maoqing Yao, Pengfei Zhou, Shengcong Chen, Shuicheng Yan, Si Liu, Siyuan Huang, Yue Hu, Yue Liao, Yuxin Jiang

A single instruction-conditioned video diffusion model unifies policy learning, simulation, and evaluation for robotic manipulation.

arxiv:2508.05635 v3 · 2025-08-07 · cs.RO · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{GNQPJI4STITBXFP7ZUCQA7D4F7}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp
2. Internet Archive
3. Author claim · open
4. Citations · open
5. Replications · open
Portable graph bundle · live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
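The property described above, that any mirror recomputes the same state from the same events, can be illustrated with a minimal sketch. The bundle format and the actual merge rule are not specified on this page, so everything here is an assumption: the event fields (`id`, `ts`, `field`, `value`), the ordering rule (sort by timestamp, then id), and the last-writer-wins fold are all hypothetical and chosen only to show what "deterministic" means in practice.

```python
import hashlib
import json


def merge_events(*event_lists):
    """Fold several event lists into one state using a hypothetical rule."""
    # Deduplicate by event id. Assumption: events sharing an id are identical
    # (for example because each one is signed), so the surviving copy is arbitrary.
    unique = {}
    for events in event_lists:
        for event in events:
            unique[event["id"]] = event
    # Sort by (timestamp, id) to get a total order that ignores arrival order.
    ordered = sorted(unique.values(), key=lambda e: (e["ts"], e["id"]))
    # Last writer wins per field.
    state = {}
    for event in ordered:
        state[event["field"]] = event["value"]
    return state


def state_digest(state):
    """Hash the merged state so two mirrors can compare results cheaply."""
    blob = json.dumps(state, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# Invented events, for illustration only; they do not come from a real bundle.
mirror_a = [
    {"id": "e1", "ts": 1, "field": "status", "value": "draft"},
    {"id": "e2", "ts": 2, "field": "owner", "value": "alice"},
]
mirror_b = [
    {"id": "e3", "ts": 3, "field": "status", "value": "final"},
    {"id": "e1", "ts": 1, "field": "status", "value": "draft"},
]

state_ab = merge_events(mirror_a, mirror_b)
state_ba = merge_events(mirror_b, mirror_a)
print(state_digest(state_ab) == state_digest(state_ba))
```

Because the merge sorts on a key that depends only on the events themselves, the order in which mirrors receive or concatenate events does not affect the result.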

Claims

C1 · strongest claim

GE integrates policy learning, evaluation, and simulation within a single video-generative framework, establishing a scalable and practical foundation for instruction-driven, general-purpose embodied intelligence.

C2 · weakest assumption

That the instruction-conditioned video diffusion model in GE-Base sufficiently captures real-world spatial, temporal, and semantic dynamics to support accurate action mapping in GE-Act and reliable rollouts in GE-Sim across diverse embodiments.

C3 · one line summary

Genie Envisioner unifies robotic policy learning, simulation, and evaluation inside one instruction-conditioned video diffusion framework using GE-Base, GE-Act, and GE-Sim.

References

30 extracted · 30 resolved · 20 Pith anchors

[1] Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs · arXiv:2503.01743
[2] Cosmos World Foundation Model Platform for Physical AI · arXiv:2501.03575
[3] Do As I Can, Not As I Say: Grounding Language in Robotic Affordances · arXiv:2204.01691
[4] Qwen2.5-VL Technical Report · arXiv:2502.13923
[5] GR00T N1: An Open Foundation Model for Generalist Humanoid Robots · arXiv:2503.14734

Formal links

3 machine-checked theorem links

Cited by

25 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:50.107568Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

3360f4a3929a261b95ffcd05007c7c2fdf99d14c8a47b913f375957d457a151f

Aliases

arxiv: 2508.05635 · arxiv_version: 2508.05635v3 · doi: 10.48550/arxiv.2508.05635 · pith_short_12: GNQPJI4STITB · pith_short_16: GNQPJI4STITBXFP7 · pith_short_8: GNQPJI4S
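The short aliases above are prefixes of the full Pith Number. For this record, the full identifier also coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. The sketch below checks both relationships using only values shown on this page. Treat the base32 relationship as an observation about this one record rather than a documented derivation rule; the helper name `base32_prefix` is hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "3360f4a3929a261b95ffcd05007c7c2fdf99d14c8a47b913f375957d457a151f"
PITH_NUMBER = "GNQPJI4STITBXFP7ZUCQA7D4F7"


def base32_prefix(hex_digest, length):
    """Return the first `length` characters of the base32 form of a hex digest."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


# Observed for this record: base32(canonical hash) starts with the Pith Number.
derived = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER))

# Short aliases are plain prefixes of the full identifier.
short_aliases = {n: PITH_NUMBER[:n] for n in (8, 12, 16)}

print(derived == PITH_NUMBER)
print(short_aliases)
```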
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/GNQPJI4STITBXFP7ZUCQA7D4F7 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 3360f4a3929a261b95ffcd05007c7c2fdf99d14c8a47b913f375957d457a151f
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "97caad6ad19a11beb1bd896fc1d5c61e7b25d856f5a70d6c789693cb7c831960",
    "cross_cats_sorted": [
      "cs.CV"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2025-08-07T17:59:44Z",
    "title_canon_sha256": "5328146171b55d888e894fcfc9a7a0b678ebab9e88c800f3a863ca6e1cdc1e83"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2508.05635",
    "kind": "arxiv",
    "version": 3
  }
}
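The same canonicalization used in the curl pipeline above (sorted keys, compact separators, UTF-8, no ASCII escaping) can be applied offline to a saved copy of the record. The sketch below embeds the JSON shown on this page and hashes it. It assumes the displayed record is identical to what the API serves, which is not confirmed here, so compare the printed digest against the listed canonical hash yourself rather than taking a match for granted. The function name `canonical_sha256` is hypothetical.

```python
import hashlib
import json

# The canonical record as displayed on this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "97caad6ad19a11beb1bd896fc1d5c61e7b25d856f5a70d6c789693cb7c831960",
    "cross_cats_sorted": ["cs.CV"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2025-08-07T17:59:44Z",
    "title_canon_sha256": "5328146171b55d888e894fcfc9a7a0b678ebab9e88c800f3a863ca6e1cdc1e83"
  },
  "schema_version": "1.0",
  "source": {"id": "2508.05635", "kind": "arxiv", "version": 3}
}
"""


def canonical_sha256(record):
    """SHA-256 over the compact, key-sorted JSON serialization of `record`."""
    blob = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)  # compare with the value listed under "Canonical hash"
```

Sorting keys before hashing is what makes the digest independent of how the JSON happened to be formatted or ordered when it was stored.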