Pith Number

pith:3FBW3A4A

pith:2026:3FBW3A4AFRFRBSMCK3MCSDEUVD
not attested · not anchored · not stored · refs pending

4DThinker: Thinking with 4D Imagery for Dynamic Spatial Understanding

Bo Li, Hongyu Li, Manyuan Zhang, Mingze Sun, Ruqi Huang, Shuang Chen, Xiang An, Xiaobin Hu, Xinlei Yu, Xin Xie, Zhangquan Chen, Zidong Wang

4DThinker lets vision-language models simulate evolving scenes inside their latent space for dynamic spatial reasoning from monocular video.

arxiv:2605.05997 v2 · 2026-05-07 · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{3FBW3A4AFRFRBSMCK3MCSDEUVD}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
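
The record does not spell out the merge algorithm, so the following is only a minimal sketch of what "deterministic" means here: any mirror that holds the same set of events should compute the same state, whatever order the events arrived in. The event fields (`id`, `ts`, `field`, `value`), the last-writer-wins rule, and the sample events are all assumptions made for illustration. Signature checking is left out.

```python
from typing import Dict, Iterable, List


def merge(events: Iterable[dict]) -> Dict[str, str]:
    """Fold events into a state dict; the result does not depend on input order.

    Assumes (hypothetically) that each event has a unique 'id', an ISO 8601
    UTC timestamp 'ts', and sets one 'field' to one 'value'.
    """
    # Drop duplicate deliveries of the same event (same id means same content).
    unique = {e["id"]: e for e in events}
    # Total order: timestamp first, event id as the tie-breaker.
    ordered = sorted(unique.values(), key=lambda e: (e["ts"], e["id"]))
    state: Dict[str, str] = {}
    for event in ordered:
        state[event["field"]] = event["value"]  # last writer wins
    return state


# Hypothetical events, not taken from any real bundle.
EVENTS: List[dict] = [
    {"id": "e1", "ts": "2026-05-25T02:01:22Z", "field": "author_claim", "value": "open"},
    {"id": "e2", "ts": "2026-05-26T00:00:00Z", "field": "author_claim", "value": "claimed"},
    {"id": "e3", "ts": "2026-05-25T12:00:00Z", "field": "citations", "value": "open"},
]

if __name__ == "__main__":
    # Two mirrors that received the events in different orders agree.
    print(merge(EVENTS) == merge(list(reversed(EVENTS))))
```

Sorting on a total order before folding is what makes the result reproducible; a real implementation would also need to verify each event's signature before admitting it.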

Claims

C1 strongest claim

4DThinker is the first framework that enables VLMs to 'think with 4D' through dynamic latent mental imagery, and extensive experiments demonstrate that it consistently outperforms strong baselines on dynamic spatial reasoning benchmarks.

C2 weakest assumption

That the annotation-free 4D data synthesis pipeline produces sufficiently rich and accurate supervision signals, and that jointly training textual tokens with 4D latents via DIFT plus restricting 4DRL policy gradients to text tokens will yield stable and superior intrinsic dynamic reasoning without external geometric modules.

C3 one-line summary

4DThinker enables VLMs to perform dynamic spatial reasoning by internally simulating 4D imagery in latent space, outperforming prior text-based and modular approaches.

Receipt and verification
First computed: 2026-05-25T02:01:22.041210Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

d9436d83802c4b10c98256d8290c94a8f835f836115044eb71be6654a214e369

Aliases

arxiv: 2605.05997 · arxiv_version: 2605.05997v2 · doi: 10.48550/arxiv.2605.05997 · pith_short_12: 3FBW3A4AFRFR · pith_short_16: 3FBW3A4AFRFRBSMC · pith_short_8: 3FBW3A4A
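
The short aliases above are plain prefixes of the full 26-character number. The number itself also lines up with the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. That second relationship is an observation from the values shown on this page, not a rule the record states, so the sketch below is best read as a consistency check. The helper names are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "d9436d83802c4b10c98256d8290c94a8f835f836115044eb71be6654a214e369"
PITH_NUMBER = "3FBW3A4AFRFRBSMCK3MCSDEUVD"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a hex digest (RFC 4648 alphabet) and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(number: str) -> dict:
    """Hypothetical helper: derive the 8-, 12- and 16-character aliases by truncation."""
    return {f"pith_short_{n}": number[:n] for n in (8, 12, 16)}


if __name__ == "__main__":
    print(base32_prefix(CANONICAL_HASH) == PITH_NUMBER)
    print(short_aliases(PITH_NUMBER))
```

Because the aliases are truncations, a shorter alias can be checked against a longer one with a simple `startswith` test.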
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/3FBW3A4AFRFRBSMCK3MCSDEUVD \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: d9436d83802c4b10c98256d8290c94a8f835f836115044eb71be6654a214e369
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "f4b835433cc3360f396de4aeb7b3cf8f5c7766855e33bf2c12c6d76b1405e159",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-07T10:48:46Z",
    "title_canon_sha256": "c76117be01773671c93740f35074247895a1389591a521ba58600e1e6ddd0340"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.05997",
    "kind": "arxiv",
    "version": 2
  }
}
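
The one-line check in the Agent API section can also be run offline against a saved copy of the record. The sketch below mirrors the same canonicalization (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) and hashes the result with SHA-256. It assumes the JSON printed above is exactly what the endpoint returns under `canonical_record`; the function names are hypothetical, and the script reports whether the digest matches rather than presuming it does.

```python
import hashlib
import json

EXPECTED = "d9436d83802c4b10c98256d8290c94a8f835f836115044eb71be6654a214e369"

# Canonical record as printed on this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "f4b835433cc3360f396de4aeb7b3cf8f5c7766855e33bf2c12c6d76b1405e159",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-07T10:48:46Z",
        "title_canon_sha256": "c76117be01773671c93740f35074247895a1389591a521ba58600e1e6ddd0340",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.05997", "kind": "arxiv", "version": 2},
}


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace, then UTF-8 encode."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_hash(RECORD)
    print(digest)
    print("matches published hash:", digest == EXPECTED)
```

Sorting keys before hashing is what lets two parties agree on a digest even if their JSON libraries emit fields in different orders.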