Pith Number

pith:DLYS3JLS

pith:2026:DLYS3JLSQZ7XFN6UTMP335JYMH
not attested · not anchored · not stored · refs resolved

Action Emergence from Streaming Intent

Benjin Zhu, Hengtong Lu, Jifeng Dai, Pengfei Jing, Victor Shea-Jay Huang, Xie Yan

Streaming Intent lets an end-to-end driving model generate distinct, high-quality trajectories by deriving and steering with reasoned intent classes.

arxiv:2605.12622 v2 · 2026-05-12 · cs.RO · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{DLYS3JLSQZ7XFN6UTMP335JYMH}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
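
What "deterministic merge" means here can be illustrated with a minimal sketch. The page does not document the event schema or the actual merge rules, so everything below is an assumption for illustration: the field names `id`, `ts`, `field`, and `value` are hypothetical, and the last-write-wins fold is just one simple way to make the result independent of the order in which a mirror received the events.

```python
from typing import Iterable

def merge_state(canonical: dict, events: Iterable[dict]) -> dict:
    """Fold events into a state that does not depend on arrival order.

    Hypothetical event shape: {"id", "ts", "field", "value"}.
    Assumes events sharing an id are identical copies.
    """
    unique = {e["id"]: e for e in events}            # drop duplicate deliveries
    ordered = sorted(unique.values(),                # total order: time, then id
                     key=lambda e: (e["ts"], e["id"]))
    state = {"record": canonical, "fields": {}}
    for e in ordered:
        state["fields"][e["field"]] = e["value"]     # last write wins
    return state

# Hypothetical events, for illustration only.
events = [
    {"id": "e1", "ts": "2026-05-18T03:10:00Z", "field": "author_claim", "value": "open"},
    {"id": "e2", "ts": "2026-05-19T09:00:00Z", "field": "author_claim", "value": "claimed"},
    {"id": "e3", "ts": "2026-05-18T04:00:00Z", "field": "citations", "value": "open"},
]
record = {"source": {"id": "2605.12622", "kind": "arxiv", "version": 2}}

state_a = merge_state(record, events)
state_b = merge_state(record, list(reversed(events)) + [events[0]])
print(state_a == state_b)
```

Sorting on a total order before folding is what lets two mirrors holding the same set of events arrive at the same state, whatever order or duplication they saw on the wire.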

Claims

C1 · strongest claim

SI achieves intent-faithful controllability, to our knowledge for the first time in a fully end-to-end VLA: for a fixed scene, varying the intent class at inference yields qualitatively distinct yet consistently high-quality plans that arise purely from data-driven learning, without any pre-built trajectory bank or hand-coded post-hoc selector.

C2 · weakest assumption

The assumption that the autoregressive chain-of-thought decoding causally derives semantically appropriate intent from scene understanding in a manner that enables generalization to arbitrary long-tail traffic scenes.

C3 · one line summary

A new VLA model called SI uses a four-step chain-of-thought to derive driving intent and applies it via classifier-free guidance to a flow-matching trajectory generator, showing competitive Waymo scores and intent-controllable plans.
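
To make the summary's mechanism concrete, here is a toy sketch of classifier-free guidance applied to a flow-matching sampler. It is not the SI model: the intent names, the 2D endpoint targets, the analytic velocity field, and the choice of the mean target as the "unconditional" branch are all hypothetical stand-ins. It only shows how a guidance weight `w` blends a conditional and an unconditional velocity before Euler integration.

```python
# Toy classifier-free guidance (CFG) on a flow-matching sampler.
# All names and numbers are hypothetical; a real model would predict the
# velocities with a network instead of the closed form used here.

INTENT_TARGETS = {                 # hypothetical intent -> endpoint (x, y) in metres
    "keep_lane": (20.0, 0.0),
    "turn_left": (12.0, 8.0),
}
# Stand-in for the unconditional branch: the mean of all intent targets.
UNCOND_TARGET = tuple(
    sum(p[i] for p in INTENT_TARGETS.values()) / len(INTENT_TARGETS)
    for i in range(2)
)

def velocity(x, t, target):
    """Straight-path flow-matching velocity toward a fixed target (t < 1)."""
    return tuple((target[i] - x[i]) / (1.0 - t) for i in range(2))

def guided_velocity(x, t, intent, w):
    """CFG blend: v = v_uncond + w * (v_cond - v_uncond)."""
    v_u = velocity(x, t, UNCOND_TARGET)
    v_c = velocity(x, t, INTENT_TARGETS[intent])
    return tuple(v_u[i] + w * (v_c[i] - v_u[i]) for i in range(2))

def sample(intent, w=1.0, steps=10, x0=(0.0, 0.0)):
    """Euler-integrate the guided velocity from t = 0 towards t = 1."""
    x, dt = x0, 1.0 / steps
    for k in range(steps):
        t = k * dt                 # t stays strictly below 1
        v = guided_velocity(x, t, intent, w)
        x = tuple(x[i] + dt * v[i] for i in range(2))
    return x

for name in INTENT_TARGETS:
    print(name, sample(name, w=1.0))
```

In this toy, `w = 0` ignores the intent and lands on the unconditional target, `w = 1` follows the conditional branch alone, and `w > 1` extrapolates past it, which is the usual lever for making the condition bite harder. Changing only the `intent` argument for a fixed start point yields different endpoints, a small analogue of the intent-controllable plans described above.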

Formal links

2 machine-checked theorem links

Receipt and verification
First computed: 2026-05-18T03:10:00.425675Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

1af12da572867f72b7d49b1fbdf53861f03c969f67bd14bfa560737527c09106

Aliases

arxiv: 2605.12622 · arxiv_version: 2605.12622v2 · doi: 10.48550/arxiv.2605.12622 · pith_short_12: DLYS3JLSQZ7X · pith_short_16: DLYS3JLSQZ7XFN6U · pith_short_8: DLYS3JLS
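
The identifiers on this page appear to be related in a simple way: Base32-encoding the canonical hash bytes and keeping the first 26 characters reproduces the Pith Number, and the `pith_short_*` aliases are its prefixes. This is an observation from the values shown here, not a documented specification, and the function name below is hypothetical.

```python
import base64

CANONICAL_HASH = "1af12da572867f72b7d49b1fbdf53861f03c969f67bd14bfa560737527c09106"

def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes (RFC 4648 alphabet) and truncate.

    The 26-character length is inferred from this page, not from a spec.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]

full = pith_number_from_hash(CANONICAL_HASH)
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}

print(full)     # DLYS3JLSQZ7XFN6UTMP335JYMH
print(aliases)
```

If the relationship holds in general, a short alias can be checked against a canonical hash without contacting the service.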
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/DLYS3JLSQZ7XFN6UTMP335JYMH \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 1af12da572867f72b7d49b1fbdf53861f03c969f67bd14bfa560737527c09106
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "06d3cea63da75478b679b7149446bd9cbffc660f5823cb66c6a73be4b375e780",
    "cross_cats_sorted": [
      "cs.CV"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.RO",
    "submitted_at": "2026-05-12T18:09:04Z",
    "title_canon_sha256": "6bad7dda4b8daf3536989115aef7ed15e5ad0ae7520f41cc9dbee5ef2d6176e5"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12622",
    "kind": "arxiv",
    "version": 2
  }
}
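
The verification one-liner above can be unpacked into a small function. This sketch mirrors the same serialization (sorted keys, compact separators, non-ASCII characters left unescaped, UTF-8 bytes) and applies it to the record as displayed on this page. Whether the served `canonical_record` is byte-for-byte the JSON shown here is an assumption, so no digest value is asserted.

```python
import hashlib
import json

def canonical_sha256(record: dict) -> str:
    """SHA-256 over the compact, key-sorted JSON serialization of `record`."""
    payload = json.dumps(
        record,
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # no whitespace between tokens
        ensure_ascii=False,        # keep non-ASCII characters unescaped
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

record = {
    "metadata": {
        "abstract_canon_sha256": "06d3cea63da75478b679b7149446bd9cbffc660f5823cb66c6a73be4b375e780",
        "cross_cats_sorted": ["cs.CV"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.RO",
        "submitted_at": "2026-05-12T18:09:04Z",
        "title_canon_sha256": "6bad7dda4b8daf3536989115aef7ed15e5ad0ae7520f41cc9dbee5ef2d6176e5",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12622", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)  # compare with the canonical hash listed on this page
```

Because keys are sorted before hashing, a mirror that stores the record with a different key order still computes the same digest. List order, by contrast, is preserved, which is presumably why the category list is stored pre-sorted as `cross_cats_sorted`.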