{"paper":{"title":"EponaV2: Driving World Model with Comprehensive Future Reasoning","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"EponaV2 improves trajectory planning in autonomous driving by training world models to forecast future 3D geometry and semantic maps instead of next-frame images alone.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Jian Yang, Jia-Wang Bian, Jiawei Xu, Jin Xie, Kaicheng Zhang, Mingkai Jia, Mingxiao Li, Qian Zhang, Wei Yin, Zhijian Shu, Zhizhou Zhong","submitted_at":"2026-05-14T11:12:23Z","abstract_excerpt":"Data scaling plays a pivotal role in the pursuit of general intelligence. However, the prevailing perception-planning paradigm in autonomous driving relies heavily on expensive manual annotations to supervise trajectory planning, which severely limits its scalability. Conversely, although existing perception-free driving world models achieve impressive driving performance, their real-world reasoning ability for planning is solely built on next frame image forecasting. 
Due to the lack of enough supervision, these models often struggle with comprehensive scene understanding, resulting in unsatis"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"The state-of-the-art (SOTA) performances of EponaV2 among perception-free models on three NAVSIM benchmarks (+1.3PDMS, +5.5EPDMS) demonstrate the effectiveness of our methods.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That training the model to forecast future 3D geometry and semantic maps (decoded from the world model) will automatically produce superior real-world reasoning and trajectory planning compared to next-frame image forecasting alone.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"EponaV2 advances perception-free driving world models by forecasting comprehensive future 3D geometry and semantic representations, achieving SOTA planning performance on NAVSIM benchmarks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"EponaV2 improves trajectory planning in autonomous driving by training world models to forecast future 3D geometry and semantic maps instead of next-frame images alone.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"955f97c865e4256a01c5142bb93b611d10d5f6d71570d1a8e8b8ea7cd9ea07b5"},"source":{"id":"2605.14696","kind":"arxiv","version":1},"verdict":{"id":"896d5fb1-56e7-4e19-a194-755a060c74bb","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T05:10:19.918951Z","strongest_claim":"The state-of-the-art (SOTA) performances of EponaV2 among perception-free models on three NAVSIM benchmarks (+1.3PDMS, +5.5EPDMS) demonstrate the effectiveness of our 
methods.","one_line_summary":"EponaV2 advances perception-free driving world models by forecasting comprehensive future 3D geometry and semantic representations, achieving SOTA planning performance on NAVSIM benchmarks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That training the model to forecast future 3D geometry and semantic maps (decoded from the world model) will automatically produce superior real-world reasoning and trajectory planning compared to next-frame image forecasting alone.","pith_extraction_headline":"EponaV2 improves trajectory planning in autonomous driving by training world models to forecast future 3D geometry and semantic maps instead of next-frame images alone."},"references":{"count":99,"sample":[{"doi":"","year":2023,"title":"Building normalizing flows with stochastic interpolants","work_id":"c9718ee3-0b76-4d7e-929d-00dbb06b6e74","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Qwen3-VL Technical Report","work_id":"1fe243aa-e3c0-4da6-b391-4cbcfc88d5c0","ref_index":2,"cited_arxiv_id":"2511.21631","is_internal_anchor":true},{"doi":"","year":2025,"title":"RoboTron-Sim: Improving real-world driving via simulated hard-case","work_id":"ce8c02b5-4a89-4a81-bdf9-ac4c2c1936f1","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2019,"title":"nuScenes: A multimodal dataset for autonomous driving","work_id":"a687c611-43a9-4af4-bf00-36a2d9fa85a8","ref_index":4,"cited_arxiv_id":"1903.11027","is_internal_anchor":true},{"doi":"","year":2025,"title":"Pseudo-simulation for autonomous 
driving","work_id":"96d6d46e-aaa9-40e5-9d5e-9546559842f0","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":99,"snapshot_sha256":"9d3e7dc5a7688006c01086695ce74da2e9c92307229e2c0954ebceeed1d4f234","internal_anchors":23},"formal_canon":{"evidence_count":2,"snapshot_sha256":"9421a9f4c697273750d1f5b1343c6ee09635902c623429c98eaf5712d031c0d2"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}