DriveDPO: Policy Learning via Safety DPO for End-to-End Autonomous Driving. NeurIPS, 2025

Shuyao Shang, Yuntao Chen, Yuqi Wang, Yingyan Li, Zhaoxiang Zhang · 2025

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Distill to Think, Foresee to Act: Cognitive-Physical Reinforcement Learning for Autonomous Driving

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

CoPhy distills VLM knowledge into a BEV encoder and uses an action-conditioned auto-regressive BEV world model inside GRPO with dual physical-cognitive rewards to reach SOTA on NAVSIM v1/v2 while adding language-based intent control.

citing papers explorer

Distill to Think, Foresee to Act: Cognitive-Physical Reinforcement Learning for Autonomous Driving

cs.CV · 2026-05-20 · unverdicted · none · ref 43
