
arxiv: 2106.11810 · v4 · pith:OHWF4I4Tnew · submitted 2021-06-22 · 💻 cs.CV

NuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles

Pith reviewed 2026-05-15 08:57 UTC · model grok-4.3

classification 💻 cs.CV
keywords autonomous driving · motion planning · closed-loop evaluation · benchmark · driving dataset · reactive agents · machine learning

The pith

NuPlan establishes the first closed-loop benchmark for machine learning planners in autonomous driving.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that existing open-loop evaluation methods using short-term L2 metrics cannot properly assess long-term planning performance in autonomous vehicles. It introduces NuPlan to address this gap through a large dataset of 1500 hours of real human driving from four cities, a lightweight closed-loop simulator with reactive agents, and planning-specific metrics. A sympathetic reader would care because this setup enables fairer testing of how planners handle dynamic interactions over time, which is essential for advancing safer autonomous systems.

Core claim

We propose the world's first closed-loop ML-based planning benchmark for autonomous driving. The benchmark includes a large-scale driving dataset with 1500h of human driving data from 4 cities across the US and Asia, a closed-loop simulation framework with reactive agents, and a large set of both general and scenario-specific planning metrics.

What carries the argument

The closed-loop simulator with reactive agents that interact dynamically with the planner being tested, shifting evaluation from static short-term forecasts to interactive long-term planning outcomes.
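To make the distinction concrete, the toy sketch below runs the same one-dimensional scenario twice: once with a follower that reacts to the ego vehicle, and once with a follower that replays a fixed constant-speed log. Everything here is an assumption made for illustration: the names `Vehicle`, `reactive_follower_accel`, `run_scenario`, and `braking_planner`, the gains, and the proportional follower rule are hypothetical and are not taken from NuPlan.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    position: float  # metres along a straight lane
    speed: float     # metres per second


def reactive_follower_accel(follower, leader, desired_gap=10.0, k_gap=0.5, k_speed=1.0):
    """Hypothetical reactive rule: correct the gap error and the speed difference."""
    gap_error = (leader.position - follower.position) - desired_gap
    speed_error = leader.speed - follower.speed
    return k_gap * gap_error + k_speed * speed_error


def step(vehicle, accel, dt):
    """Advance one time step; speed is clamped at zero so vehicles never reverse."""
    new_speed = max(0.0, vehicle.speed + accel * dt)
    return Vehicle(vehicle.position + new_speed * dt, new_speed)


def run_scenario(planner, reactive, steps=100, dt=0.1):
    """Return the smallest gap between ego (in front) and the vehicle behind it."""
    ego = Vehicle(position=20.0, speed=10.0)
    follower = Vehicle(position=0.0, speed=10.0)
    min_gap = ego.position - follower.position
    for _ in range(steps):
        ego_accel = planner(ego)
        # Reactive agent responds to ego; replayed agent keeps its logged constant speed.
        follower_accel = reactive_follower_accel(follower, ego) if reactive else 0.0
        ego = step(ego, ego_accel, dt)
        follower = step(follower, follower_accel, dt)
        min_gap = min(min_gap, ego.position - follower.position)
    return min_gap


def braking_planner(ego):
    """Toy planner under test: brake at a constant 3 m/s^2."""
    return -3.0


gap_reactive = run_scenario(braking_planner, reactive=True)
gap_replay = run_scenario(braking_planner, reactive=False)
print(f"min gap with reactive follower: {gap_reactive:.1f} m")
print(f"min gap with replayed follower: {gap_replay:.1f} m")
```

With these made-up numbers the replayed follower ends up with a negative gap (it drives through the braking ego), while the reactive follower keeps a positive gap. The same planner is therefore judged differently depending only on whether the other agent responds to it.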

If this is right

  • Planners will be assessed in interactive settings where other agents respond to their actions rather than following fixed trajectories.
  • Evaluation will shift from L2-based short-term prediction scores to metrics tailored for long-term planning success and failure modes.
  • The multi-city dataset will allow testing of how well planners generalize across different traffic patterns and regions.
  • Organized benchmark challenges can standardize comparisons and accelerate development of better ML planning models.
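As a point of comparison for the second item, open-loop L2 scoring is usually computed as an average and a final displacement error against the logged expert trajectory. The sketch below is a minimal version under stated assumptions: the function name `displacement_errors` and the three trajectories are invented for illustration and do not come from the paper or its dataset.

```python
import math


def displacement_errors(planned, expert):
    """Return (average, final) Euclidean distance between two equal-length 2D trajectories."""
    if len(planned) != len(expert) or not planned:
        raise ValueError("trajectories must be non-empty and of equal length")
    dists = [math.hypot(px - ex, py - ey) for (px, py), (ex, ey) in zip(planned, expert)]
    return sum(dists) / len(dists), dists[-1]


# Made-up waypoints, one per second, as (x, y) in metres.
expert = [(float(t), 0.0) for t in range(1, 6)]
cautious = [(0.5 * t, 0.0) for t in range(1, 6)]   # same lane, half the speed
drifting = [(float(t), 1.0) for t in range(1, 6)]  # expert speed, 1 m lateral offset

ade_cautious, fde_cautious = displacement_errors(cautious, expert)
ade_drifting, fde_drifting = displacement_errors(drifting, expert)
print(ade_cautious, fde_cautious)
print(ade_drifting, fde_drifting)
```

In this toy case the laterally offset plan scores better on both errors than the slower in-lane plan. L2 distance alone cannot say which of the two is the safer behavior, because it sees neither the lane geometry nor the other agents.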

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This benchmark could reveal that many current ML planners perform worse under interactive conditions than open-loop tests suggest.
  • Researchers might extend the reactive agent behaviors using patterns from the collected driving data to increase simulation realism.
  • The framework may support hybrid evaluations that combine simulation results with limited real-vehicle validation to improve correlation.

Load-bearing premise

The chosen metrics and reactive-agent simulator will produce planner rankings that correlate with real-world safety and performance once deployed on physical vehicles.

What would settle it

Deploying several benchmark-ranked planners on physical vehicles in matching scenarios and checking whether their real-world safety records and performance match the simulated rankings.
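One way to quantify such a check is a rank correlation between simulated and real-world scores. The sketch below computes Spearman's rho from scratch, assuming no tied scores; the helper names `ranks` and `spearman_rho` and all score values are placeholders invented for illustration, not results from the paper or from any deployment.

```python
def ranks(scores):
    """Rank scores from best (1) to worst (n); assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(a, b):
    """Spearman rank correlation for two equal-length score lists without ties."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two lists of equal length with at least two entries")
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Placeholder scores for four hypothetical planners (higher is better).
simulated_scores = [0.9, 0.8, 0.6, 0.4]
real_world_scores = [0.70, 0.75, 0.50, 0.30]

rho = spearman_rho(simulated_scores, real_world_scores)
print(f"Spearman rho: {rho:.2f}")
```

A value near 1 would indicate that the simulated ranking carries over to the road; a value near 0 or below would undercut the load-bearing premise. With real data, tied scores would need an average-rank treatment, which a library routine such as `scipy.stats.spearmanr` provides.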

Original abstract

In this work, we propose the world's first closed-loop ML-based planning benchmark for autonomous driving. While there is a growing body of ML-based motion planners, the lack of established datasets and metrics has limited the progress in this area. Existing benchmarks for autonomous vehicle motion prediction have focused on short-term motion forecasting, rather than long-term planning. This has led previous works to use open-loop evaluation with L2-based metrics, which are not suitable for fairly evaluating long-term planning. Our benchmark overcomes these limitations by introducing a large-scale driving dataset, lightweight closed-loop simulator, and motion-planning-specific metrics. We provide a high-quality dataset with 1500h of human driving data from 4 cities across the US and Asia with widely varying traffic patterns (Boston, Pittsburgh, Las Vegas and Singapore). We will provide a closed-loop simulation framework with reactive agents and provide a large set of both general and scenario-specific planning metrics. We plan to release the dataset at NeurIPS 2021 and organize benchmark challenges starting in early 2022.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes NuPlan as the world's first closed-loop ML-based planning benchmark for autonomous driving. It presents a 1500-hour multi-city dataset of human driving data, a lightweight closed-loop simulator with reactive agents, and a collection of general and scenario-specific planning metrics intended to address the shortcomings of open-loop L2 evaluation for long-term motion planning.

Significance. If the simulator and metrics can be shown to produce planner rankings that correlate with real-world safety and efficiency, the benchmark would fill an important gap by enabling standardized, realistic evaluation of ML-based planners beyond short-term forecasting tasks. The scale and geographic diversity of the dataset represent a clear strength.

major comments (2)
  1. [Abstract] The manuscript contains no closed-loop experiments, ablations of agent reactivity, or comparisons against open-loop L2 baselines and real-vehicle logs. Without such evidence, the claim that the proposed metrics and reactive simulator will produce rankings predictive of real-world performance remains untested and central to the benchmark's value.
  2. [Metrics and Simulator] The general and scenario-specific metrics are described at a high level but lack explicit definitions, formulas, or pseudocode. This prevents assessment of whether they avoid the known pitfalls of open-loop evaluation and whether the simulator rules are sufficiently specified for reproducibility.
minor comments (1)
  1. [Abstract] Phrases such as 'we will provide' and 'we plan to release' indicate this is a benchmark proposal paper; the current status of the simulator implementation and metric computation code should be stated explicitly.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the scale and geographic diversity of the dataset. We agree that additional evidence and detail would strengthen the manuscript and will revise accordingly. We address each major comment below.

Point-by-point responses
  1. Referee: [Abstract] The manuscript contains no closed-loop experiments, ablations of agent reactivity, or comparisons against open-loop L2 baselines and real-vehicle logs. Without such evidence, the claim that the proposed metrics and reactive simulator will produce rankings predictive of real-world performance remains untested and central to the benchmark's value.

    Authors: We acknowledge that the current manuscript is primarily a benchmark definition paper and does not contain closed-loop planner evaluations. In the revision we will add a dedicated experiments section that runs several baseline planners (rule-based and learned) in closed-loop simulation. This will include ablations on agent reactivity levels and side-by-side comparison of closed-loop metric rankings versus open-loop L2 error on the same scenarios from the dataset. These additions will provide concrete evidence of how the benchmark behaves differently from open-loop evaluation. A full statistical correlation with real-world safety outcomes is not possible within this work, as it would require proprietary fleet testing data and deployments beyond the benchmark release. revision: yes

  2. Referee: [Metrics and Simulator] The general and scenario-specific metrics are described at a high level but lack explicit definitions, formulas, or pseudocode. This prevents assessment of whether they avoid the known pitfalls of open-loop evaluation and whether the simulator rules are sufficiently specified for reproducibility.

    Authors: We agree that the current level of detail is insufficient for reproducibility. The revised manuscript will expand both sections with explicit mathematical definitions and formulas for every metric (e.g., collision rate, route progress, comfort, and scenario-specific scores), together with pseudocode for the closed-loop simulation loop, agent state updates, and reactivity model. This will make clear how the metrics penalize unrealistic behaviors that open-loop L2 evaluation overlooks and will allow independent re-implementation of the simulator. revision: yes
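For concreteness, the kind of explicit definition the referee asks for could look like the sketch below. These are hypothetical illustrations only: the function names, the clamping of progress to [0, 1], and the finite-difference jerk estimate are assumptions, and none of them is taken from the paper or from a released NuPlan implementation.

```python
def collision_rate(collided_flags):
    """Fraction of scenarios in which the planner collided (one boolean per scenario)."""
    if not collided_flags:
        raise ValueError("need at least one scenario")
    return sum(1 for collided in collided_flags if collided) / len(collided_flags)


def route_progress(distance_travelled, route_length):
    """Share of the route completed, clamped to the range [0, 1]."""
    if route_length <= 0:
        raise ValueError("route_length must be positive")
    return min(1.0, max(0.0, distance_travelled / route_length))


def max_abs_jerk(speeds, dt):
    """Largest absolute jerk, estimated by differencing a speed trace twice."""
    accels = [(b - a) / dt for a, b in zip(speeds, speeds[1:])]
    jerks = [(b - a) / dt for a, b in zip(accels, accels[1:])]
    return max((abs(j) for j in jerks), default=0.0)


# Made-up inputs for a quick check.
print(collision_rate([True, False, False, False]))
print(route_progress(50.0, 200.0))
print(max_abs_jerk([0.0, 0.0, 1.0, 1.0], dt=1.0))
```

Writing metrics at this level of detail makes choices visible that a prose description hides, such as whether progress is clamped or how derivatives are estimated from sampled states.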

Circularity Check

0 steps flagged

No circularity: benchmark proposal with independent definitions

Full rationale

The paper introduces a new dataset (1500h from 4 cities), lightweight closed-loop simulator with reactive agents, and planning-specific metrics without any claimed derivations, equations, parameter fittings, or predictions. No load-bearing step reduces by construction to prior inputs or self-citations; the central claim is the proposal of these independent components. This matches the default expectation for non-circular benchmark papers.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The benchmark rests on the domain assumption that 1500 hours across four cities sufficiently covers the distribution of real traffic interactions and that closed-loop simulation with reactive agents approximates physical vehicle dynamics well enough for ranking planners.

axioms (1)
  • Domain assumption: The collected driving data and reactive simulator produce rankings that generalize to real-world deployment safety.
    Invoked when claiming the benchmark overcomes limitations of open-loop evaluation.

pith-pipeline@v0.9.0 · 5503 in / 1168 out tokens · 40905 ms · 2026-05-15T08:57:39.837798+00:00 · methodology



Forward citations

Cited by 35 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Bench2Drive-Robust: Benchmarking Closed-Loop Autonomous Driving under Deployment Perturbations

    cs.RO 2026-05 unverdicted novelty 7.0

    Bench2Drive-Robust is a new closed-loop benchmark that evaluates end-to-end autonomous driving models under deployment perturbations from camera failures, ego-state errors, and compute delays, showing substantial perf...

  2. MDrive: Benchmarking Closed-Loop Cooperative Driving for End-to-End Multi-agent Systems

    cs.RO 2026-05 unverdicted novelty 7.0

    MDrive benchmark shows multi-agent cooperative driving systems generally outperform single-agent ones in closed-loop settings but perception sharing does not always improve planning and negotiation can harm performanc...

  3. ReflectDrive-2: Reinforcement-Learning-Aligned Self-Editing for Discrete Diffusion Driving

    cs.RO 2026-05 unverdicted novelty 7.0

    ReflectDrive-2 achieves 91.0 PDMS on NAVSIM with camera input by training a discrete diffusion model to self-edit trajectories via RL-aligned AutoEdit.

  4. A global dataset of continuous urban dashcam driving

    cs.CV 2026-04 accept novelty 7.0

    CROWD is a new global dataset of 51,753 continuous urban dashcam segments spanning over 20,000 hours from 238 countries, with manual labels and automated object detections for routine driving analysis.

  5. C-TRAIL: A Commonsense World Framework for Trajectory Planning in Autonomous Driving

    cs.AI 2026-03 unverdicted novelty 7.0

    C-TRAIL combines LLM commonsense with a dual-trust mechanism and Dirichlet-weighted Monte Carlo Tree Search to improve trajectory planning accuracy and safety in autonomous driving.

  6. LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset

    cs.CV 2026-03 unverdicted novelty 7.0

    KITScenes LongTail supplies multimodal driving data and multilingual expert reasoning traces to benchmark models on rare scenarios beyond basic safety metrics.

  7. ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving

    cs.CV 2025-06 unverdicted novelty 7.0

    ReCogDrive unifies VLM scene understanding with a diffusion planner reinforced by DiffGRPO to reach state-of-the-art results on NAVSIM and Bench2Drive benchmarks.

  8. Distill to Think, Foresee to Act: Cognitive-Physical Reinforcement Learning for Autonomous Driving

    cs.CV 2026-05 unverdicted novelty 6.0

    CoPhy distills VLM knowledge into a BEV encoder and uses an action-conditioned auto-regressive BEV world model inside GRPO with dual physical-cognitive rewards to reach SOTA on NAVSIM v1/v2 while adding language-based...

  9. Beyond Imitation: Learning Safe End-to-End Autonomous Driving from Hard Negatives

    cs.RO 2026-05 unverdicted novelty 6.0

    BeyondDrive augments imitation learning with synthesized safety-critical negative trajectories and a repulsive loss to improve safety in autonomous driving, reporting 89.7 PDMS on NAVSIMv1 and generalization to other models.

  10. CoWorld-VLA: Thinking in a Multi-Expert World Model for Autonomous Driving

    cs.CV 2026-05 unverdicted novelty 6.0

    CoWorld-VLA encodes world information into four expert tokens that condition a diffusion-based planner, yielding competitive collision avoidance and trajectory accuracy on the NAVSIM benchmark.

  11. CoWorld-VLA: Thinking in a Multi-Expert World Model for Autonomous Driving

    cs.CV 2026-05 unverdicted novelty 6.0

    CoWorld-VLA extracts semantic, geometric, dynamic, and trajectory expert tokens from multi-source supervision and feeds them into a diffusion-based hierarchical planner, achieving competitive collision avoidance and t...

  12. Temporal Sampling Frequency Matters: A Capacity-Aware Study of End-to-End Driving Trajectory Prediction

    cs.CV 2026-05 unverdicted novelty 6.0

    Smaller end-to-end autonomous driving models achieve optimal 3-second trajectory prediction accuracy at lower or intermediate temporal sampling frequencies, whereas larger VLA-style models perform best at the highest ...

  13. DriveFuture: Future-Aware Latent World Models for Autonomous Driving

    cs.CV 2026-05 unverdicted novelty 6.0

    DriveFuture achieves SOTA results on NAVSIM by conditioning latent world model states on future predictions to directly inform trajectory planning.

  14. SceneFactory: GPU-Accelerated Multi-Agent Driving Simulation with Physics-Based Vehicle Dynamics

    cs.MA 2026-05 accept novelty 6.0

    SceneFactory delivers a batched GPU platform for physics-based multi-agent autonomous driving simulation that achieves 127x higher throughput than non-vectorized PhysX while supporting articulated dynamics and road-co...

  15. Response Time Enhances Alignment with Heterogeneous Preferences

    cs.LG 2026-05 unverdicted novelty 6.0

    Response times modeled as drift-diffusion processes enable consistent estimation of population-average preferences from heterogeneous anonymous binary choices.

  16. ReflectDrive-2: Reinforcement-Learning-Aligned Self-Editing for Discrete Diffusion Driving

    cs.RO 2026-05 unverdicted novelty 6.0

    ReflectDrive-2 combines masked discrete diffusion with RL-aligned self-editing to generate and refine driving trajectories, reaching 91.0 PDMS on NAVSIM camera-only and 94.8 in best-of-6.

  17. ProDrive: Proactive Planning for Autonomous Driving via Ego-Environment Co-Evolution

    cs.RO 2026-04 unverdicted novelty 6.0

    ProDrive couples a query-centric planner with a BEV world model for end-to-end ego-environment co-evolution, enabling future-outcome assessment that improves safety and efficiency over reactive baselines on NAVSIM v1.

  18. OneDrive: Unified Multi-Paradigm Driving with Vision-Language-Action Models

    cs.CV 2026-04 unverdicted novelty 6.0

    OneDrive unifies heterogeneous decoding in a single VLM transformer decoder for end-to-end driving, achieving 0.28 L2 error and 0.18 collision rate on nuScenes plus 86.8 PDMS on NAVSIM.

  19. Mosaic: An Extensible Framework for Composing Rule-Based and Learned Motion Planners

    cs.RO 2026-04 unverdicted novelty 6.0

    Mosaic integrates rule-based and learned planners via arbitration graphs to set new state-of-the-art scores on nuPlan and interPlan benchmarks while cutting at-fault collisions by 30%.

  20. BridgeSim: Unveiling the OL-CL Gap in End-to-End Autonomous Driving

    cs.RO 2026-04 unverdicted novelty 6.0

    The primary OL-CL gap in end-to-end autonomous driving arises from objective mismatch creating structural inability to model reactive behaviors, which a test-time adaptation method can mitigate.

  21. Evaluation as Evolution: Transforming Adversarial Diffusion into Closed-Loop Curricula for Autonomous Vehicles

    cs.RO 2026-04 unverdicted novelty 6.0

    E² uses transport-regularized sparse control on learned reverse-time SDEs with topology-driven selection and Topological Anchoring to generate realistic adversarial scenarios, improving collision discovery by 9.01% on...

  22. Unleashing the Potential of Diffusion Models for End-to-End Autonomous Driving

    cs.RO 2026-02 unverdicted novelty 6.0

    The paper introduces Hyper Diffusion Planner (HDP), a diffusion-based E2E AD framework that identifies insights on loss space, trajectory representation and data scaling, adds RL post-training, and reports 10x perform...

  23. DriveLaW: Unifying Planning and Video Generation in a Latent Driving World

    cs.CV 2025-12 unverdicted novelty 6.0

    DriveLaW unifies video world modeling and trajectory planning by injecting video-generator latents into a diffusion planner, achieving SOTA video prediction and a new record on the NAVSIM planning benchmark.

  24. Optimization-Guided Diffusion for Interactive Scene Generation

    cs.CV 2025-12 unverdicted novelty 6.0

    OMEGA guides diffusion sampling with per-step constrained optimization and game-theoretic adversarial modeling to generate physically valid and interactive driving scenes, raising valid scene ratios from 32% to 72% an...

  25. LiloDriver: A Lifelong Learning Framework for Closed-loop Motion Planning in Long-tail Autonomous Driving Scenarios

    cs.RO 2025-05 unverdicted novelty 6.0

    LiloDriver uses LLMs and memory-augmented planning in a four-stage pipeline to outperform rule-based and learning-based methods on both common and rare scenarios in the nuPlan benchmark.

  26. Enhancing End-to-End Autonomous Driving with Latent World Model

    cs.CV 2024-06 accept novelty 6.0

    LAW introduces a self-supervised prediction task on latent scene features that boosts end-to-end driving performance on nuScenes, NAVSIM, and CARLA benchmarks.

  27. Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation

    cs.CV 2024-06 unverdicted novelty 6.0

    Hydra-MDP uses multi-teacher distillation and a multi-head decoder to learn diverse, metric-specific trajectories in an end-to-end autonomous-driving planner, winning the Navsim challenge.

  28. HEAT: Heterogeneous End-to-End Autonomous Driving via Trajectory-Guided World Models

    cs.RO 2026-05 unverdicted novelty 5.0

    HEAT uses a trajectory-driven learning paradigm and a world model predicting future latent features from ego actions to enable a single unified end-to-end autonomous driving model to perform well across heterogeneous ...

  29. RLFTSim: Realistic and Controllable Multi-Agent Traffic Simulation via Reinforcement Learning Fine-Tuning

    cs.RO 2026-05 unverdicted novelty 5.0

    RLFTSim uses RL fine-tuning on a pre-trained model with a balanced reward to align traffic simulator rollouts to real data distributions and distill goal-conditioned controllability, reporting SOTA realism on the Waym...

  30. DriveSafer: End-to-End Autonomous Driving with Safety Guidance

    cs.RO 2026-05 unverdicted novelty 5.0

    DriveSafer reduces catastrophic failures (PDMS=0) by 48% and drivable-area compliance failures by over 65% versus DiffusionDrive on the NAVSIM benchmark by combining training-time safety constraints with inference-tim...

  31. Causality-Aware End-to-End Autonomous Driving via Ego-Centric Joint Scene Modeling

    cs.RO 2026-05 unverdicted novelty 5.0

    CaAD adds ego-centric joint-causal modeling and causality-aware policy alignment to end-to-end driving, reporting Driving Score 87.53 and Success Rate 71.81 on Bench2Drive plus PDMS 91.1 on NAVSIM.

  32. Causality-Aware End-to-End Autonomous Driving via Ego-Centric Joint Scene Modeling

    cs.RO 2026-05 unverdicted novelty 5.0

    CaAD adds ego-centric joint-causal modeling and causality-aware policy alignment to end-to-end driving, reporting Driving Score 87.53 and PDMS 91.1 on Bench2Drive and NAVSIM.

  33. Artificial Intelligence for Modeling and Simulation of Mixed Automated and Human Traffic

    cs.AI 2026-04 unverdicted novelty 5.0

    This survey synthesizes AI techniques for mixed autonomy traffic simulation and introduces a taxonomy spanning agent-level behavior models, environment-level methods, and cognitive/physics-informed approaches.

  34. DIVER: Reinforced Diffusion Breaks Imitation Bottlenecks in End-to-End Autonomous Driving

    cs.CV 2025-07 unverdicted novelty 5.0

    DIVER uses RL-guided diffusion to produce diverse feasible trajectories from one ground-truth path, addressing mode collapse in imitation learning for autonomous driving.

  35. CHARMS: A Cognitive Hierarchical Agent for Reasoning and Motion Stylization in Autonomous Driving

    cs.RO 2025-04 unverdicted novelty 5.0

    CHARMS applies Level-k game theory and Poisson cognitive hierarchy theory to autonomous driving agents via a two-stage RL-then-SFT pipeline for human-like decisions and realistic scenario generation.

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages · cited by 32 Pith papers

  1. [1]

    CommonRoad: Composable benchmarks for motion planning on roads

    Matthias Althoff, Markus Koschi, and Stefanie Manzinger. CommonRoad: Composable benchmarks for motion planning on roads. In Proc. of the IEEE Intelligent Vehicles Symposium, 2017.

  2. [2]

    Chauffeurnet: Learning to drive by imitating the best and synthesizing the worst

    Mayank Bansal, Alex Krizhevsky, and Abhijit Ogale. Chauffeurnet: Learning to drive by imitating the best and synthesizing the worst. In RSS, 2019.

  3. [3]

    Learning to drive from simulation without real world labels

    Alex Bewley, Jessica Rigley, Yuxuan Liu, Jeffrey Hawke, Richard Shen, Vinh-Dieu Lam, and Alex Kendall. Learning to drive from simulation without real world labels. In ICRA,

  4. [4]

    nuScenes: A multimodal dataset for autonomous driving

    Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, and Oscar Beijbom. nuScenes: A multimodal dataset for autonomous driving. In CVPR, 2020.

  5. [5]

    MP3: A unified model to map, perceive, predict and plan

    Sergio Casas, Abbas Sadat, and Raquel Urtasun. MP3: A unified model to map, perceive, predict and plan. In CVPR,

  6. [6]

    Argoverse: 3d tracking and forecasting with rich maps

    Ming-Fang Chang, John W Lambert, Patsorn Sangkloy, Jagjeet Singh, Slawomir Bak, Andrew Hartnett, De Wang, Peter Carr, Simon Lucey, Deva Ramanan, and James Hays. Argoverse: 3d tracking and forecasting with rich maps. In CVPR,

  7. [7]

    CARLA: An open urban driving simulator

    Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. CARLA: An open urban driving simulator. CoRR, 2017.

  8. [8]

    Large scale interactive motion forecasting for autonomous driving: The Waymo Open Motion Dataset

    Scott Ettinger, Shuyang Cheng, and Benjamin Caine et al. Large scale interactive motion forecasting for autonomous driving: The Waymo Open Motion Dataset. arXiv preprint arXiv:2104.10133, 2021.

  9. [9]

    Vision meets robotics: The KITTI dataset

    Andreas Geiger, Philip Lenz, Christoph Stiller, and Raquel Urtasun. Vision meets robotics: The KITTI dataset. IJRR, 32(11):1231–1237, 2013.

  10. [10]

    The efficacy of neural planning metrics: A meta-analysis of PKL on nuscenes

    Yiluan Guo, Holger Caesar, Oscar Beijbom, Jonah Philion, and Sanja Fidler. The efficacy of neural planning metrics: A meta-analysis of PKL on nuscenes. In IROS Workshop on Benchmarking Progress in Autonomous Driving, 2020.

  11. [11]

    One thousand and one hours: Self-driving motion prediction dataset

    John Houston, Guido Zuidhof, and Luca Bergamini et al. One thousand and one hours: Self-driving motion prediction dataset. arXiv preprint arXiv:2006.14480, 2020.

  12. [12]

    Pointpillars: Fast encoders for object detection from point clouds

    Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, and Oscar Beijbom. Pointpillars: Fast encoders for object detection from point clouds. In CVPR, 2019.

  13. [13]

    Conditional imitation learning driving considering camera and lidar fusion

    Eraqi Hesham M., Mohamed N. Moustafa, and Jens Honer. Conditional imitation learning driving considering camera and lidar fusion. In NeurIPS, 2020.

  14. [14]

    Simulation-based reinforcement learning for real-world autonomous driving

    Blazej Osinski, Adam Jakubowski, Pawel Ziecina, Piotr Milos, Christopher Galias, Silviu Homoceanu, and Henryk Michalewski. Simulation-based reinforcement learning for real-world autonomous driving. In ICRA, 2020.

  15. [15]

    Learning to evaluate perception models using planner-centric metrics

    Jonah Philion, Amlan Kar, and Sanja Fidler. Learning to evaluate perception models using planner-centric metrics. In CVPR, 2020.

  16. [16]

    Multi-modal fusion transformer for end-to-end autonomous driving

    Aditya Prakash, Kashyap Chitta, and Andreas Geiger. Multi-modal fusion transformer for end-to-end autonomous driving. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

  17. [17]

    Offboard 3d object detection from point cloud sequences

    Charles R. Qi, Yin Zhou, Mahyar Najibi, Pei Sun, Khoa Vo, Boyang Deng, and Dragomir Anguelov. Offboard 3d object detection from point cloud sequences. arXiv preprint arXiv:2103.05073, 2021.

  18. [18]

    Jointly learnable behavior and trajectory planning for self-driving vehicles

    Abbas Sadat, Mengye Ren, Andrei Pokrovsky, Yen-Chen Lin, Ersin Yumer, and Raquel Urtasun. Jointly learnable behavior and trajectory planning for self-driving vehicles. In IROS, 2019.

  19. [19]

    AirSim: High-fidelity visual and physical simulation for autonomous vehicles

    Shital Shah, Debadeepta Dey, Chris Lovett, and Ashish Kapoor. AirSim: High-fidelity visual and physical simulation for autonomous vehicles. In Field and Service Robotics,

  20. [20]

    End-to-end multi-modal sensors fusion system for urban automated driving

    Ibrahim Sobh, Loay Amin, Sherif Abdelkarim, Khaled Elmadawy, Mahmoud Saeed, Omar Abdeltawab, Mostafa Gamal, and Ahmad El Sallab. End-to-end multi-modal sensors fusion system for urban automated driving. In NeurIPS,

  21. [21]

    Learning to track with object permanence

    Pavel Tokmakov, Jie Li, Wolfram Burgard, and Adrien Gaidon. Learning to track with object permanence. arXiv preprint arXiv:2103.14258, 2021.

  22. [22]

    Yi Xiao, Felipe Codevilla, Akhil Gurram, Onay Urfalioglu, and Antonio M. López. Multimodal end-to-end autonomous driving. arXiv preprint arXiv:1906.03199, 2019.

  23. [23]

    Center-based 3d object detection and tracking

    Tianwei Yin, Xingyi Zhou, and Philipp Krähenbühl. Center-based 3d object detection and tracking. arXiv preprint arXiv:2006.11275, 2020.

  24. [24]

    End-to-end interpretable neural motion planner

    Wenyuan Zeng, Wenjie Luo, Simon Suo, Abbas Sadat, Bin Yang, Sergio Casas, and Raquel Urtasun. End-to-end interpretable neural motion planner. In CVPR, 2021.