Pith · machine review for the scientific record

arXiv:2603.17714 · v2 · submitted 2026-03-18 · 💻 cs.AI

Recognition: 1 theorem link · Lean Theorem

From Virtual Environments to Real-World Trials: Emerging Trends in Autonomous Driving

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 08:58 UTC · model grok-4.3

classification: 💻 cs.AI
keywords: autonomous driving · synthetic data · virtual environments · digital twins · domain adaptation · simulation · perception · safety validation

The pith

Synthetic data and virtual environments provide scalable scenarios for training and validating autonomous driving systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This survey reviews how synthetic data and virtual environments address data scarcity and safety constraints in autonomous driving development. It organizes recent work into three areas: synthetic data for perception and planning tasks, digital twin simulations for system validation, and domain adaptation techniques that connect simulated and real data. The paper also covers vision-language models for better scene understanding and trends in benchmark design. It concludes by outlining open challenges such as Sim2Real transfer and scalable safety validation that must be resolved for real-world deployment.

Core claim

The paper establishes that synthetic data and virtual environments have emerged as powerful enablers for autonomous driving, supplying scalable, controllable, and richly annotated scenarios for training and evaluation. It organizes the landscape across three dimensions: synthetic data for perception and planning, digital twin-based validation, and domain adaptation strategies that bridge synthetic and real-world data.

What carries the argument

The taxonomy of datasets, simulation platforms, and domain adaptation methods that organizes the use of synthetic data and digital twins across perception, planning, and validation.
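
To make that organizing move concrete, here is a minimal sketch (not from the paper) of how such a taxonomy could be encoded as a queryable data structure. The field names, the `by_dimension` helper, and the single CARLA [5] entry are illustrative assumptions, not the survey's actual schema.

```python
# Hypothetical encoding of the survey's three-dimension taxonomy (Python 3.10+).
from dataclasses import dataclass, field

@dataclass
class SurveyedWork:
    title: str
    dimension: str          # "synthetic_data" | "digital_twin" | "domain_adaptation"
    tasks: list[str] = field(default_factory=list)   # e.g. ["perception", "planning"]
    platform: str | None = None                      # e.g. "CARLA"

# Illustrative entry: CARLA [5], an open urban driving simulator.
carla = SurveyedWork(
    title="CARLA: An Open Urban Driving Simulator",
    dimension="synthetic_data",
    tasks=["perception", "planning"],
    platform="CARLA",
)

def by_dimension(works: list[SurveyedWork], dim: str) -> list[SurveyedWork]:
    """Filter the taxonomy along one of the survey's three core dimensions."""
    return [w for w in works if w.dimension == dim]

print([w.title for w in by_dimension([carla], "synthetic_data")])
```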

If this is right

  • Synthetic data supports training of perception and planning modules with controllable and annotated scenarios.
  • Digital twin-based simulation enables system-level validation without exposing real vehicles to risk.
  • Domain adaptation strategies improve transfer performance from synthetic training to real-world conditions.
  • Vision-language models enhance scene understanding and generalization within simulated environments.
  • Resolving challenges in Sim2Real transfer and scalable safety validation will support broader deployment of autonomous systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • These approaches could reduce the volume of expensive real-world data collection required for development.
  • Multi-agent virtual environments might support testing of cooperative autonomy across fleets.
  • Simulation-driven policy learning could accelerate reinforcement learning for complex driving behaviors.
  • Extending the taxonomy to include more diverse global weather and cultural scenarios would test broader generalization.

Load-bearing premise

That the reviewed strategies in perception, digital twins, and domain adaptation will sufficiently overcome data scarcity, safety requirements, and generalization barriers to enable safe real-world deployment.
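
To pin down what "domain adaptation strategies" means operationally, below is a minimal, hedged sketch of one family the survey covers: domain-adversarial feature alignment (DANN-style gradient reversal) between labeled synthetic and unlabeled real images. The network sizes, loss weighting, and tensor shapes are hypothetical; the paper surveys such methods without prescribing this implementation.

```python
# Sketch of domain-adversarial training: a task head is supervised on synthetic
# data while a domain classifier, fed through a gradient-reversal layer, pushes
# the feature extractor toward synthetic/real-invariant features.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        # Reverse (and scale) gradients flowing back into the feature extractor.
        return -ctx.lam * grad_out, None

feature = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU())
task_head = nn.Linear(128, 10)    # hypothetical object classes
domain_head = nn.Linear(128, 2)   # synthetic vs. real

opt = torch.optim.Adam([*feature.parameters(), *task_head.parameters(),
                        *domain_head.parameters()], lr=1e-3)
ce = nn.CrossEntropyLoss()

# Hypothetical mini-batches: labeled synthetic images, unlabeled real images.
syn_x, syn_y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
real_x = torch.randn(8, 3, 32, 32)

f_syn, f_real = feature(syn_x), feature(real_x)
task_loss = ce(task_head(f_syn), syn_y)            # supervised only on synthetic
feats = torch.cat([f_syn, f_real])
dom_y = torch.cat([torch.zeros(8), torch.ones(8)]).long()
dom_loss = ce(domain_head(GradReverse.apply(feats, 1.0)), dom_y)

opt.zero_grad(); (task_loss + dom_loss).backward(); opt.step()
```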

What would settle it

A real-world trial comparing an autonomous driving system trained primarily on the surveyed synthetic datasets and platforms against one trained on matched real data. Significantly higher failure rates or safety violations for the synthetic-trained system would indicate the enablers are insufficient.
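
As one concrete reading of that test, such a trial could be scored with a two-proportion comparison of safety-violation rates. The sketch below is hypothetical; the counts are placeholders, not data from the paper or any actual trial.

```python
# Two-proportion z-test: is system A's failure rate significantly higher than B's?
from math import sqrt
from statistics import NormalDist

def two_proportion_z(fail_a: int, n_a: int, fail_b: int, n_b: int) -> float:
    """One-sided p-value that system A's failure rate exceeds system B's."""
    p_a, p_b = fail_a / n_a, fail_b / n_b
    pooled = (fail_a + fail_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 1 - NormalDist().cdf(z)

# Placeholder trial: 1,000 scenarios each; 30 vs. 18 safety violations.
p = two_proportion_z(30, 1000, 18, 1000)
print(f"p = {p:.3f}")  # a small p here would suggest the synthetic enablers fall short
```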

Figures

Figures reproduced from arXiv: 2603.17714 by A. Behera, A. Humnabadkar, A. Sikdar, B. Cave, H. Zhang, N. Bessis.

Figure 1: Illustration of a traditional AV perception-control pipeline. Sensor inputs (cameras, radar, LiDAR, GPS) capture environmental data, which is filtered […]
Figure 2: A unified framework for driving scene understanding where several […]
Figure 3: Showcasing a DriveGPT4 [75] workflow in which natural language inputs from users are interpreted by LLM agents to allocate 3D models, gather relevant […]
Figure 4: Illustration of the integration of static, dynamic, and external factors into a unified framework for driving scene understanding. Together, they […]
Figure 5: The Venn diagram shows the distribution of recent driving datasets across annotation types. About 50–60% of datasets provide only 2D bounding […]
Figure 6: A cyclical pipeline of Real2Sim and Sim2Real bridges the domain gap between synthetic training environments and real-world conditions. In the phase […]
Original abstract

Autonomous driving technologies have achieved significant advances in recent years, yet their real-world deployment remains constrained by data scarcity, safety requirements, and the need for generalization across diverse environments. In response, synthetic data and virtual environments have emerged as powerful enablers, offering scalable, controllable, and richly annotated scenarios for training and evaluation. This survey presents a comprehensive review of recent developments at the intersection of autonomous driving, simulation technologies, and synthetic datasets. We organize the landscape across three core dimensions: (i) the use of synthetic data for perception and planning, (ii) digital twin-based simulation for system validation, and (iii) domain adaptation strategies bridging synthetic and real-world data. We also highlight the role of vision-language models and simulation realism in enhancing scene understanding and generalization. A detailed taxonomy of datasets, tools, and simulation platforms is provided, alongside an analysis of trends in benchmark design. Finally, we discuss critical challenges and open research directions, including Sim2Real transfer, scalable safety validation, cooperative autonomy, and simulation-driven policy learning, that must be addressed to accelerate the path toward safe, generalizable, and globally deployable autonomous driving systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated author's rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript is a literature survey reviewing recent advances at the intersection of autonomous driving, simulation, and synthetic data. It organizes the field into three core dimensions—synthetic data for perception and planning, digital-twin simulation for validation, and domain-adaptation strategies—while providing a taxonomy of datasets and platforms, analyzing benchmark trends, and discussing open challenges including Sim2Real transfer, scalable safety validation, and simulation-driven policy learning.

Significance. If the synthesis accurately captures the cited literature, the survey offers a structured map of an active research area that directly addresses data scarcity and safety barriers in autonomous driving. By consolidating trends across perception, validation, and adaptation, and by naming concrete open directions such as cooperative autonomy, it can serve as a useful reference for prioritizing future work on generalizable, deployable systems.

major comments (2)
  1. [Introduction and taxonomy sections] The central claim that synthetic data and virtual environments supply 'scalable, controllable, and richly annotated scenarios' is presented as a synthesis; however, the survey does not include a quantitative meta-analysis (e.g., average annotation density or scenario coverage statistics across the cited datasets) that would make this claim falsifiable rather than descriptive.
  2. [Domain adaptation strategies] In the domain-adaptation discussion, the weakest assumption—that reviewed strategies will sufficiently overcome generalization barriers—is not stress-tested against failure cases; the manuscript would be strengthened by citing at least one empirical study where Sim2Real transfer degraded performance on a specific real-world distribution shift.
minor comments (3)
  1. [Abstract] The abstract lists three dimensions but does not indicate the temporal scope (e.g., papers from 2020–2024) or the approximate number of works surveyed, which would help readers gauge coverage.
  2. [Taxonomy of datasets and tools] Figure captions for the taxonomy diagrams should explicitly state the inclusion criteria used to select the listed datasets and platforms.
  3. [References] A few citations appear to be preprints; the reference list should note their archival status or update them to published versions where available.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for minor revision. We address each major comment point by point below, indicating where revisions will be made to improve the manuscript's rigor and balance.

Point-by-point responses
  1. Referee: [Introduction and taxonomy sections] The central claim that synthetic data and virtual environments supply 'scalable, controllable, and richly annotated scenarios' is presented as a synthesis; however, the survey does not include a quantitative meta-analysis (e.g., average annotation density or scenario coverage statistics across the cited datasets) that would make this claim falsifiable rather than descriptive.

    Authors: We agree that adding quantitative support would make the synthesis more robust and falsifiable. A comprehensive meta-analysis re-processing raw data from every cited work exceeds the scope of a survey paper. However, we will add a new table in the taxonomy section that aggregates reported statistics (such as annotation counts, scenario coverage, and diversity metrics) directly from the original dataset papers where these figures are provided. This will ground the claim in concrete numbers from the literature while remaining within survey conventions. revision: partial
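
For illustration, the table the authors propose could be assembled as below. The dataset names and annotation descriptions follow the survey's own citations ([6], [7]); the "?" entries are placeholders standing in for statistics that would be copied from the original dataset papers, not values asserted here.

```python
# Sketch of the proposed taxonomy table: per-dataset statistics aggregated
# from the figures reported in the original dataset papers.
rows = [
    # Annotation/coverage columns paraphrase the cited titles; frame counts
    # are left as "?" placeholders pending the source papers' reported numbers.
    ("SYNTHIA [7]", "?", "semantic segmentation", "urban scenes"),
    ("Virtual KITTI [6]", "?", "multi-object tracking", "virtual proxy worlds"),
]
header = ("Dataset", "Frames", "Annotation types", "Scenario coverage")

widths = [max(len(str(r[i])) for r in rows + [header]) for i in range(4)]
for row in [header] + rows:
    print("  ".join(str(c).ljust(w) for c, w in zip(row, widths)))
```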

  2. Referee: [Domain adaptation strategies] In the domain-adaptation discussion, the weakest assumption—that reviewed strategies will sufficiently overcome generalization barriers—is not stress-tested against failure cases; the manuscript would be strengthened by citing at least one empirical study where Sim2Real transfer degraded performance on a specific real-world distribution shift.

    Authors: We accept this point and will revise the domain-adaptation section to explicitly discuss limitations. We will add citations to empirical studies demonstrating Sim2Real degradation, for example on shifts involving novel weather conditions, lighting variations, or unseen object classes in urban environments. This addition will better stress-test the reviewed strategies and align with the open challenges already noted in the manuscript. revision: yes

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a survey, the central claims rest on the authors' selection and categorization of existing literature; the paper itself introduces no free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5519 in / 1141 out tokens · 36055 ms · 2026-05-15T08:58:07.785143+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

193 extracted references · 193 canonical work pages · 5 internal anchors

  [1] E. Blasch, T. Pham, C.-Y. Chong et al., "Machine learning/artificial intelligence for sensor data fusion–opportunities and challenges," IEEE Aerosp. Electron. Syst. Mag., vol. 36, no. 7, pp. 80–93, 2021.
  [2] E. Thomas, C. McCrudden, Z. Wharton et al., "Perception of autonomous vehicles by the modern society: A survey," IET Intell. Transp. Syst., vol. 14, no. 10, pp. 1228–1239, 2020.
  [3] K. Muhammad, A. Ullah, J. Lloret et al., "Deep learning for safe autonomous driving: Current challenges and future directions," IEEE Trans. Intell. Transp. Syst., vol. 22, no. 7, pp. 4316–4336, 2020.
  [4] A. Kar, A. Prakash, M.-Y. Liu et al., "Meta-Sim: Learning to generate synthetic datasets," in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), 2019, pp. 4551–4560.
  [5] A. Dosovitskiy, G. Ros, F. Codevilla et al., "CARLA: An open urban driving simulator," in Conf. on Robot Learning. PMLR, 2017, pp. 1–16.
  [6] A. Gaidon, Q. Wang, Y. Cabon et al., "Virtual worlds as proxy for multi-object tracking analysis," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016, pp. 4340–4349.
  [7] G. Ros, L. Sellart, J. Materzynska et al., "The SYNTHIA dataset: A large collection of synthetic images for semantic segmentation of urban scenes," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016, pp. 3234–3243.
  [8] L. Wang, R. Guo, Q. Vuong et al., "A Real2Sim2Real method for robust object grasping with neural surface reconstruction," in IEEE Int. Conf. Autom. Sci. Eng. (CASE), 2023, pp. 1–8.
  [9] V. U. Prabhu, D. Acuna, R. Mahmood et al., "Bridging the Sim2Real gap with CARE: Supervised detection adaptation with conditional alignment and reweighting," Trans. Mach. Learn. Res., 2023. [Online]. Available: https://openreview.net/forum?id=lAQQx7hlku
  [10] A. Stocco, B. Pulfer, and P. Tonella, "Mind the gap! A study on the transferability of virtual versus physical-world testing of autonomous driving systems," IEEE Trans. Softw. Eng., vol. 49, no. 4, pp. 1928–1940, 2022.
  [11] J. Zhang, J. Huang, S. Jin et al., "Vision-language models for vision tasks: A survey," IEEE Trans. Pattern Anal. Mach. Intell., 2024.
  [12] M. Bojarski, D. Del Testa, D. Dworakowski et al., "End to end learning for self-driving cars," arXiv preprint arXiv:1604.07316, 2016.
  [13] J. Redmon, S. Divvala, R. Girshick et al., "You only look once: Unified, real-time object detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016, pp. 779–788.
  [14] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv preprint arXiv:1804.02767, 2018.
  [15] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot multibox detector," in European Conf. on Computer Vision. Springer, 2016, pp. 21–37.
  [16] V. A. Aher, S. R. Jondhale, B. S. Agarkar et al., "Advances in deep learning-based object detection and tracking for autonomous driving: A review and future directions," in Multi-Strategy Learning Environment. Springer, 2024, pp. 569–581.
  [17] S. Mozaffari, O. Y. Al-Jarrah, M. Dianati et al., "Deep learning-based vehicle behaviour prediction for autonomous driving applications: A review," IEEE Trans. Intell. Transp. Syst., vol. 22, no. 7, pp. 3716–3735, 2021.
  [18] A. Pravallika, M. F. Hashmi, and A. Gupta, "Deep learning frontiers in 3D object detection: A comprehensive review for autonomous driving," IEEE Access, vol. 12, pp. 173936–173980, 2024.
  [19] L. Du, "Object detectors in autonomous vehicles: Analysis of deep learning techniques," Int. J. Adv. Comput. Sci. Appl., vol. 14, no. 10, 2023. [Online]. Available: http://dx.doi.org/10.14569/IJACSA.2023.0141024
  [20] A. H. Lang, S. Vora, H. Caesar et al., "PointPillars: Fast encoders for object detection from point clouds," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2019, pp. 12697–12705.
  [21] J. Philion and S. Fidler, "Lift, splat, shoot: Encoding images from arbitrary camera rigs by implicitly unprojecting to 3D," in Proc. Eur. Conf. Comput. Vis. (ECCV). Springer, 2020, pp. 194–210.
  [22] Z. Liu, H. Tang, A. Amini et al., "BEVFusion: Multi-task multi-sensor fusion with unified bird's-eye view representation," arXiv preprint arXiv:2205.13542, 2022.
  [23] J. Luettin, S. Monka, C. Henson et al., "A survey on knowledge graph-based methods for automated driving," in Knowledge Graphs and Semantic Web, 2022, pp. 16–31.
  [24] E. Meyer, M. Brenner, B. Zhang et al., "Geometric deep learning for autonomous driving: Unlocking the power of graph neural networks with commonroad-geometric," in IEEE Intell. Veh. Symp. (IV), 2023, pp. 1–8.
  [25] H. Li, Y. Zhao, Z. Mao et al., "Graph neural networks in intelligent transportation systems: Advances, applications and trends," arXiv preprint arXiv:2401.00713, 2024.
  [26] S. N. N. Htun and K. Fukuda, "Integrating knowledge graphs into autonomous vehicle technologies: A survey of current state and future directions," Information, vol. 15, no. 10, p. 645, 2024.
  [27] Y. Wang, Z. Liu, H. Lin et al., "PreGSU: A generalized traffic scene understanding model for autonomous driving based on pretrained graph attention network," IEEE Trans. Syst. Man Cybern. Syst., vol. 55, no. 12, pp. 9604–9616, 2025.
  [28] A. Paszke, A. Chaurasia, S. Kim et al., "ENet: A deep neural network architecture for real-time semantic segmentation," arXiv preprint arXiv:1606.02147, 2016.
  [29] H. Zhao, X. Qi, X. Shen et al., "ICNet for real-time semantic segmentation on high-resolution images," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2018, pp. 405–420.
  [30] L.-C. Chen, Y. Zhu, G. Papandreou et al., "Encoder-decoder with atrous separable convolution for semantic image segmentation," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2018, pp. 801–818.
  [31] J. Sun, Z. Liu, J. Jia et al., "High-resolution representations for labeling pixels and regions," arXiv preprint arXiv:1904.04514, 2019.
  [32] A. Alahi, K. Goel, V. Ramanathan et al., "Social LSTM: Human trajectory prediction in crowded spaces," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016, pp. 961–971.
  [33] N. Deo and M. M. Trivedi, "Multi-modal trajectory prediction of surrounding vehicles with maneuver-based LSTMs," in IEEE Intell. Veh. Symp. (IV). IEEE, 2018, pp. 1179–1184.
  [34] S. Zhang, J. Wu, and H. Ogai, "Tactical decision-making for autonomous driving using dueling double deep Q network with double attention," IEEE Access, vol. 9, pp. 151983–151992, 2021.
  [35] S. Grigorescu, B. Trasnea, T. Cocias et al., "A survey of deep learning techniques for autonomous driving," J. Field Robot., vol. 37, no. 3, pp. 362–386, 2020.
  [36] P. S. Chib and P. Singh, "Recent advancements in end-to-end autonomous driving using deep learning: A survey," IEEE Trans. Intell. Veh., 2023.
  [37] Y. Kong and Y. Fu, "Human action recognition and prediction: A survey," Int. J. Comput. Vis., vol. 130, no. 5, pp. 1366–1401, 2022.
  [38] A. Kendall, J. Hawke, D. Janz et al., "Learning to drive in a day," in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), 2019, pp. 8248–8254.
  [39] D. Lee and M. Kwon, "Instant inverse modeling of stochastic driving behavior with deep reinforcement learning," IEEE Trans. Consum. Electron., 2024.
  [40] D. Lee and M. Kwon, "ADAS-RL: Safety learning approach for stable autonomous driving," ICT Express, vol. 8, no. 3, pp. 479–483, 2022.
  [41] D. Lee and M. Kwon, "Stability analysis in mixed-autonomous traffic with deep reinforcement learning," IEEE Trans. Veh. Technol., vol. 72, no. 3, pp. 2848–2862, 2022.
  [42] B. R. Kiran, I. Sobh, V. Talpaert et al., "Deep reinforcement learning for autonomous driving: A survey," IEEE Trans. Intell. Transp. Syst., vol. 23, no. 6, pp. 4909–4926, 2021.
  [43] S. Aradi, "Survey of deep reinforcement learning for motion planning of autonomous vehicles," IEEE Trans. Intell. Transp. Syst., vol. 23, no. 2, pp. 740–759, 2020.
  [44] R. Zhao, Y. Li, Y. Fan et al., "A survey on recent advancements in autonomous driving using deep reinforcement learning: Applications, challenges, and solutions," IEEE Trans. Intell. Transp. Syst., 2024.
  [45] T.-A.-Q. Nguyen, L. Roldão, N. Piasco et al., "RODUS: Robust decomposition of static and dynamic elements in urban scenes," in Proc. Eur. Conf. Comput. Vis. (ECCV). Springer, 2024, pp. 112–130.
  [46] P. Wei, L. Kong, X. Qu et al., "Unsupervised video domain adaptation for action recognition: A disentanglement perspective," Adv. Neural Inf. Process. Syst., vol. 36, pp. 17623–17642, 2023.
  [47] T. Wu, F. Zhong, A. Tagliasacchi et al., "D^2NeRF: Self-supervised decoupling of dynamic and static objects from a monocular video," Adv. Neural Inf. Process. Syst., vol. 35, pp. 32653–32666, 2022.
  [48] B. Wen, H. Xie, Z. Chen et al., "3D scene generation: A survey," arXiv preprint arXiv:2505.05474, 2025.
  [49] H. Xie, Z. Chen, F. Hong et al., "CityDreamer: Compositional generative model of unbounded 3D cities," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2024, pp. 9666–9675.
  [50] H. Xie, Z. Chen, F. Hong et al., "Compositional generative model of unbounded 4D cities," IEEE Trans. Pattern Anal. Mach. Intell., vol. 48, no. 1, pp. 312–328, 2026.
  [51] R. Gao, K. Chen, E. Xie et al., "MagicDrive: Street view generation with diverse 3D geometry control," in Proc. Int. Conf. Learn. Represent. (ICLR), 2024.
  [52] X. Zhou, Z. Lin, X. Shan et al., "DrivingGaussian: Composite Gaussian splatting for surrounding dynamic autonomous driving scenes," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2024, pp. 21634–21643.
  [53] Z. Jiang, Y. Zhang et al., "VAD: Vectorized scene representation for efficient autonomous driving," in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), 2023.
  [54] Y. Liu and S. Diao, "An automatic driving trajectory planning approach in complex traffic scenarios based on integrated driver style inference and deep reinforcement learning," PLoS ONE, vol. 19, no. 1, p. e0297192, 2024.
  [55] P. Li, A. Kusari, and D. J. LeBlanc, "A novel traffic simulation framework for testing autonomous vehicles using SUMO and CARLA," arXiv preprint arXiv:2110.07111, 2021.
  [56] K. Huang, B. Shi, X. Li et al., "Multi-modal sensor fusion for auto driving perception: A survey," arXiv preprint arXiv:2202.02703, 2022.
  [57] T. Qian, J. Chen, L. Zhuo et al., "NuScenes-QA: A multi-modal visual question answering benchmark for autonomous driving scenario," in Proc. AAAI Conf. Artif. Intell., vol. 38, no. 5, 2024, pp. 4542–4550.
  [58] H. Zhao, Y. Wang, T. Bashford-Rogers et al., "Exploring generative AI for Sim2Real in driving data synthesis," in IEEE Intell. Veh. Symp. (IV), 2024, pp. 3071–3077.
  [59] C. Hu, S. Hudson, M. Ethier et al., "Sim-to-real domain adaptation for lane detection and classification in autonomous driving," in IEEE Intell. Veh. Symp. (IV). IEEE, 2022, pp. 457–463.
  [60] J. Revell, D. Welch, and J. Hereford, "Sim2Real: Issues in transferring autonomous driving model from simulation to real world," in SoutheastCon 2022. IEEE, 2022, pp. 296–301.
  [61] X. Hu, S. Li, T. Huang et al., "How simulation helps autonomous driving: A survey of Sim2Real, digital twins, and parallel intelligence," IEEE Trans. Intell. Veh., vol. 9, no. 1, pp. 593–612, 2024.
  [62] Z. Song, Z. He, X. Li et al., "Synthetic datasets for autonomous driving: A survey," IEEE Trans. Intell. Veh., 2023.
  [63] X. Zhou, M. Liu, E. Yurtsever et al., "Vision language models in autonomous driving: A survey and outlook," IEEE Trans. Intell. Veh., 2024.
  [64] M. Liu, E. Yurtsever, J. Fossaert et al., "A survey on autonomous driving datasets: Statistics, annotation quality, and a future outlook," IEEE Trans. Intell. Veh., pp. 1–29, 2024.
  [65] S. Sarker, B. Maples, and W. Li, "A comprehensive review on traffic datasets and simulators for autonomous vehicles," arXiv preprint arXiv:2412.14207, 2024.
  [66] G. Paulin and M. Ivasic-Kos, "Review and analysis of synthetic dataset generation methods and techniques for application in computer vision," Artif. Intell. Rev., vol. 56, no. 9, pp. 9221–9265, 2023.
  [67] S. Sudhakar, J. Hanzelka, J. Bobillot et al., "Exploring the Sim2Real gap using digital twins," in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), 2023, pp. 20418–20427.
  [68] K. Wang, T. Yu, Z. Li et al., "Digital twins for autonomous driving: A comprehensive implementation and demonstration," in Proc. Int. Conf. Inf. Netw. (ICOIN). IEEE, 2024, pp. 452–457.
  [69] E. Yurtsever, J. Lambert, A. Carballo et al., "A survey of autonomous driving: Common practices and emerging technologies," IEEE Access, vol. 8, pp. 58443–58469, 2020.
  [70] D. Omeiza, H. Webb, M. Jirotka et al., "Explanations in autonomous driving: A survey," IEEE Trans. Intell. Transp. Syst., vol. 23, no. 8, pp. 10142–10162, 2021.
  [71] L. Chen, P. Wu, K. Chitta et al., "End-to-end autonomous driving: Challenges and frontiers," IEEE Trans. Pattern Anal. Mach. Intell., 2024.
  [72] Y. Guan, H. Liao, Z. Li et al., "World models for autonomous driving: An initial survey," IEEE Trans. Intell. Veh., 2024.
  [73] L. Chen, Y. Li, C. Huang et al., "Milestones in autonomous driving and intelligent vehicles: Survey of surveys," IEEE Trans. Intell. Veh., vol. 8, no. 2, pp. 1046–1056, 2022.
  [74] P. Ji, R. Li, Y. Xue et al., "Perspective, survey and trends: Public driving datasets and toolsets for autonomous driving virtual test," in Proc. IEEE Int. Intell. Transp. Syst. Conf. (ITSC), 2021, pp. 264–269.
  [75] Z. Xu, Y. Zhang, E. Xie et al., "DriveGPT4: Interpretable end-to-end autonomous driving via large language model," IEEE Robot. Autom. Lett., 2024.
  [76] T. Deruyttere, D. Grujicic, M. B. Blaschko et al., "Talk2Car: Predicting physical trajectories for natural language commands," IEEE Access, vol. 10, pp. 123809–123834, 2022.
  [77] L. Wen, D. Fu, X. Li et al., "DiLu: A knowledge-driven approach to autonomous driving with large language models," in Proc. Int. Conf. Learn. Represent. (ICLR), 2024.
  [78] H. Sha, Y. Mu, Y. Jiang et al., "Large language models as decision makers for autonomous driving," 2024. [Online]. Available: https://openreview.net/forum?id=NkYCuGM7E2
  [79] J. Zhang, K. Wang, R. Xu et al., "NaVid: Video-based VLM plans the next step for vision-and-language navigation," in Robotics: Science and Systems, 2024.
  [80] C. Sima, K. Renz, K. Chitta et al., "DriveLM: Driving with graph visual question answering," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2024.

Showing first 80 references.