pith. sign in

arxiv: 2605.05897 · v1 · submitted 2026-05-07 · 💻 cs.RO

Generating Roadside LiDAR Datasets from Vehicle-Side Datasets via Novel View Synthesis

Pith reviewed 2026-05-08 09:20 UTC · model grok-4.3

classification 💻 cs.RO
keywords LiDAR synthesisnovel view synthesisroadside perception3D object detectionpoint cloud completionvisibility constraintV2Xdataset augmentation
0
0 comments X

The pith

Vehicle LiDAR data can be turned into labeled roadside LiDAR data through novel view synthesis to improve 3D object detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to generate realistic roadside LiDAR point clouds and their labels starting from vehicle-mounted LiDAR scans. Roadside perception for traffic safety suffers from scarce annotated data, while vehicle data is more plentiful. The method completes missing parts of vehicle point clouds and adds an occupancy visibility rule to simulate large viewpoint shifts. When the resulting synthetic roadside data is added to real roadside collections, detection models perform better and handle new sensor positions more reliably. A reader cares because this offers a practical way to expand training sets without new roadside deployments.

Core claim

The Vehicle-to-Roadside LiDAR Synthesis framework generates labeled roadside LiDAR datasets from vehicle-side datasets via LiDAR novel view synthesis. Vehicle point cloud completion compensates for missing geometry while an occupancy-based visibility constraint manages large viewpoint changes during rendering. The framework supports flexible multi-view generation, and mixing the synthetic data with real roadside data raises 3D object detection accuracy and improves generalization to unseen roadside viewpoints.

What carries the argument

The VRS framework, which performs novel view synthesis on completed vehicle point clouds subject to an occupancy-based visibility constraint.

If this is right

  • Synthetic roadside data can be mixed with limited real roadside data to raise detection accuracy.
  • Models trained this way generalize better to roadside viewpoints absent from the original training set.
  • The same vehicle datasets can produce multiple roadside views without additional collection.
  • Flexible rendering supports creation of large-scale labeled roadside datasets at low marginal cost.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could reduce reliance on expensive roadside sensor installations by reusing vehicle data.
  • It might support rapid prototyping of perception models for arbitrary roadside placements.
  • Combining the method with camera or radar synthesis could produce multi-modal roadside datasets.
  • Testing the same pipeline on different LiDAR beam counts or weather conditions would check robustness.

Load-bearing premise

Vehicle point cloud completion combined with the occupancy-based visibility constraint closes the vehicle-to-roadside domain gap enough for the synthetic data to improve real-world detector performance.

What would settle it

Training a roadside 3D object detector on real roadside data alone versus the same real data plus the synthesized data, then measuring lower or equal average precision on a held-out real roadside test set, would falsify the claim.

Figures

Figures reproduced from arXiv: 2605.05897 by Chunxiang Wang, Hanyang Zhuang, Ming Yang, Runxin Zhao, Yuhan Xia.

Figure 1
Figure 1. Figure 1: Overview of VRS. Our method takes annotated vehicle-side point cloud data as input. VRS first decomposes the view at source ↗
Figure 2
Figure 2. Figure 2: The pipeline of vehicle point cloud completion. view at source ↗
Figure 3
Figure 3. Figure 3: For sufficient multi-view coverage, we arrange multiple view at source ↗
Figure 3
Figure 3. Figure 3: Illustration of the multi-view ray sampling strategy. view at source ↗
Figure 4
Figure 4. Figure 4: VRS generates point cloud data for 6 corresponding virtual roadside LiDAR poses based on vehicle-side LiDAR view at source ↗
Figure 5
Figure 5. Figure 5: Illustration of the occupancy-based visibility constraint view at source ↗
Figure 6
Figure 6. Figure 6: 3D object detection results under four different training data configurations. All models are evaluated on the Real-Road view at source ↗
Figure 7
Figure 7. Figure 7: Effect of VRS synthetic data under different real view at source ↗
Figure 8
Figure 8. Figure 8: Qualitative ablation results of key modules. view at source ↗
read the original abstract

Intelligent Transportation Systems (ITS) require reliable environmental perception to support safe and efficient transportation. With the rapid development of Vehicle-to-everything (V2X), roadside perception has become an effective means to extend sensing coverage and improve traffic safety. However, the scarcity of large-scale annotated roadside LiDAR datasets poses a major challenge for training high-performance roadside perception models. In this paper, we introduce Vehicle-to-Roadside LiDAR Synthesis (VRS), a data synthesis framework that generates labeled roadside LiDAR datasets from vehicle-side datasets via LiDAR novel view synthesis. To mitigate the vehicle-to-roadside domain gap, VRS employs vehicle point cloud completion to compensate for missing geometry in vehicle-side observations, and introduces an occupancy-based visibility constraint to handle large viewpoint changes during cross-view rendering. The proposed framework enables flexible multi-view rendering for scalable roadside data generation. Extensive experiments on roadside 3D object detection demonstrate that the synthesized data effectively complements real roadside data, mitigates the limitations of limited real-world roadside data, and improves generalization to unseen roadside viewpoints.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents Vehicle-to-Roadside LiDAR Synthesis (VRS), a framework for generating labeled roadside LiDAR datasets from vehicle-side LiDAR data using novel view synthesis techniques. The approach incorporates vehicle point cloud completion to address occlusions and missing geometry, along with an occupancy-based visibility constraint to manage large viewpoint shifts between vehicle and roadside perspectives. The central claim, supported by experiments on 3D object detection tasks, is that the synthesized data can complement limited real roadside datasets, leading to improved model performance and better generalization to unseen roadside viewpoints.

Significance. If validated, this work has high significance for Intelligent Transportation Systems by alleviating the data scarcity issue for roadside perception. It leverages more abundant vehicle-side data to create scalable synthetic roadside data, which is a practical solution. The paper credits the empirical demonstration on real detection benchmarks as a strength, showing measurable improvements rather than relying only on qualitative synthesis quality.

major comments (2)
  1. §3.2: The occupancy-based visibility constraint is described as key to handling viewpoint changes, but without an ablation study removing this constraint and comparing to a baseline without it, it is unclear if this component is load-bearing for the reported performance gains in the detection experiments.
  2. Table 2: The reported mAP improvements when adding synthetic data are given, but the paper does not include a control experiment training on vehicle-side data directly projected to roadside view without completion or visibility, which would test if the proposed mitigations are necessary for the domain gap closure.
minor comments (2)
  1. §2: The related work section could benefit from more recent citations on LiDAR novel view synthesis methods post-2023 to better contextualize the contribution.
  2. Figure 3: The visualization of synthesized point clouds would be clearer with side-by-side comparison to real roadside scans from the same scene.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment of our work and the constructive major comments. These points help clarify the necessity of our proposed components. We provide point-by-point responses below and will incorporate the suggested experiments into the revised manuscript.

read point-by-point responses
  1. Referee: §3.2: The occupancy-based visibility constraint is described as key to handling viewpoint changes, but without an ablation study removing this constraint and comparing to a baseline without it, it is unclear if this component is load-bearing for the reported performance gains in the detection experiments.

    Authors: We agree that an explicit ablation isolating the occupancy-based visibility constraint would provide stronger evidence of its contribution. In the revised manuscript, we will add an ablation study that disables this constraint while keeping all other components fixed, and report the resulting changes in synthesized data quality as well as downstream 3D object detection mAP on the roadside benchmarks. This will directly demonstrate whether the constraint is load-bearing for the observed performance gains under large viewpoint shifts. revision: yes

  2. Referee: Table 2: The reported mAP improvements when adding synthetic data are given, but the paper does not include a control experiment training on vehicle-side data directly projected to roadside view without completion or visibility, which would test if the proposed mitigations are necessary for the domain gap closure.

    Authors: We thank the referee for this suggestion. To isolate the effect of our mitigations, the revised manuscript will include a control experiment that trains detectors on vehicle-side data projected to the roadside viewpoint without point-cloud completion or the visibility constraint. We will compare its performance against the full VRS pipeline on the same detection benchmarks, thereby quantifying how much the proposed components contribute to closing the domain gap. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces an independent synthesis pipeline (VRS) that applies vehicle point-cloud completion followed by occupancy-based visibility constraints to render labeled roadside LiDAR from existing vehicle-side data. The derivation consists of standard geometric operations (completion to fill occlusions, visibility masking for viewpoint shift) whose outputs are then fed to an external downstream task—roadside 3D object detection—whose performance improvement serves as the realism metric. No equation or step is shown to be equivalent to its own inputs by construction, no parameter is fitted on a subset and then relabeled a prediction, and no load-bearing premise rests on a self-citation chain. The abstract and method sketch remain self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no explicit free parameters, background axioms, or newly postulated entities; the approach builds on standard novel view synthesis and point cloud processing techniques.

pith-pipeline@v0.9.0 · 5490 in / 1156 out tokens · 30400 ms · 2026-05-08T09:20:07.765428+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

37 extracted references · 1 canonical work pages

  1. [1]

    Vision meets robotics: The kitti dataset,

    A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The kitti dataset,”The international journal of robotics research, vol. 32, no. 11, pp. 1231–1237, 2013

  2. [2]

    Scalability in perception for autonomous driving: Waymo open dataset,

    P. Sun, H. Kretzschmar, X. Dotiwalla, A. Chouard, V . Patnaik, P. Tsui, J. Guo, Y . Zhou, Y . Chai, B. Caineet al., “Scalability in perception for autonomous driving: Waymo open dataset,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 2446–2454

  3. [3]

    nuscenes: A multimodal dataset for autonomous driving,

    H. Caesar, V . Bankiti, A. H. Lang, S. V ora, V . E. Liong, Q. Xu, A. Krishnan, Y . Pan, G. Baldan, and O. Beijbom, “nuscenes: A multimodal dataset for autonomous driving,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 11 621–11 631

  4. [4]

    Lidarsim: Realistic lidar simulation by leveraging the real world,

    S. Manivasagam, S. Wang, K. Wong, W. Zeng, M. Sazanovich, S. Tan, B. Yang, W.-C. Ma, and R. Urtasun, “Lidarsim: Realistic lidar simulation by leveraging the real world,” inProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition, 2020, pp. 11 167– 11 176

  5. [5]

    Pcgen: Point cloud generator for lidar simulation,

    C. Li, Y . Ren, and B. Liu, “Pcgen: Point cloud generator for lidar simulation,”arXiv preprint arXiv:2210.08738, 2022

  6. [6]

    Lidar-nerf: Novel lidar view synthesis via neural radiance fields,

    T. Tao, L. Gao, G. Wang, Y . Lao, P. Chen, H. Zhao, D. Hao, X. Liang, M. Salzmann, and K. Yu, “Lidar-nerf: Novel lidar view synthesis via neural radiance fields,” inProceedings of the 32nd ACM International Conference on Multimedia, 2024, pp. 390–398

  7. [7]

    Neural lidar fields for novel view syn- thesis,

    S. Huang, Z. Gojcic, Z. Wang, F. Williams, Y . Kasten, S. Fidler, K. Schindler, and O. Litany, “Neural lidar fields for novel view syn- thesis,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 18 236–18 246

  8. [8]

    Lidar4d: Dynamic neural fields for novel space-time view lidar synthesis,

    Z. Zheng, F. Lu, W. Xue, G. Chen, and C. Jiang, “Lidar4d: Dynamic neural fields for novel space-time view lidar synthesis,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 5145–5154

  9. [9]

    Dynamic lidar re-simulation using compositional neural fields,

    H. Wu, X. Zuo, S. Leutenegger, O. Litany, K. Schindler, and S. Huang, “Dynamic lidar re-simulation using compositional neural fields,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 19 988–19 998

  10. [10]

    V2x-seq: A large-scale sequential dataset for vehicle-infrastructure cooperative perception and forecasting,

    H. Yu, W. Yang, H. Ruan, Z. Yang, Y . Tang, X. Gao, X. Hao, Y . Shi, Y . Pan, N. Sunet al., “V2x-seq: A large-scale sequential dataset for vehicle-infrastructure cooperative perception and forecasting,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5486–5495

  11. [11]

    V2x-vit: Vehicle-to-everything cooperative perception with vision transformer,

    R. Xu, H. Xiang, Z. Tu, X. Xia, M.-H. Yang, and J. Ma, “V2x-vit: Vehicle-to-everything cooperative perception with vision transformer,” inEuropean conference on computer vision. Springer, 2022, pp. 107– 124

  12. [12]

    V2x- sim: Multi-agent collaborative perception dataset and benchmark for autonomous driving,

    Y . Li, D. Ma, Z. An, Z. Wang, Y . Zhong, S. Chen, and C. Feng, “V2x- sim: Multi-agent collaborative perception dataset and benchmark for autonomous driving,”IEEE Robotics and Automation Letters, vol. 7, no. 4, pp. 10 914–10 921, 2022

  13. [13]

    Dolphins: Dataset for collaborative perception enabled harmonious and interconnected self- driving,

    R. Mao, J. Guo, Y . Jia, Y . Sun, S. Zhou, and Z. Niu, “Dolphins: Dataset for collaborative perception enabled harmonious and interconnected self- driving,” inProceedings of the Asian Conference on Computer Vision, 2022, pp. 4361–4377

  14. [14]

    Deepaccident: A motion and accident prediction benchmark for v2x autonomous driving,

    T. Wang, S. Kim, J. Wenxuan, E. Xie, C. Ge, J. Chen, Z. Li, and P. Luo, “Deepaccident: A motion and accident prediction benchmark for v2x autonomous driving,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 6, 2024, pp. 5599–5606

  15. [15]

    Carla: An open urban driving simulator,

    A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V . Koltun, “Carla: An open urban driving simulator,” inConference on robot learning. PMLR, 2017, pp. 1–16

  16. [16]

    Dair-v2x: A large-scale dataset for vehicle- infrastructure cooperative 3d object detection,

    H. Yu, Y . Luo, M. Shu, Y . Huo, Z. Yang, Y . Shi, Z. Guo, H. Li, X. Hu, J. Yuanet al., “Dair-v2x: A large-scale dataset for vehicle- infrastructure cooperative 3d object detection,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 21 361–21 370

  17. [17]

    Rcooper: A real-world large-scale dataset for roadside cooperative perception,

    R. Hao, S. Fan, Y . Dai, Z. Zhang, C. Li, Y . Wang, H. Yu, W. Yang, J. Yuan, and Z. Nie, “Rcooper: A real-world large-scale dataset for roadside cooperative perception,” inProceedings of the IEEE/CVF con- ference on computer vision and pattern recognition, 2024, pp. 22 347– 22 357

  18. [18]

    Pcn: Point completion network,

    W. Yuan, T. Khot, D. Held, C. Mertz, and M. Hebert, “Pcn: Point completion network,” in2018 international conference on 3D vision (3DV). IEEE, 2018, pp. 728–737

  19. [19]

    Grnet: Grid- ding residual network for dense point cloud completion,

    H. Xie, H. Yao, S. Zhou, J. Mao, S. Zhang, and W. Sun, “Grnet: Grid- ding residual network for dense point cloud completion,” inEuropean conference on computer vision. Springer, 2020, pp. 365–381

  20. [20]

    V oxel-based network for shape completion by leveraging edge generation,

    X. Wang, M. H. Ang, and G. H. Lee, “V oxel-based network for shape completion by leveraging edge generation,” inProceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 13 189–13 198

  21. [21]

    Rfnet: Recurrent forward network for dense point cloud completion,

    T. Huang, H. Zou, J. Cui, X. Yang, M. Wang, X. Zhao, J. Zhang, Y . Yuan, Y . Xu, and Y . Liu, “Rfnet: Recurrent forward network for dense point cloud completion,” inProceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 12 508–12 517

  22. [22]

    Morphing and sampling network for dense point cloud completion,

    M. Liu, L. Sheng, S. Yang, J. Shao, and S.-M. Hu, “Morphing and sampling network for dense point cloud completion,” inProceedings of the AAAI conference on artificial intelligence, vol. 34, no. 07, 2020, pp. 11 596–11 603

  23. [23]

    Variational relational point completion network,

    L. Pan, X. Chen, Z. Cai, J. Zhang, H. Zhao, S. Yi, and Z. Liu, “Variational relational point completion network,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 8524–8533

  24. [24]

    Cascaded refinement network for point cloud completion,

    X. Wang, M. H. Ang Jr, and G. H. Lee, “Cascaded refinement network for point cloud completion,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 790–799

  25. [25]

    Topnet: Structural point cloud decoder,

    L. P. Tchapmi, V . Kosaraju, H. Rezatofighi, I. Reid, and S. Savarese, “Topnet: Structural point cloud decoder,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 383–392

  26. [26]

    Proxyformer: Proxy alignment assisted point cloud completion with missing part sensitive transformer,

    S. Li, P. Gao, X. Tan, and M. Wei, “Proxyformer: Proxy alignment assisted point cloud completion with missing part sensitive transformer,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 9466–9475

  27. [27]

    Vapcnet: viewpoint-aware 3d point cloud completion,

    Z. Fu, L. Wang, L. Xu, Z. Wang, H. Laga, Y . Guo, F. Boussaid, and M. Bennamoun, “Vapcnet: viewpoint-aware 3d point cloud completion,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 12 108–12 118

  28. [28]

    Seedformer: Patch seeds based point cloud completion with upsample transformer,

    H. Zhou, Y . Cao, W. Chu, J. Zhu, T. Lu, Y . Tai, and C. Wang, “Seedformer: Patch seeds based point cloud completion with upsample transformer,” inEuropean conference on computer vision. Springer, 2022, pp. 416–432

  29. [29]

    Attention is all you need,

    A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,”Advances in neural information processing systems, vol. 30, 2017

  30. [30]

    Pf-net: Point fractal network for 3d point cloud completion,

    Z. Huang, Y . Yu, J. Xu, F. Ni, and X. Le, “Pf-net: Point fractal network for 3d point cloud completion,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 7662– 7670

  31. [31]

    Pointr: Diverse point cloud completion with geometry-aware transformers,

    X. Yu, Y . Rao, Z. Wang, Z. Liu, J. Lu, and J. Zhou, “Pointr: Diverse point cloud completion with geometry-aware transformers,” inProceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 12 498–12 507

  32. [32]

    Symmcompletion: High- fidelity and high-consistency point cloud completion with symmetry guidance,

    H. Yan, Z. Li, K. Luo, L. Lu, and P. Tan, “Symmcompletion: High- fidelity and high-consistency point cloud completion with symmetry guidance,” inProceedings of the AAAI Conference on Artificial Intelli- gence, vol. 39, no. 9, 2025, pp. 9094–9102

  33. [33]

    Large-scale lidar consistent mapping using hierarchical lidar bundle adjustment,

    X. Liu, Z. Liu, F. Kong, and F. Zhang, “Large-scale lidar consistent mapping using hierarchical lidar bundle adjustment,”IEEE Robotics and Automation Letters, vol. 8, no. 3, pp. 1523–1530, 2023

  34. [34]

    Pointpillars: Fast encoders for object detection from point clouds,

    A. H. Lang, S. V ora, H. Caesar, L. Zhou, J. Yang, and O. Beijbom, “Pointpillars: Fast encoders for object detection from point clouds,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 12 697–12 705

  35. [35]

    Pv-rcnn: Point-voxel feature set abstraction for 3d object detection,

    S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang, and H. Li, “Pv-rcnn: Point-voxel feature set abstraction for 3d object detection,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 10 529–10 538. Yuhan Xiais pursuing the B.S. degree at the School of Automation and Intelligent Sensing, Shanghai Jiao Tong Uni...

  36. [36]

    He has worked as a postdoctoral researcher at Shanghai Jiao Tong University from 2020 to

  37. [37]

    His research interest is in autonomous driving and co- operative driving systems

    He is currently an assistant research profes- sor at Shanghai Jiao Tong University implementing research works related to intelligent vehicles. His research interest is in autonomous driving and co- operative driving systems. Chunxiang Wangreceived a Ph.D. degree in me- chanical engineering from the Harbin Institute of Technology, China, in 1999. She is c...