arxiv: 2606.20103 · v1 · pith:K66CXYKXnew · submitted 2026-06-18 · 💻 cs.CV

Geometry-Preserving in 3D Gaussian Splatting for LiDAR-Camera Extrinsic Calibration

Pith reviewed 2026-06-26 17:49 UTC · model grok-4.3

classification 💻 cs.CV
keywords LiDAR-camera calibration · 3D Gaussian Splatting · extrinsic calibration · targetless calibration · depth supervision · geometry preservation · multi-modal perception

The pith

Locking Gaussian positions to aggregated LiDAR depths while blocking rendering updates from moving them yields more accurate targetless LiDAR-camera calibration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to establish that 3D Gaussian Splatting, when used as a bridge between LiDAR and camera data, drifts away from true metric structure if left free to optimize for image quality alone. By collecting depth from multiple LiDAR views to supervise the Gaussians densely and by preventing the photometric loss from changing their spatial parameters, the proxy geometry stays anchored to the LiDAR measurements. This preserved geometry then supplies reliable dense correspondences that let the extrinsic transform between the two sensors be optimized more precisely. A reader would care because targetless calibration is required for practical multi-sensor vehicles, yet current differentiable approaches sacrifice geometric fidelity for visual fidelity and thereby lose calibration accuracy.

Core claim

The paper claims that a 3D Gaussian Splatting proxy can serve as an effective bridge for LiDAR-camera extrinsic optimization only when its metric geometry is explicitly preserved: multi-view LiDAR observations are aggregated to provide dense depth supervision, and photometric gradients are blocked from updating the Gaussian spatial parameters, so that the proxy remains aligned with the true LiDAR structure while still supporting photometric error minimization for the extrinsic parameters.
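
One way to picture the aggregation step: LiDAR points from several scans are moved into a single camera frame and projected, so pixels missed by one sweep can be filled by another. The sketch below is a minimal pure-Python illustration under stated assumptions, not the paper's implementation. The pinhole model, the nearest-depth rule for overlapping points, and the names `transform`, `aggregate_depth`, `scan_to_cam`, `fx`, `fy`, `cx`, `cy` are all hypothetical, and the scan-to-camera poses are taken as given.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
# Hypothetical rigid pose: (3x3 rotation as nested lists, translation vector).
Pose = Tuple[Sequence[Sequence[float]], Sequence[float]]


def transform(pose: Pose, point: Point) -> Point:
    """Apply x' = R x + t to one 3D point."""
    rotation, translation = pose
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def aggregate_depth(
    scans: List[List[Point]],
    scan_to_cam: List[Pose],
    fx: float, fy: float, cx: float, cy: float,
    width: int, height: int,
) -> Dict[Tuple[int, int], float]:
    """Project several scans into one view, keeping the nearest depth per pixel.

    Returns a sparse map {(row, col): depth}. The nearest-depth rule is an
    assumed occlusion heuristic, not something the source specifies.
    """
    depth: Dict[Tuple[int, int], float] = {}
    for points, pose in zip(scans, scan_to_cam):
        for point in points:
            x, y, z = transform(pose, point)
            if z <= 0.0:  # behind the camera
                continue
            col = int(round(fx * x / z + cx))
            row = int(round(fy * y / z + cy))
            if 0 <= col < width and 0 <= row < height:
                key = (row, col)
                if key not in depth or z < depth[key]:
                    depth[key] = z
    return depth


# Toy usage: the second scan fills one new pixel and occludes one old pixel.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
scan_a = [(0.0, 0.0, 5.0)]
scan_b = [(0.0, 0.0, 4.0), (-0.5, 0.0, 2.0)]
poses = [(IDENTITY, [0.0, 0.0, 0.0]), (IDENTITY, [0.5, 0.0, 1.0])]
single = aggregate_depth([scan_a], poses[:1], 10.0, 10.0, 2.0, 2.0, 5, 5)
merged = aggregate_depth([scan_a, scan_b], poses, 10.0, 10.0, 2.0, 2.0, 5, 5)
```

In this toy setup the merged map covers more pixels than the single-scan map, which is the sense in which aggregation makes the depth supervision denser.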

What carries the argument

The geometry-preserving framework that aggregates multi-view LiDAR observations for dense depth supervision and blocks photometric gradients from updating the Gaussian spatial parameters.
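
The gradient-blocking half of this framework can be sketched as an update rule that routes each loss to a different parameter group. The following is a hedged illustration only: the group names (`means`, `covariances`, `colors`, `extrinsic`), the plain gradient-descent step, and the function `geometry_preserving_step` are assumptions made for the example. In an autodiff framework the same effect is usually obtained with a stop-gradient operation such as `jax.lax.stop_gradient` or PyTorch's `Tensor.detach()`; whether the authors use either is not stated in the text available here.

```python
from typing import Dict, List

Params = Dict[str, List[float]]

# Assumed grouping: the referee summary names means and covariances as spatial.
SPATIAL_KEYS = ("means", "covariances")


def geometry_preserving_step(
    params: Params, photo_grads: Params, depth_grads: Params, lr: float
) -> Params:
    """One descent step with photometric gradients blocked on spatial keys."""
    updated: Params = {}
    for name, values in params.items():
        zeros = [0.0] * len(values)
        g_depth = depth_grads.get(name, zeros)
        # Stop-gradient: the photometric term never moves spatial parameters.
        g_photo = zeros if name in SPATIAL_KEYS else photo_grads.get(name, zeros)
        updated[name] = [
            v - lr * (gd + gp) for v, gd, gp in zip(values, g_depth, g_photo)
        ]
    return updated


# Toy usage with made-up gradient values.
params = {"means": [1.0, 2.0], "covariances": [0.1], "colors": [0.5], "extrinsic": [0.0]}
photo = {"means": [10.0, -10.0], "covariances": [3.0], "colors": [1.0], "extrinsic": [2.0]}
depth = {"means": [0.5, 0.0]}
new_params = geometry_preserving_step(params, photo, depth, lr=0.1)
```

Here the large photometric gradients on `means` and `covariances` are discarded, the depth gradient still adjusts `means`, and the appearance and extrinsic parameters follow the photometric term.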

If this is right

  • The Gaussian proxy stays aligned with the true LiDAR point cloud rather than drifting toward photometrically convenient but metrically incorrect locations.
  • Extrinsic optimization uses denser and more geometrically consistent correspondences than methods that allow full gradient flow through the proxy.
  • Calibration accuracy improves consistently over existing targetless approaches when tested on public driving datasets.
  • Rendering fidelity remains sufficient to supply the photometric signal needed for the extrinsic search.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same locking strategy could be applied to other differentiable scene representations that currently trade geometric accuracy for visual quality in multi-sensor settings.
  • The approach suggests a general pattern for any calibration task that must reconcile a dense geometric sensor with a photometric one: anchor the shared representation to the geometric sensor first.
  • If the method works on driving data, it may also reduce the need for hand-crafted targets in indoor or aerial multi-modal rigs where LiDAR and cameras must be aligned without special patterns.

Load-bearing premise

Fixing the Gaussian positions with LiDAR depths while still optimizing other parameters will leave enough rendering quality for the photometric loss to drive reliable extrinsic convergence.

What would settle it

The claim would be refuted if, on a public driving dataset, the method produced extrinsic errors equal to or larger than those of prior targetless baselines.

Figures

Figures reproduced from arXiv: 2606.20103 by Daeho Kim, Hyoseok Hwang, Jeong Woon Lee, Kyoleen Kwak.

Figure 1: Illustration of Geometric Decay: (a) Rendered depth map of 3DGS optimized using Base pipeline with photometric alignment loss. (b) Rendered depth map of 3DGS optimized with only depth loss. (c) Evolution of the depth loss on KITTI-360 Seq. 1. Shaded regions indicate min–max. Reprojection is used when photometric alignment is disabled, and both are enabled after 9000 iterations.
Figure 2: Overview of the proposed geometry-preserving calibration integration.
    … loss, defined as an L1 discrepancy between rendered depth and LiDAR depth in the inverse-depth domain. The red curve (with photometric alignment) shows a higher mean loss and larger variability, particularly during the shaded interval when photometric alignment is enabled, compared to the run without photometric alignment. These results…
Figure 3: Calibration error under varying noise levels: (a) rotation and (b) translation.
    Evaluation Metrics. We quantitatively evaluated calibration accuracy using the translation error $E_t$ and the geodesic rotation error $E_r$: $E_t = \lVert \hat{\mathbf{t}} - \mathbf{t}^* \rVert_2$, $E_r = \arccos(\ldots)$, where the argument begins with $\mathrm{Tr}\big((\mathbf{R}^*)^\top \hat{\mathbf{R}}\big)$ …
Figure 4: Qualitative comparison of rendered RGB images on KITTI-360: (a) GT, (b) Base and (c) GeoP-Calib (Ours). PSNR is computed over the entire image to assess photometric consistency.
    … on 7 out of 10 sequences across KITTI-360 and KITTI. In contrast, GST did not perform reliably on these driving scenarios, resulting in large errors and high variance. CLAIM, which leverages foundation models, attained relatively …
Figure 5: Rendered depth error maps on KITTI-360, measured as the Mean Absolute Error (MAE) between rendered depth and LiDAR depth in the LiDAR-translation and camera-rotation setting: (a) Base and (b) GeoP-Calib (Ours). Colors indicate error magnitude, where blue denotes lower error and red denotes higher error.
    5.5 Extended Analysis on Calibration Properties. Robustness to Pose Initialization. We evaluated robustne…
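
The evaluation metrics quoted next to Figure 3 can be sketched in a few lines. The translation error is the Euclidean norm given there. The rotation formula is cut off in the excerpt after the trace term, so the code below assumes the standard geodesic form, arccos((Tr((R*)ᵀ R̂) − 1) / 2); that completion is an assumption, not a quotation. The function names and the clamping of the cosine against rounding error are likewise illustrative choices.

```python
import math
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def translation_error(t_hat: Sequence[float], t_star: Sequence[float]) -> float:
    """E_t: Euclidean distance between estimated and reference translations."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_hat, t_star)))


def rotation_error(r_hat: Matrix, r_star: Matrix) -> float:
    """E_r in radians, assuming the standard geodesic distance on SO(3)."""
    # Tr(R*^T R_hat) equals the elementwise product summed over all entries.
    trace = sum(r_star[i][j] * r_hat[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard against rounding
    return math.acos(cos_angle)


# Toy usage: a 0.1 rad rotation about z and a 3-4-5 translation offset.
angle = 0.1
r_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
r_z = [
    [math.cos(angle), -math.sin(angle), 0.0],
    [math.sin(angle), math.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
]
e_t = translation_error([3.0, 4.0, 0.0], [0.0, 0.0, 0.0])
e_r = rotation_error(r_z, r_identity)
```
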
Original abstract

Accurate LiDAR-camera calibration is essential for robust multi-modal perception. Targetless approaches avoid manual setup but remain limited by the scarcity of discriminative cross-modal features. Recent methods address this by reconstructing the scene within a differentiable model, enabling extrinsic optimization through dense photometric supervision. Among these, 3D Gaussian Splatting (3DGS) has been widely adopted as a geometric proxy that bridges LiDAR and camera within a single differentiable framework. However, since 3DGS was originally designed for novel view synthesis, existing methods tend to prioritize rendering quality, causing the proxy geometry to drift from the true LiDAR structure. We propose a framework that preserves the metric geometry of the Gaussian proxy by aggregating multi-view LiDAR observations for dense depth supervision and blocking photometric gradients from updating the Gaussian spatial parameters. We validate our method on public driving datasets, where it consistently outperforms existing targetless methods in calibration accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a geometry-preserving variant of 3D Gaussian Splatting for targetless LiDAR-camera extrinsic calibration. It aggregates multi-view LiDAR observations to provide dense depth supervision and blocks photometric gradients from updating the Gaussian spatial parameters (means and covariances) so that the proxy geometry remains anchored to the LiDAR metric structure rather than drifting to optimize rendering quality. The method is evaluated on public driving datasets and reported to outperform prior targetless approaches in calibration accuracy.

Significance. If the quantitative results hold, the approach could improve the reliability of differentiable-rendering-based calibration pipelines by ensuring the geometric proxy stays faithful to the LiDAR measurements, which is valuable for multi-modal perception in robotics and autonomous driving.

major comments (2)
  1. [§3 (Method)] The skeptic's concern is valid and load-bearing: blocking photometric gradients on spatial parameters while retaining only per-Gaussian appearance attributes may yield vanishing or noisy gradients with respect to the 6-DoF extrinsic when LiDAR-derived geometry does not perfectly match image appearance. The manuscript must demonstrate (via gradient norms, an ablation on frozen parameters, or convergence plots) that the photometric term remains informative for extrinsic optimization.
  2. [§4 (Experiments)] The abstract claims consistent outperformance, but the provided text supplies no numerical error metrics, error bars, dataset splits, or ablation tables; without these, the central claim that the geometry-preserving design improves accuracy cannot be assessed.
minor comments (2)
  1. [§3] Notation for the extrinsic parameters and the exact set of frozen Gaussian attributes should be introduced once in §3 and used consistently.
  2. [§3.3] The weighting between the LiDAR depth loss and the photometric loss is not stated; this hyper-parameter choice should be reported with sensitivity analysis.
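
To make the second minor comment concrete, the sketch below shows what reporting the weighting could look like. It uses the depth term described next to Figure 2 (an L1 discrepancy between rendered and LiDAR depth in the inverse-depth domain) and a single hypothetical weight `lambda_depth`. The weight, its candidate values, the averaging over valid pixels, and the function names are assumptions made for illustration; the source does not state them.

```python
from typing import Dict, Sequence


def inverse_depth_l1(
    rendered: Sequence[float], lidar: Sequence[float], eps: float = 1e-6
) -> float:
    """Mean |1/d_rendered - 1/d_lidar| over pixels with a positive LiDAR depth."""
    terms = [
        abs(1.0 / max(r, eps) - 1.0 / l)
        for r, l in zip(rendered, lidar)
        if l > 0.0  # pixels without a LiDAR return are skipped
    ]
    return sum(terms) / len(terms) if terms else 0.0


def total_loss(photo_loss: float, depth_loss: float, lambda_depth: float) -> float:
    """Weighted sum; lambda_depth is the hyper-parameter the referee asks about."""
    return photo_loss + lambda_depth * depth_loss


# Toy usage: the third pixel has no LiDAR return and is ignored.
rendered_depth = [2.0, 4.0, 10.0]
lidar_depth = [2.0, 5.0, 0.0]
depth_term = inverse_depth_l1(rendered_depth, lidar_depth)

# A sensitivity sweep over hypothetical weights, with a made-up photometric loss.
sweep: Dict[float, float] = {
    lam: total_loss(0.3, depth_term, lam) for lam in (0.1, 1.0, 10.0)
}
```

A sensitivity analysis would then rerun calibration for each weight in such a sweep and report the resulting rotation and translation errors.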

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will incorporate revisions to strengthen the presentation and validation of our approach.

Point-by-point responses
  1. Referee: [§3 (Method)] The skeptic's concern is valid and load-bearing: blocking photometric gradients on spatial parameters while retaining only per-Gaussian appearance attributes may yield vanishing or noisy gradients with respect to the 6-DoF extrinsic when LiDAR-derived geometry does not perfectly match image appearance. The manuscript must demonstrate (via gradient norms, an ablation on frozen parameters, or convergence plots) that the photometric term remains informative for extrinsic optimization.

    Authors: We agree this requires explicit verification. Although Gaussian spatial parameters (means and covariances) are frozen with respect to the photometric loss, the 6-DoF extrinsic parameters control the camera pose used during differentiable rendering. This directly affects the projected positions of the fixed Gaussians and the resulting rendered image, allowing photometric gradients to flow back to the extrinsics. To demonstrate that these gradients remain informative, we will add gradient norm analysis over iterations, convergence plots, and an ablation on the photometric term's contribution in the revised manuscript. revision: yes

  2. Referee: [§4 (Experiments)] The abstract claims consistent outperformance, but the provided text supplies no numerical error metrics, error bars, dataset splits, or ablation tables; without these, the central claim that the geometry-preserving design improves accuracy cannot be assessed.

    Authors: We will revise §4 to include comprehensive quantitative results. The updated section will feature tables reporting rotation and translation errors with means and standard deviations, explicit dataset splits, and ablation studies comparing the geometry-preserving design against baselines on the evaluated driving datasets. This will directly support the abstract's claims with the requested details. revision: yes
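
The argument in the first response, that fixed Gaussians still pass photometric gradients to the extrinsics because the extrinsics move their projections, can be checked on a toy problem. The sketch below is a one-dimensional stand-in, not the paper's renderer: the "extrinsic" is a single scalar `shift`, the Gaussians have fixed centers and colors, and the gradient is estimated by central differences. All names and values are hypothetical.

```python
import math
from typing import List, Sequence


def render(means: Sequence[float], colors: Sequence[float], shift: float,
           pixels: Sequence[float], sigma: float = 0.5) -> List[float]:
    """Render a 1D image from fixed Gaussians seen under a scalar extrinsic."""
    image = []
    for x in pixels:
        value = 0.0
        for mean, color in zip(means, colors):
            center = mean + shift  # the extrinsic moves the projection only
            value += color * math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))
        image.append(value)
    return image


def photometric_loss(shift: float, means: Sequence[float], colors: Sequence[float],
                     target: Sequence[float], pixels: Sequence[float]) -> float:
    """Sum of squared differences between the rendered and observed images."""
    image = render(means, colors, shift, pixels)
    return sum((a - b) ** 2 for a, b in zip(image, target))


def extrinsic_gradient(shift: float, means: Sequence[float], colors: Sequence[float],
                       target: Sequence[float], pixels: Sequence[float],
                       h: float = 1e-5) -> float:
    """Central-difference estimate of d(loss)/d(shift); the means are never touched."""
    up = photometric_loss(shift + h, means, colors, target, pixels)
    down = photometric_loss(shift - h, means, colors, target, pixels)
    return (up - down) / (2.0 * h)


MEANS = (1.0, 4.0)    # fixed geometry (a tuple, so it cannot be modified)
COLORS = (1.0, 0.5)
PIXELS = [-2.0 + 0.25 * k for k in range(37)]
TRUE_SHIFT = 0.2
TARGET = render(MEANS, COLORS, TRUE_SHIFT, PIXELS)

grad_at_start = extrinsic_gradient(0.0, MEANS, COLORS, TARGET, PIXELS)
grad_at_truth = extrinsic_gradient(TRUE_SHIFT, MEANS, COLORS, TARGET, PIXELS)
stepped_shift = 0.0 - 0.01 * grad_at_start  # one small descent step
```

In this toy case the gradient at the wrong shift is nonzero and points toward the true shift, while it vanishes at the true shift. A gradient-norm analysis of the kind the authors promise would track the analogous quantity for the 6-DoF extrinsic over training iterations.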

Circularity Check

0 steps flagged

No circularity detected in derivation chain

full rationale

The paper's abstract and method description outline a proposed framework that aggregates multi-view LiDAR observations for depth supervision while blocking photometric gradients on Gaussian spatial parameters. No equations, derivations, fitted parameters, or self-citations are visible that would reduce any claimed result to its inputs by construction. The central claim is a methodological design choice for geometry preservation rather than a mathematical derivation or prediction that loops back to fitted values. The approach is presented as self-contained without load-bearing reliance on prior self-authored uniqueness theorems or ansatzes.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no free parameters, axioms, or invented entities can be extracted or audited.


