A Joint Optimization Approach of LiDAR-Camera Fusion for Accurate Dense 3D Reconstructions
Pith reviewed 2026-05-25 11:37 UTC · model grok-4.3
The pith
Jointly solving bundle adjustment and cloud registration fuses LiDAR and camera data into dense 3D models at 2.7 mm accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The method jointly solves a bundle adjustment problem and a cloud registration problem to compute camera poses and the sensor extrinsic calibration, which enables dense, accurate 3D models to be built from LiDAR and camera data. Against survey scanner ground truth, the models reach an averaged accuracy of 2.7 mm and a resolution of 70 points per square cm, and the calibration results outperform the state-of-the-art method.
What carries the argument
Joint bundle adjustment and cloud registration optimization that locates correlations between LiDAR geometry and camera texture to refine poses and calibration simultaneously.
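As a rough illustration only, a joint objective of this kind is often written as a weighted sum of a reprojection term and a registration term. Every symbol below is an assumption introduced for this sketch and is not taken from the paper, whose exact cost terms are not reproduced on this page: T_i are camera poses, X_j are 3D landmarks, u_ij are observed image features, \pi is the camera projection, T_CL is the LiDAR-to-camera extrinsic transform, (p_k, q_k) are matched LiDAR and camera-derived points, and \lambda is the balancing weight.

```latex
\min_{\{T_i\},\, \{X_j\},\, T_{CL}} \;
\sum_{i,j} \left\| \pi\!\left(T_i X_j\right) - u_{ij} \right\|^2
\;+\; \lambda \sum_{k} \left\| T_{CL}\, p_k - q_k \right\|^2
```

The first sum is the usual bundle adjustment reprojection error and the second is a point-to-point registration error. Minimising both together is what lets the poses and the calibration constrain each other.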
If this is right
- The resulting models reach an averaged accuracy of 2.7 mm when compared to survey scanner ground truth.
- Reconstructions achieve a resolution of 70 points per square cm.
- The computed extrinsic calibration outperforms the state-of-the-art method.
- Dense 3D models can be built offline without separate calibration or post-processing stages.
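The accuracy and resolution figures above can be made concrete with a small sketch. The code below is not the paper's evaluation procedure; it assumes accuracy is measured as the mean nearest-neighbour distance from the reconstructed cloud to the ground-truth cloud and resolution as points per unit surface area. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def mean_nearest_distance(model, reference):
    """Mean distance from each model point to its nearest reference point."""
    # Brute-force pairwise distances; suitable for small clouds only.
    diffs = model[:, None, :] - reference[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.min(axis=1).mean()

def surface_density(num_points, area_cm2):
    """Number of points per square centimetre of surface."""
    return num_points / area_cm2

# Synthetic example (not data from the paper): a 5 x 5 planar grid with
# 10 mm spacing, and a "reconstruction" offset by 2 mm along z.
xs, ys = np.meshgrid(np.arange(0.0, 50.0, 10.0), np.arange(0.0, 50.0, 10.0))
reference = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)])
model = reference + np.array([0.0, 0.0, 2.0])

error_mm = mean_nearest_distance(model, reference)
density = surface_density(num_points=400, area_cm2=4.0)
print(error_mm, density)
```

For real clouds with millions of points, a spatial index such as a k-d tree would replace the brute-force distance matrix.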
Where Pith is reading between the lines
- The joint formulation could reduce reliance on separate calibration hardware or procedures in multi-sensor platforms.
- If the optimization remains stable across varied environments, the method might support real-time extensions for robotic mapping.
- Similar joint optimization could be tested on other sensor pairs where one provides sparse geometry and the other dense appearance.
Load-bearing premise
Reliable correlations between sparse geometric LiDAR data and dense textural camera data can be found and exploited inside the joint optimization without divergence.
What would settle it
The claim would be undermined if applying the method to fresh datasets, and measuring reconstruction error against independent survey scanner ground truth, yielded average deviations well above 2.7 mm.
Original abstract
Fusing data from LiDAR and camera is conceptually attractive because of their complementary properties. For instance, camera images are higher resolution and have colors, while LiDAR data provide more accurate range measurements and have a wider Field Of View (FOV). However, the sensor fusion problem remains challenging since it is difficult to find reliable correlations between data of very different characteristics (geometry vs. texture, sparse vs. dense). This paper proposes an offline LiDAR-camera fusion method to build dense, accurate 3D models. Specifically, our method jointly solves a bundle adjustment (BA) problem and a cloud registration problem to compute camera poses and the sensor extrinsic calibration. In experiments, we show that our method can achieve an averaged accuracy of 2.7 mm and resolution of 70 points per square cm by comparing to the ground truth data from a survey scanner. Furthermore, the extrinsic calibration result is discussed and shown to outperform the state-of-the-art method.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces an offline fusion method for LiDAR and camera data to construct dense and accurate 3D models. The approach jointly solves a bundle adjustment (BA) problem and a cloud registration problem to compute the camera poses and the extrinsic calibration between the LiDAR and camera. Experimental results show an averaged accuracy of 2.7 mm and a resolution of 70 points per square cm when validated against ground truth from a survey scanner, and the extrinsic calibration is reported to outperform the state-of-the-art method.
Significance. If the reported accuracy and resolution hold under the joint optimization, the work would advance multi-sensor 3D reconstruction in robotics by providing a principled way to exploit complementary LiDAR geometry and camera texture without separate calibration pipelines. The independent survey-scanner validation and explicit formulation of cost terms linking projected points to image features add credibility to the quantitative claims.
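To illustrate what a cost term linking projected points to image features can look like, here is a minimal sketch that assumes an ideal pinhole camera with no lens distortion. The function names, the intrinsics K, and the extrinsics (R, t) are hypothetical values chosen for the example and do not come from the paper.

```python
import numpy as np

def project_points(points_lidar, R, t, K):
    """Project N x 3 LiDAR-frame points to pixel coordinates (pinhole model)."""
    points_cam = points_lidar @ R.T + t      # LiDAR frame -> camera frame
    uvw = points_cam @ K.T                   # apply camera intrinsics
    return uvw[:, :2] / uvw[:, 2:3]          # perspective division

def reprojection_residuals(points_lidar, pixels, R, t, K):
    """Pixel-space difference between projected points and matched features."""
    return project_points(points_lidar, R, t, K) - pixels

# Hypothetical intrinsics and extrinsics, for illustration only.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.1, 0.0, 0.0])

points = np.array([[0.0, 0.0, 2.0],
                   [0.4, -0.2, 4.0]])
pixels = project_points(points, R, t, K)
residuals = reprojection_residuals(points, pixels, R, t, K)
print(pixels)
print(residuals)
```

In an optimizer, the residuals would be driven towards zero by adjusting R and t (and, in bundle adjustment, the camera poses and landmarks).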
minor comments (2)
- [Method] The balancing weights between the BA and registration terms are listed as free parameters; a brief sensitivity study or default values in the experiments section would clarify robustness.
- [Abstract] The abstract states the 2.7 mm / 70 pts/cm² figures but omits the number of scenes or total points compared; adding this would help readers assess the scope of the validation.
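The sensitivity study suggested in the first comment can be pictured with a toy problem. The sketch below is not the paper's optimization: it assumes a single scalar unknown constrained by two groups of measurements, where the first group stands in for the BA term and the second for the registration term, and sweeps a hypothetical balancing weight to show how the estimate moves.

```python
import numpy as np

def joint_estimate(a, b, weight):
    """Closed-form minimiser of sum((x - a)^2) + weight * sum((x - b)^2)."""
    return (a.sum() + weight * b.sum()) / (a.size + weight * b.size)

# Toy measurements (invented for illustration).
a = np.array([1.0, 1.2, 0.8])   # stand-in for BA-type observations
b = np.array([2.0, 2.0])        # stand-in for registration-type observations

weights = [0.0, 1.0, 100.0]
estimates = [joint_estimate(a, b, w) for w in weights]
for w, x in zip(weights, estimates):
    print(f"weight={w:g} -> estimate={x:.3f}")
```

With a weight of zero the estimate is the mean of the first group, and as the weight grows it approaches the mean of the second group. A real study would report reconstruction error across a comparable sweep.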
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work on joint LiDAR-camera optimization for dense 3D reconstruction and for recommending minor revision. No specific major comments were raised in the report.
Circularity Check
No significant circularity
The paper presents an empirical method for joint bundle adjustment and point-cloud registration whose accuracy (2.7 mm, 70 pts/cm²) is measured against an independent external survey-scanner ground truth. No equation, cost term, or claimed prediction is shown to be definitionally equivalent to its own fitted inputs or to a self-citation chain; the optimization formulation and convergence are described explicitly and the quantitative claims rest on external validation rather than internal re-labeling of the same data.
Axiom & Free-Parameter Ledger
free parameters (1)
- balancing weights between BA and registration terms
axioms (2)
- domain assumption: Reliable feature correspondences exist between camera images and LiDAR point clouds that can be used inside a single optimization.
- domain assumption: The LiDAR-camera extrinsic transform is constant and can be recovered by registration.
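To show how a constant rigid transform can in principle be recovered by registration, the sketch below uses the Kabsch (SVD-based) closed-form solution for point sets with known correspondences. This is a swapped-in textbook technique, not the paper's registration method, and the test transform and points are invented for the example.

```python
import numpy as np

def kabsch(source, target):
    """Least-squares rigid transform (R, t) such that target ~ source @ R.T + t."""
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t

# Hypothetical ground-truth extrinsic: 30 degrees about z plus a small offset.
theta = np.pi / 6
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.05, -0.02, 0.10])

source = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
target = source @ R_true.T + t_true

R_est, t_est = kabsch(source, target)
print(np.round(R_est, 3), np.round(t_est, 3))
```

With noisy or unknown correspondences, an iterative scheme such as ICP would typically wrap a step like this one.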