ProBA: Probabilistic Bundle Adjustment with the Bhattacharyya Coefficient
Pith reviewed 2026-05-19 12:54 UTC · model grok-4.3
The pith
Probabilistic Bundle Adjustment enables joint optimization of camera poses and 3D geometry from random initialization using uncertain Gaussian landmarks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ProBA re-parameterizes the bundle adjustment manifold so that extrinsics, focal lengths, and scene geometry are optimized together from a strict cold start. Landmarks are represented as 3D Gaussians and the objective is a unified Negative Log-Likelihood that incorporates the Bhattacharyya coefficient to measure spatial consistency. A sparse view graph is optimized with an iterative adaptive edge-weighting scheme that prunes erroneous topological links, and a dual-hypothesis regularization resolves mirror ambiguities.
What carries the argument
The representation of landmarks as 3D Gaussians combined with a unified Negative Log-Likelihood objective that uses the Bhattacharyya coefficient to weight correspondences by statistical confidence.
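To make the central quantity concrete, here is a minimal sketch of the standard closed-form Bhattacharyya coefficient between two Gaussians, written with NumPy. It is not the paper's objective, which is not reproduced on this page; the function name `bhattacharyya_coefficient` and its argument layout are illustrative assumptions.

```python
import numpy as np

def bhattacharyya_coefficient(mu1, cov1, mu2, cov2):
    """Closed-form Bhattacharyya coefficient between two Gaussians.

    Illustrative helper (hypothetical name); returns a value in (0, 1],
    where 1 means the two densities coincide.
    """
    mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
    cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
    cov = 0.5 * (cov1 + cov2)  # averaged covariance
    diff = mu1 - mu2
    # Mean-separation term: (1/8) * diff^T cov^{-1} diff
    mean_term = 0.125 * diff @ np.linalg.solve(cov, diff)
    # Covariance-mismatch term: (1/2) * ln(det cov / sqrt(det cov1 * det cov2))
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    cov_term = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    # Coefficient is exp(-distance)
    return float(np.exp(-(mean_term + cov_term)))

# Two unit-covariance landmarks whose means are 2 units apart.
I3 = np.eye(3)
bc = bhattacharyya_coefficient([0, 0, 0], I3, [2, 0, 0], I3)
```

The coefficient shrinks smoothly as the means separate or the covariances disagree, which is the property that would let it act as a soft confidence weight on a correspondence.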
If this is right
- Joint optimization of extrinsics, focal lengths, and geometry becomes feasible without metric initialization.
- The basin of attraction for convergence is expanded, allowing recovery from poorer starting points.
- Erroneous links are pruned from the view graph while global consistency is maintained.
- Mirror ambiguities are resolved, enabling prior-free SfM to succeed on symmetric scenes.
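One simple way a mirror-type ambiguity can appear is sketched below: negating every camera-frame point leaves its pinhole projection unchanged, so a reprojection-only objective cannot tell the two reconstructions apart. The depth-sign test used to choose between them is a common cheirality check assumed here for illustration; it is not the paper's dual-hypothesis regularization, and the names `project` and `pick_hypothesis` are hypothetical.

```python
import numpy as np

def project(points_cam, focal):
    """Pinhole projection of camera-frame points (N, 3) to image coordinates (N, 2)."""
    return focal * points_cam[:, :2] / points_cam[:, 2:3]

def pick_hypothesis(points_cam):
    """Keep whichever of the two hypotheses puts more points in front of the camera.

    Assumed selection rule (cheirality), for illustration only.
    """
    mirrored = -points_cam
    n_front = int(np.sum(points_cam[:, 2] > 0))
    n_front_mirrored = int(np.sum(mirrored[:, 2] > 0))
    return points_cam if n_front >= n_front_mirrored else mirrored

pts = np.array([[0.1, 0.2, 2.0], [-0.3, 0.1, 3.0]])
same_pixels = np.allclose(project(pts, 500.0), project(-pts, 500.0))
recovered = pick_hypothesis(-pts)  # starts from the mirrored solution
```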
Where Pith is reading between the lines
- The same Gaussian uncertainty model could be applied incrementally to support online updates in SLAM systems.
- Replacing rigid track-building with probabilistic edge weighting may simplify integration of dense matchers into existing SfM pipelines.
- The dual-hypothesis mechanism suggests a general pattern for disambiguating other geometric symmetries in multi-view reconstruction.
Load-bearing premise
An iterative adaptive edge-weighting mechanism on a sparse view graph can reliably prune erroneous topological links while preserving global consistency, without introducing new biases or disconnecting the graph.
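The page does not give the pruning rule, so the following is only a sketch of one possible connectivity invariant: protect a maximum-weight spanning tree and drop low-weight edges outside it. The function name `prune_edges`, the threshold, and the toy weights are assumptions made for illustration.

```python
def prune_edges(num_nodes, edges, threshold):
    """Drop edges below `threshold` unless they lie on a maximum spanning tree.

    `edges` maps (i, j) -> weight. If the input graph is connected, the
    protected tree keeps the output connected. Hypothetical rule, not ProBA's.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    protected = set()
    # Kruskal's algorithm on descending weights yields a maximum spanning tree.
    for (i, j), _ in sorted(edges.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            protected.add((i, j))
    return {e: w for e, w in edges.items() if w >= threshold or e in protected}

# Toy view graph: (0, 2) is weak and redundant, (2, 3) is weak but a bridge.
view_graph = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.1, (2, 3): 0.05}
kept = prune_edges(4, view_graph, threshold=0.5)
```

In the toy graph the weak redundant edge is removed while the weak bridge survives, which is the behavior the premise requires. Whether the protected edges are themselves erroneous is a separate question this sketch does not address.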
What would settle it
Run ProBA from random camera poses and unknown focal lengths on a benchmark SfM dataset with ground-truth poses, then check whether its final reconstruction error exceeds that of classical bundle adjustment started from approximate poses.
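A comparison of this kind needs a pose error that does not depend on the arbitrary global frame of each reconstruction. The sketch below uses the mean angular error over pairwise relative rotations, assuming world-to-camera rotation matrices; the metric choice and the helper names are assumptions, not the paper's evaluation protocol.

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between two rotation matrices."""
    R_rel = np.asarray(R_est).T @ np.asarray(R_gt)
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))

def mean_relative_rotation_error_deg(Rs_est, Rs_gt):
    """Mean error over all camera pairs; invariant to a global rotation of either set."""
    errors = []
    n = len(Rs_est)
    for i in range(n):
        for j in range(i + 1, n):
            rel_est = Rs_est[j] @ Rs_est[i].T  # relative rotation i -> j
            rel_gt = Rs_gt[j] @ Rs_gt[i].T
            errors.append(rotation_error_deg(rel_est, rel_gt))
    return float(np.mean(errors))

def rot_z(deg):
    """Rotation about the z-axis, used only to build toy data."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

gt = [rot_z(0), rot_z(20), rot_z(50)]
est = [rot_z(0), rot_z(25), rot_z(50)]  # second camera is off by 5 degrees
err = mean_relative_rotation_error_deg(est, gt)
```

Running the same metric on ProBA started cold and on classical bundle adjustment started warm would give the two numbers the test calls for.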
Original abstract
Classical Bundle Adjustment (BA) is fundamentally limited by its reliance on precise metric initialization and prior camera intrinsics. While modern dense matchers offer high-fidelity correspondences, traditional Structure-from-Motion (SfM) pipelines struggle to leverage them, as rigid track-building heuristics fail in the presence of their inherent noise. We present ProBA (Probabilistic Bundle Adjustment), a probabilistic re-parameterization of the BA manifold that enables joint optimization of extrinsics, focal lengths, and geometry from a strict cold start. By replacing fragile point tracks with a flexible kinematic pose graph and representing landmarks as 3D Gaussians, our framework explicitly models spatial uncertainty through a unified Negative Log-Likelihood (NLL) objective. This volumetric formulation smooths the non-convex optimization landscape and naturally weights correspondences by their statistical confidence. To maintain global consistency, we optimize over a sparse view graph using an iterative, adaptive edge-weighting mechanism to prune erroneous topological links. Furthermore, we resolve mirror ambiguities inherent to prior-free SfM via a dual-hypothesis regularization strategy. Extensive evaluations show that our approach significantly expands the basin of attraction and achieves superior accuracy over both classical and learning-based baselines, providing a scalable foundation that greatly benefits SfM and SLAM robustness in unstructured environments.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents ProBA, a probabilistic reformulation of bundle adjustment. It replaces point tracks with a kinematic pose graph, models landmarks as 3D Gaussians, and optimizes a unified negative log-likelihood objective that incorporates the Bhattacharyya coefficient to explicitly account for spatial uncertainty. The method claims to perform joint optimization of extrinsics, focal lengths, and geometry from a strict cold start, using an iterative adaptive edge-weighting scheme on a sparse view graph to prune erroneous links and a dual-hypothesis regularization to resolve mirror ambiguities. Experiments are reported to show an expanded basin of attraction and higher accuracy than both classical and learning-based baselines on SfM and SLAM tasks in unstructured environments.
Significance. If the central claims are supported by the derivations and experiments, the work would be significant for SfM and SLAM pipelines that must operate without reliable initialization or clean tracks. The volumetric Gaussian representation and Bhattacharyya-based weighting provide a principled way to smooth the non-convex landscape and down-weight noisy correspondences, which could improve robustness when dense matchers are used. The absence of any mention of machine-checked proofs, fully reproducible code releases, or parameter-free derivations limits the immediate strength of the contribution.
major comments (2)
- [Abstract and §3 (Method)] The iterative adaptive edge-weighting mechanism on the sparse view graph is presented as the safeguard that prunes erroneous topological links while preserving global consistency. No explicit adaptation rule, threshold schedule, or connectivity invariant is supplied. This is load-bearing for the cold-start robustness claim; without such safeguards the weighting can sever valid edges or bias the graph in high-noise regions typical of dense matchers, directly risking the reported gains in basin of attraction and accuracy.
- [§4 (Experiments)] The superiority claims rest on comparisons with classical and learning-based baselines, yet the manuscript supplies no ablation isolating the contribution of the adaptive weighting versus the Gaussian landmark model or the dual-hypothesis term. Without these controls it is impossible to attribute the accuracy improvements to the probabilistic formulation rather than post-hoc tuning.
minor comments (2)
- [Abstract] The unified NLL objective is described at a high level but no equations are shown, making it difficult for readers to verify how the Bhattacharyya coefficient is combined with the Gaussian landmark covariances.
- [§2 (Related Work) or §3 (Method)] Notation: the kinematic pose graph is introduced without a clear definition of its state variables or how it differs from a standard pose graph; a short table or diagram would improve clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and detailed comments on our manuscript. We address each major comment point by point below, acknowledging where clarifications and additions are needed to strengthen the presentation of the adaptive weighting mechanism and experimental analysis.
- Referee: [Abstract and §3 (Method)] The iterative adaptive edge-weighting mechanism on the sparse view graph is presented as the safeguard that prunes erroneous topological links while preserving global consistency. No explicit adaptation rule, threshold schedule, or connectivity invariant is supplied. This is load-bearing for the cold-start robustness claim; without such safeguards the weighting can sever valid edges or bias the graph in high-noise regions typical of dense matchers, directly risking the reported gains in basin of attraction and accuracy.
  Authors: We agree that the current description of the iterative adaptive edge-weighting mechanism is insufficiently detailed for a load-bearing component of the cold-start robustness claim. The manuscript introduces the mechanism conceptually but does not supply the explicit adaptation rule, threshold schedule, or connectivity invariant. In the revised version we will expand §3 with the precise mathematical update rule for edge weights (based on the Bhattacharyya coefficient between landmark Gaussians), the schedule for threshold adaptation across iterations, and a short proof sketch showing that the pruning step preserves a connected view graph under the stated assumptions. These additions will directly address the risk of severing valid edges in high-noise regimes.
  Revision: yes
- Referee: [§4 (Experiments)] The superiority claims rest on comparisons with classical and learning-based baselines, yet the manuscript supplies no ablation isolating the contribution of the adaptive weighting versus the Gaussian landmark model or the dual-hypothesis term. Without these controls it is impossible to attribute the accuracy improvements to the probabilistic formulation rather than post-hoc tuning.
  Authors: The referee is correct that the absence of component-wise ablations makes it difficult to attribute gains specifically to the probabilistic formulation. The reported experiments compare the full ProBA pipeline against baselines but do not isolate the adaptive weighting, the 3D Gaussian landmark representation, or the dual-hypothesis regularization. We will add a dedicated ablation study in the revised §4 that reports performance when each of these three elements is disabled in turn, using the same evaluation protocol on the SfM and SLAM benchmarks. This will allow readers to quantify the individual contributions and reduce the possibility that results stem from post-hoc tuning.
  Revision: yes
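The leave-one-out design described in this response can be enumerated mechanically. The sketch below builds the four run configurations (the full pipeline plus one run per disabled component); the flag names are hypothetical labels for the three components named above, not options of any released ProBA code.

```python
# Hypothetical flag names for the three components named in the rebuttal.
COMPONENTS = ("adaptive_weighting", "gaussian_landmarks", "dual_hypothesis")

def leave_one_out_configs(components=COMPONENTS):
    """Return the full configuration plus one configuration per disabled component."""
    full = {name: True for name in components}
    configs = {"full": full}
    for name in components:
        cfg = dict(full)   # copy so the full configuration is not mutated
        cfg[name] = False  # disable exactly one component
        configs[f"no_{name}"] = cfg
    return configs

configs = leave_one_out_configs()
for label, cfg in configs.items():
    enabled = [name for name, on in cfg.items() if on]
    print(f"{label}: {', '.join(enabled)}")
```

Each configuration would then be run under the same evaluation protocol, so that any accuracy drop can be attributed to the single disabled component.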
Circularity Check
No significant circularity; derivation relies on external match confidences and standard probabilistic modeling
The paper defines its core objective as a unified Negative Log-Likelihood over 3D Gaussian landmarks and a kinematic pose graph, with edge weights derived from Bhattacharyya coefficients on external dense matcher outputs. No equation reduces a fitted parameter to a renamed prediction, no self-citation supplies a load-bearing uniqueness theorem, and the adaptive weighting is presented as an iterative mechanism operating on an independently supplied sparse view graph rather than being defined tautologically by the target accuracy metric. The cold-start claim is supported by the volumetric smoothing property of the NLL, which is independent of the final reported numbers.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The non-convex optimization landscape is smoothed sufficiently by the volumetric Gaussian formulation to allow reliable convergence from cold start.
- domain assumption: Erroneous topological links can be identified and pruned by iterative adaptive edge weighting without disconnecting the global graph.
invented entities (2)
- 3D Gaussian landmarks: no independent evidence
- kinematic pose graph: no independent evidence
Reference graph
Works this paper leans on
- [1] Agarwal, S., Snavely, N., Seitz, S.M., Szeliski, R.: Bundle adjustment in the large. In: European Conference on Computer Vision (ECCV). pp. 29–42. Springer (2010)
- [2] Agarwal, S., Snavely, N., Simon, I., Seitz, S.M., Szeliski, R.: Building Rome in a day. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). pp. 72–79 (2009)
- [3] Arrigoni, F., Rossi, B., Fusiello, A.: Spectral synchronization of multiple views in SE(3). SIAM Journal on Imaging Sciences 9(4), 1963–1990 (2016)
- [4] Barron, J.T., Mildenhall, B., Verbin, D., Srinivasan, P.P., Hedman, P.: Mip-NeRF 360: Unbounded anti-aliased neural radiance fields. In: CVPR (2022)
- [5] Besl, P., McKay, N.D.: A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence 14(2), 239–256 (1992). https://doi.org/10.1109/34.121791
- [6]
- [7] Davison, A.J., Reid, I.D., Molton, N.D., Stasse, O.: MonoSLAM: Real-time single camera SLAM. IEEE TPAMI 29(6), 1052–1067 (2007). https://doi.org/10.1109/TPAMI.2007.1049
- [8] Duisterhof, B., Žust, L., Weinzaepfel, P., Leroy, V., Cabon, Y., Revaud, J.: MASt3R-SfM: A fully-integrated solution for unconstrained structure-from-motion. In: International Conference on 3D Vision (3DV) (2025)
- [9] Edstedt, J., Athanasiadis, I., Wadenbäck, M., Felsberg, M.: DKM: Dense kernelized feature matching for geometry estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 17765–17775 (2023)
- [10]
- [11] Fu, Y., Liu, S., Kulkarni, A., Kautz, J., Efros, A.A., Wang, X.: COLMAP-Free 3D Gaussian splatting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024)
- [12] Han, H., Yang, H.: Building Rome with convex optimization. arXiv preprint arXiv:2502.04640 (2025)
- [13] Harris, M., Sengupta, S., Owens, J.D.: Parallel prefix sum (scan) with CUDA. In: GPU Gems 3, vol. 3, pp. 851–876. Addison-Wesley Professional (2007)
- [14] Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press (2003)
- [15] He, X., Sun, J., Wang, Y., Peng, S., Huang, Q., Bao, H., Zhou, X.: Detector-Free Structure from Motion. In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 21594–21603 (2024)
- [16] Heinly, J., Schonberger, J.L., Chapman, E., Frahm, J.M.: Reconstructing the world* in six days. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 3287–3295 (2015)
- [17] Hong, J.H., Zach, C., Fitzgibbon, A., Cipolla, R.: Projective bundle adjustment from arbitrary initialization using the variable projection method. In: ECCV. vol. 9905, pp. 477–493 (2016). https://doi.org/10.1007/978-3-319-46448-0_29
- [18] Iglesias, J.P., Nilsson, A., Olsson, C.: expOSE: Accurate initialization-free projective factorization using exponential regularization. In: CVPR. pp. 8959–8968 (2023). https://doi.org/10.1109/CVPR52729.2023.00865
J. Chui and H. Andrade-Loarca and D. Cremers
- [19] Jensen, R., Dahl, A., Vogiatzis, G., Tola, E., Aanaes, H.: Large scale multi-view stereopsis evaluation. In: CVPR (2014)
- [20] Jin, Y., Mishkin, D., Mishchuk, A., Matas, J., Fua, P., Yi, K.M., Trulls, E.: Image matching across wide baselines: From paper to practice. International Journal of Computer Vision 129(10), 2671–2704 (2021)
- [21] Kaess, M., Johannsson, H., Roberts, R., Ila, V., Leonard, J., Dellaert, F.: iSAM2: Incremental smoothing and mapping with fluid relinearization and incremental variable reordering. In: IEEE Int. Conf. Robot. Autom. pp. 3281–3288 (2011). https://doi.org/10.1109/ICRA.2011.5979641
- [22] Kerbl, B., Kopanas, G., Leimkühler, T., Drettakis, G.: 3D Gaussian splatting for real-time radiance field rendering. ACM TOG (2023)
- [23] Klein, G., Murray, D.: Parallel tracking and mapping for small AR workspaces. In: IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR). pp. 225–234 (2007)
- [24] Leroy, V., Cabon, Y., Revaud, J.: Grounding image matching in 3D with MASt3R. In: European Conference on Computer Vision (ECCV) (2024)
- [25] Lourakis, M.I., Argyros, A.A.: SBA: A software package for generic sparse bundle adjustment. ACM Transactions on Mathematical Software (TOMS) 36(1), 1–30 (2009)
- [26] Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM 65(1), 99–106 (2021)
- [27] Mur-Artal, R., Montiel, J.M.M., Tardos, J.D.: ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Transactions on Robotics 31(5), 1147–1163 (2015)
- [28] Nistér, D.: An efficient solution to the five-point relative pose problem. IEEE TPAMI 26(6), 756–770 (2004)
- [29] Özyeşil, O., Voroninski, V., Basri, R., Singer, A.: A survey of structure from motion. Acta Numerica 26, 305–364 (2017)
- [30] Schönberger, J.L., Frahm, J.M.: Structure-from-motion revisited. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 4104–4113 (2016)
- [31] Schöps, T., Schönberger, J.L., Galliani, S., Sattler, T., Schindler, K., Pollefeys, M., Geiger, A.: A multi-view stereo benchmark with high-resolution images and multi-camera videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 3239–3248 (2017)
- [32] Snavely, N., Seitz, S.M., Szeliski, R.: Photo tourism: Exploring photo collections in 3D. In: ACM SIGGRAPH 2006 Papers. pp. 835–846 (2006)
- [33] Strasdat, H., Montiel, J.M.M., Davison, A.J.: Scale drift-aware large scale monocular SLAM. In: Robotics: Science and Systems (RSS) (2010)
- [34] Sun, J., Shen, Z., Wang, Y., Bao, H., Zhou, X.: LoFTR: Detector-free local feature matching with transformers. In: CVPR (2021)
- [35] Sünderhauf, N., Protzel, P.: Switchable constraints for robust pose graph SLAM. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp. 1879–1884. IEEE (2012)
- [36] Triggs, B., McLauchlan, P.F., Hartley, R.I., Fitzgibbon, A.W.: Bundle adjustment: a modern synthesis. In: International Workshop on Vision Algorithms. pp. 298–
- [37] Truong, P., Rakotosaona, M.J., Manhardt, F., Tombari, F.: SPARF: Neural radiance fields from sparse and noisy poses. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 4190–4200 (2023)
- [38] Wang, J., Chen, M., Karaev, N., Vedaldi, A., Rupprecht, C., Novotny, D.: VGGT: Visual geometry grounded transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2025)
- [39] Wang, J., Karaev, N., Rupprecht, C., Novotny, D.: VGGSfM: Visual geometry group structure-from-motion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024)
- [40] Wang, S., Leroy, V., Cabon, Y., Chidlovskii, B., Revaud, J.: DUSt3R: Geometric 3D vision made easy. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 20697–20709 (2024)
- [41] Weber, S., Hong, J.H., Cremers, D.: Power variable projection for initialization-free large-scale bundle adjustment. In: ECCV (2024)
- [42] Wilson, K., Wehrwein, S.: Visualizing spectral bundle adjustment uncertainty. In: Int. Conf. 3D Vis. pp. 663–671 (2020). https://doi.org/10.1109/3DV50981.2020.00076
- [43] Yang, H., Shi, J., Carlone, L.: TEASER: Fast and certifiable point cloud registration. IEEE Transactions on Robotics 37(2), 314–333 (2020). https://doi.org/10.1109/TRO.2020.3033695
- [44] Zach, C., Hong, J.H.: pOSE: Pseudo object space error for initialization-free bundle adjustment. In: CVPR. pp. 1876–1885 (2018). https://doi.org/10.1109/CVPR.2018.00201
A Appendix
A.1 Minimum Spanning Tree and View Graph Construction
Algorithm 1 Minimum Spanning Tree Construction
Require: Set of uncalibrated i...