ProBA: Probabilistic Bundle Adjustment with the Bhattacharyya Coefficient

Daniel Cremers; Hector Andrade-Loarca; Jason Chui

arxiv: 2505.20858 · v2 · submitted 2025-05-27 · 💻 cs.CV

Pith reviewed 2026-05-19 12:54 UTC · model grok-4.3

classification 💻 cs.CV

keywords bundle adjustment · structure from motion · probabilistic optimization · 3D Gaussian landmarks · negative log-likelihood · Bhattacharyya coefficient · view graph · camera pose estimation

The pith

Probabilistic Bundle Adjustment enables joint optimization of camera poses and 3D geometry from random initialization using uncertain Gaussian landmarks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a probabilistic re-parameterization of bundle adjustment that removes the need for precise initial camera poses or known intrinsics. Landmarks are modeled as 3D Gaussians whose uncertainty is folded into a single Negative Log-Likelihood objective. Correspondences are automatically down-weighted according to how much their distributions overlap, and a sparse view graph is maintained by iteratively re-weighting edges to drop bad connections. Mirror ambiguities are handled by keeping two opposing hypotheses during optimization. If the approach works, structure-from-motion and SLAM systems could start from noisy dense matches and still produce accurate results in environments where classical pipelines fail.

Core claim

ProBA re-parameterizes the bundle adjustment manifold so that extrinsics, focal lengths, and scene geometry are optimized together from a strict cold start. Landmarks are represented as 3D Gaussians and the objective is a unified Negative Log-Likelihood that incorporates the Bhattacharyya coefficient to measure spatial consistency. A sparse view graph is optimized with an iterative adaptive edge-weighting scheme that prunes erroneous topological links, and a dual-hypothesis regularization resolves mirror ambiguities.

What carries the argument

The representation of landmarks as 3D Gaussians combined with a unified Negative Log-Likelihood objective that uses the Bhattacharyya coefficient to weight correspondences by statistical confidence.
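For readers who want the quantity in concrete form: the Bhattacharyya coefficient between two Gaussians has a standard closed form, sketched below in Python with NumPy. This is the textbook expression only. How ProBA folds it into the NLL is not shown on this page, and the function name `bhattacharyya_coefficient` is ours, not the paper's.

```python
import numpy as np

def bhattacharyya_coefficient(mu1, cov1, mu2, cov2):
    """Closed-form Bhattacharyya coefficient between two Gaussians (value in (0, 1])."""
    mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
    cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
    cov = 0.5 * (cov1 + cov2)                 # averaged covariance
    diff = mu1 - mu2
    # Mahalanobis-style term under the averaged covariance.
    maha = diff @ np.linalg.solve(cov, diff)
    # Log-determinants are numerically safer than raw determinants.
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    distance = 0.125 * maha + 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return float(np.exp(-distance))

# Hypothetical landmarks: same 2-unit offset, tight vs. broad uncertainty.
tight = bhattacharyya_coefficient([0, 0, 0], np.eye(3), [2, 0, 0], np.eye(3))
broad = bhattacharyya_coefficient([0, 0, 0], 4 * np.eye(3), [2, 0, 0], 4 * np.eye(3))
print(tight, broad)
```

Identical Gaussians overlap fully (coefficient 1), and for a fixed offset the broader pair overlaps more than the tight pair, which is the sense in which uncertain landmarks tolerate larger disagreements.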

If this is right

Joint optimization of extrinsics, focal lengths, and geometry becomes feasible without metric initialization.
The basin of attraction for convergence is expanded, allowing recovery from poorer starting points.
Erroneous links are pruned from the view graph while global consistency is maintained.
Mirror ambiguities are resolved, enabling prior-free SfM to succeed on symmetric scenes.
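Why a mirror ambiguity exists at all can be shown with a toy pinhole camera. The sketch below is an illustration under our own assumptions (identity rotation, a hypothetical three-point scene, a helper named `project`); it is not the paper's dual-hypothesis regularization, whose details this page does not give. Reflecting the points and the camera centre through the origin leaves every projected pixel unchanged and only flips the sign of the depths, so a reprojection-style loss alone cannot tell the two worlds apart.

```python
import numpy as np

def project(points, R, C, focal):
    """Pinhole projection of world points; returns image coordinates and depths."""
    cam = (points - C) @ R.T              # world frame -> camera frame
    depth = cam[:, 2]
    uv = focal * cam[:, :2] / depth[:, None]
    return uv, depth

# Hypothetical toy scene: three points in front of one camera.
points = np.array([[0.5, -0.2, 4.0], [-0.8, 0.3, 5.0], [0.1, 0.9, 6.0]])
R = np.eye(3)
C = np.array([0.2, -0.1, 0.0])

uv, depth = project(points, R, C, focal=800.0)
# Point-reflect the whole world (points and camera centre) through the origin.
uv_mirror, depth_mirror = project(-points, R, -C, focal=800.0)

print(np.allclose(uv, uv_mirror))         # projections agree
print(depth, depth_mirror)                # depths have opposite sign
```

A positive-depth check is the classical way to separate the two worlds; keeping both hypotheses alive during optimization, as the paper describes, defers that decision.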

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same Gaussian uncertainty model could be applied incrementally to support online updates in SLAM systems.
Replacing rigid track-building with probabilistic edge weighting may simplify integration of dense matchers into existing SfM pipelines.
The dual-hypothesis mechanism suggests a general pattern for disambiguating other geometric symmetries in multi-view reconstruction.

Load-bearing premise

The assumption that an iterative adaptive edge-weighting mechanism on a sparse view graph can reliably prune erroneous topological links while preserving global consistency, without introducing new biases or disconnecting the graph.
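To make the premise concrete, here is one way a pruning step could be guarded against disconnecting the graph. It is a minimal sketch under our own assumptions: the function `prune_edges`, the fixed `threshold`, and the union-find guard are hypothetical and are not taken from the paper. Edges are visited from highest to lowest weight; an edge below the threshold survives only if it is still needed to join two components.

```python
def prune_edges(num_nodes, edges, weights, threshold):
    """Return indices of kept edges: all edges with weight >= threshold,
    plus any weaker edge that is needed to keep the graph connected."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
        return True

    # Strongest edges first, so weak edges are only used as a last resort.
    order = sorted(range(len(edges)), key=lambda i: weights[i], reverse=True)
    kept = []
    for i in order:
        u, v = edges[i]
        joins_components = union(u, v)
        if weights[i] >= threshold or joins_components:
            kept.append(i)
    return sorted(kept)

# Hypothetical view graph with five edges over four cameras.
edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]
weights = [0.9, 0.8, 0.1, 0.7, 0.05]
print(prune_edges(4, edges, weights, threshold=0.5))
```

In this example camera 3 is reachable only through two weak edges; the guard keeps the stronger of the two and drops the other. If the input graph is connected, the output is connected as well, which is the invariant the referee report asks the authors to state.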

What would settle it

Running ProBA from random camera poses and unknown focal lengths on a benchmark SfM dataset with ground-truth poses and checking whether the final reconstruction error is higher than that obtained by classical bundle adjustment started from approximate poses.
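One way to score such a comparison is the relative rotation accuracy (RRA) that the paper's figures report. The sketch below assumes the common definition, the fraction of camera pairs whose relative rotation error falls under a threshold in degrees; the page does not give the paper's exact evaluation code, and the helper names are ours.

```python
import itertools
import numpy as np

def rot_z(deg):
    """Rotation about the z-axis by `deg` degrees (helper for the example)."""
    a = np.radians(deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])

def rotation_angle_deg(R):
    """Angle of a rotation matrix in degrees."""
    cos = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)))

def relative_rotation_accuracy(R_est, R_gt, threshold_deg):
    """Fraction of camera pairs with relative rotation error below the threshold."""
    errors = []
    for i, j in itertools.combinations(range(len(R_est)), 2):
        rel_est = R_est[j] @ R_est[i].T
        rel_gt = R_gt[j] @ R_gt[i].T
        errors.append(rotation_angle_deg(rel_est @ rel_gt.T))
    return float(np.mean(np.array(errors) < threshold_deg))

# Hypothetical example: the third camera is off by 20 degrees.
R_gt = [rot_z(0), rot_z(30), rot_z(60)]
R_est = [R_gt[0], R_gt[1], rot_z(20) @ R_gt[2]]
print(relative_rotation_accuracy(R_est, R_gt, 10.0))
print(relative_rotation_accuracy(R_est, R_gt, 30.0))
```

Because only relative rotations enter, the score does not depend on the global frame in which either reconstruction is expressed, which matters when one of the two methods starts from random poses.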

Figures

Figures reproduced from arXiv: 2505.20858 by Daniel Cremers, Hector Andrade-Loarca, Jason Chui.

**Figure 2.** System Overview. Our framework processes unordered images in three stages: (1) View Graph Construction via LoFTR [34] to build a topological prior (solid: minimum spanning tree edges, dashed: auxiliary edges); (2) Finding Correspondences via DKM [9] to extract dense pairwise constraints; and (3) ProBA Optimization, which iteratively refines the edge weights and the global scene from a strict cold start. Th…

**Figure 3.** Visualizing Mirror Ambiguity Resolution.

**Figure 4.** Hyperparameter sensitivity and runtime. (Left) Impact of L3D weight (W3D) on relative translation accuracy (RTA) across varying graph densities (Nneig). (Right) Runtime comparison on ETH3D; ProBA maintains a predictable footprint, bypassing the severe scaling bottlenecks of dense track-building.

**Figure 5.** Qualitative Reconstructions. Estimated (red) and ground-truth (grey) camera poses and point clouds across the tested datasets (DTU, Mip-NeRF 360, ETH3D, and IMC2021). Left to right: Input, COLMAP, COLMAP+DKM, COLMAP+RoMa, MASt3R-SfM, VGGT, and ProBA. While classical methods yield sparse or shattered geometries when forced to use dense matches, ProBA consistently recovers coherent topology and accurate tra…

**Figure 6.** Joint evolution of camera poses and 3D Gaussians.

**Figure 7.** Evolution of the two hypothesis worlds and their corresponding loss

**Figure 8.** Robustness to Controlled Noise. Relative Translation Accuracy (RTA@10°, left) and Relative Rotation Accuracy (RRA@10°, right) evaluated across varying noise magnitudes (σ, x-axis) and ratios (r, colored lines) on the ETH3D kicker scene. The method maintains high accuracy under both widespread mild noise and sparse gross outliers, degrading only when both parameters are extreme. To evaluate robustness to …

**Figure 9.** Evolution of Isotropic vs. Anisotropic 3D Gaussians during joint

**Figure 10.** Convergence Analysis on ETH3D. Mean Relative Rotation Accuracy (RRA@5°) and Relative Translation Accuracy (RTA@5°) across 30,000 iterations. Grey lines represent individual scenes in the ETH3D dataset, while the blue line denotes their mean. The optimization demonstrates rapid initial progress up to ∼15,000 iterations, followed by a slower "long tail" convergence characteristic of first-order optimiza…

**Figure 11.** Visualization of Focal-Depth Ambiguity. Left: Input images. Right: Estimated poses (red), ground truth poses (grey), and the reconstructed scene. … angles, yielding RRA comparable to baselines. However, lacking 3D structure, translation remains ambiguous. The optimizer freely pushes cameras away along the optical axis while increasing focal length, minimizing reprojection loss, but distorting estimated dis…
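The focal-depth ambiguity in the Figure 11 caption can be reproduced with a few lines of arithmetic. The sketch below uses a hypothetical camera on the z-axis and two hand-picked point sets (names and numbers are ours): for a flat, fronto-parallel scene, doubling both the camera distance and the focal length gives identical pixels, whereas a scene with depth variation does not allow that trade.

```python
import numpy as np

def project_along_axis(points, focal, camera_z):
    """Project points with a camera at (0, 0, camera_z) looking down +z."""
    depth = points[:, 2] - camera_z
    return focal * points[:, :2] / depth[:, None]

xy = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0],
               [1.0, 1.0], [0.5, 0.0], [0.0, 0.5]])
flat = np.column_stack([xy, np.full(6, 4.0)])                     # all points at z = 4
deep = np.column_stack([xy, [2.0, 3.0, 4.0, 5.0, 6.0, 4.0]])      # varied depth

near = dict(focal=500.0, camera_z=0.0)
far = dict(focal=1000.0, camera_z=-4.0)    # twice as far from z = 4, twice the focal length

flat_gap = np.abs(project_along_axis(flat, **near) - project_along_axis(flat, **far)).max()
deep_gap = np.abs(project_along_axis(deep, **near) - project_along_axis(deep, **far)).max()
print(flat_gap, deep_gap)
```

The flat scene yields a gap of zero up to rounding, so reprojection error cannot fix the translation along the optical axis; the scene with real depth variation yields a gap of tens of pixels, which is what pins the camera down.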

Original abstract

Classical Bundle Adjustment (BA) is fundamentally limited by its reliance on precise metric initialization and prior camera intrinsics. While modern dense matchers offer high-fidelity correspondences, traditional Structure-from-Motion (SfM) pipelines struggle to leverage them, as rigid track-building heuristics fail in the presence of their inherent noise. We present ProBA (Probabilistic Bundle Adjustment), a probabilistic re-parameterization of the BA manifold that enables joint optimization of extrinsics, focal lengths, and geometry from a strict cold start. By replacing fragile point tracks with a flexible kinematic pose graph and representing landmarks as 3D Gaussians, our framework explicitly models spatial uncertainty through a unified Negative Log-Likelihood (NLL) objective. This volumetric formulation smooths the non-convex optimization landscape and naturally weights correspondences by their statistical confidence. To maintain global consistency, we optimize over a sparse view graph using an iterative, adaptive edge-weighting mechanism to prune erroneous topological links. Furthermore, we resolve mirror ambiguities inherent to prior-free SfM via a dual-hypothesis regularization strategy. Extensive evaluations show that our approach significantly expands the basin of attraction and achieves superior accuracy over both classical and learning-based baselines, providing a scalable foundation that greatly benefits SfM and SLAM robustness in unstructured environments.
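The phrase "naturally weights correspondences by their statistical confidence" can be read off a generic Gaussian negative log-likelihood. The sketch below is that generic form only, written under our own naming (`gaussian_nll`, `nll_gradient`); it is not the paper's unified objective, which this page does not reproduce. The gradient with respect to the residual is the residual scaled by the inverse covariance, so an uncertain landmark pulls on the optimizer far less than a confident one.

```python
import numpy as np

def gaussian_nll(residual, cov):
    """Negative log-likelihood of a residual under N(0, cov), up to an additive constant."""
    residual = np.asarray(residual, float)
    cov = np.asarray(cov, float)
    _, logdet = np.linalg.slogdet(cov)
    quadratic = residual @ np.linalg.solve(cov, residual)
    return 0.5 * float(quadratic) + 0.5 * float(logdet)

def nll_gradient(residual, cov):
    """Gradient of the NLL with respect to the residual: cov^{-1} @ residual."""
    return np.linalg.solve(np.asarray(cov, float), np.asarray(residual, float))

# Hypothetical residual seen by a confident and by an uncertain landmark.
residual = [0.3, 0.0, 0.0]
g_confident = nll_gradient(residual, 0.01 * np.eye(3))
g_uncertain = nll_gradient(residual, 1.0 * np.eye(3))
print(np.linalg.norm(g_confident), np.linalg.norm(g_uncertain))
```

The log-determinant term is what stops the optimizer from inflating every covariance to make the quadratic term vanish.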

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ProBA tries a probabilistic rewrite of bundle adjustment with 3D Gaussians and Bhattacharyya NLL to handle cold starts, but the adaptive view-graph pruning is the part that still needs proof it won't break connectivity.

ProBA re-parameterizes bundle adjustment to work from a cold start by treating landmarks as 3D Gaussians and using the Bhattacharyya coefficient in a unified negative log-likelihood. It swaps out point tracks for a kinematic pose graph and adds adaptive edge weighting to clean up bad connections in the view graph, plus a dual-hypothesis fix for mirror flips.

The new part is pulling together Gaussian landmarks with that specific coefficient for the loss, which lets the optimization naturally down-weight uncertain matches. This seems like a direct response to the problems dense matchers create for traditional SfM pipelines. The idea of smoothing the landscape through volumetric uncertainty modeling is sensible and could help in unstructured scenes where classical methods fail.

On the positive side, the framework tries to optimize extrinsics, focals, and geometry all at once without priors, which is ambitious. If the math holds, it could make SfM more reliable for robotics applications.

The main concern is the adaptive edge-weighting. The description says it prunes erroneous topological links iteratively to keep global consistency, but there's no detail on how the weights adapt or what prevents the graph from falling apart in high-noise regions. If good edges get down-weighted too much, you end up with locally good but globally wrong solutions, which would explain why the robustness claims need more backing. The abstract also skips the actual equations and any error analysis, so it's tough to judge if the accuracy improvements are real or just from careful tuning.

This paper is aimed at computer vision folks doing SfM or SLAM who want probabilistic tools for noisy data. A reader working on robust reconstruction methods would get something out of the formulation, even if they end up adapting parts of it. It has enough substance to warrant a serious referee who can dig into the derivations and experiments. I would recommend sending it to peer review.

Referee Report

2 major / 2 minor

Summary. The manuscript presents ProBA, a probabilistic reformulation of bundle adjustment. It replaces point tracks with a kinematic pose graph, models landmarks as 3D Gaussians, and optimizes a unified negative log-likelihood objective that incorporates the Bhattacharyya coefficient to explicitly account for spatial uncertainty. The method claims to perform joint optimization of extrinsics, focal lengths, and geometry from a strict cold start, using an iterative adaptive edge-weighting scheme on a sparse view graph to prune erroneous links and a dual-hypothesis regularization to resolve mirror ambiguities. Experiments are reported to show an expanded basin of attraction and higher accuracy than both classical and learning-based baselines on SfM and SLAM tasks in unstructured environments.

Significance. If the central claims are supported by the derivations and experiments, the work would be significant for SfM and SLAM pipelines that must operate without reliable initialization or clean tracks. The volumetric Gaussian representation and Bhattacharyya-based weighting provide a principled way to smooth the non-convex landscape and down-weight noisy correspondences, which could improve robustness when dense matchers are used. The absence of any mention of machine-checked proofs, fully reproducible code releases, or parameter-free derivations limits the immediate strength of the contribution.

major comments (2)

[Abstract and §3 (Method)] The iterative adaptive edge-weighting mechanism on the sparse view graph is presented as the safeguard that prunes erroneous topological links while preserving global consistency. No explicit adaptation rule, threshold schedule, or connectivity invariant is supplied. This is load-bearing for the cold-start robustness claim; without such safeguards the weighting can sever valid edges or bias the graph in high-noise regions typical of dense matchers, directly risking the reported gains in basin of attraction and accuracy.
[§4 (Experiments)] The superiority claims rest on comparisons with classical and learning-based baselines, yet the manuscript supplies no ablation isolating the contribution of the adaptive weighting versus the Gaussian landmark model or the dual-hypothesis term. Without these controls it is impossible to attribute the accuracy improvements to the probabilistic formulation rather than post-hoc tuning.

minor comments (2)

[Abstract] The unified NLL objective is described at a high level but no equations are shown, making it difficult for readers to verify how the Bhattacharyya coefficient is combined with the Gaussian landmark covariances.
[§2 (Related Work) or §3 (Method)] Notation: the kinematic pose graph is introduced without a clear definition of its state variables or how it differs from a standard pose graph; a short table or diagram would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and detailed comments on our manuscript. We address each major comment point by point below, acknowledging where clarifications and additions are needed to strengthen the presentation of the adaptive weighting mechanism and experimental analysis.

Referee: [Abstract and §3 (Method)] The iterative adaptive edge-weighting mechanism on the sparse view graph is presented as the safeguard that prunes erroneous topological links while preserving global consistency. No explicit adaptation rule, threshold schedule, or connectivity invariant is supplied. This is load-bearing for the cold-start robustness claim; without such safeguards the weighting can sever valid edges or bias the graph in high-noise regions typical of dense matchers, directly risking the reported gains in basin of attraction and accuracy.

Authors: We agree that the current description of the iterative adaptive edge-weighting mechanism is insufficiently detailed for a load-bearing component of the cold-start robustness claim. The manuscript introduces the mechanism conceptually but does not supply the explicit adaptation rule, threshold schedule, or connectivity invariant. In the revised version we will expand §3 with the precise mathematical update rule for edge weights (based on the Bhattacharyya coefficient between landmark Gaussians), the schedule for threshold adaptation across iterations, and a short proof sketch showing that the pruning step preserves a connected view graph under the stated assumptions. These additions will directly address the risk of severing valid edges in high-noise regimes. revision: yes
Referee: [§4 (Experiments)] The superiority claims rest on comparisons with classical and learning-based baselines, yet the manuscript supplies no ablation isolating the contribution of the adaptive weighting versus the Gaussian landmark model or the dual-hypothesis term. Without these controls it is impossible to attribute the accuracy improvements to the probabilistic formulation rather than post-hoc tuning.

Authors: The referee is correct that the absence of component-wise ablations makes it difficult to attribute gains specifically to the probabilistic formulation. The reported experiments compare the full ProBA pipeline against baselines but do not isolate the adaptive weighting, the 3D Gaussian landmark representation, or the dual-hypothesis regularization. We will add a dedicated ablation study in the revised §4 that reports performance when each of these three elements is disabled in turn, using the same evaluation protocol on the SfM and SLAM benchmarks. This will allow readers to quantify the individual contributions and reduce the possibility that results stem from post-hoc tuning. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external match confidences and standard probabilistic modeling

full rationale

The paper defines its core objective as a unified Negative Log-Likelihood over 3D Gaussian landmarks and a kinematic pose graph, with edge weights derived from Bhattacharyya coefficients on external dense matcher outputs. No equation reduces a fitted parameter to a renamed prediction, no self-citation supplies a load-bearing uniqueness theorem, and the adaptive weighting is presented as an iterative mechanism operating on an independently supplied sparse view graph rather than being defined tautologically by the target accuracy metric. The cold-start claim is supported by the volumetric smoothing property of the NLL, which is independent of the final reported numbers.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claim rests on modeling choices (3D Gaussians for landmarks, kinematic pose graph, adaptive edge weighting) that are introduced without independent evidence or prior derivation in the abstract; no explicit free parameters are named but the pruning and regularization steps implicitly introduce tunable mechanisms.

axioms (2)

domain assumption: The non-convex optimization landscape is smoothed sufficiently by the volumetric Gaussian formulation to allow reliable convergence from cold start.
Invoked when claiming expansion of the basin of attraction.
domain assumption: Erroneous topological links can be identified and pruned by iterative adaptive edge weighting without disconnecting the global graph.
Central to maintaining consistency in the sparse view graph.

invented entities (2)

3D Gaussian landmarks (no independent evidence)
purpose: Represent spatial uncertainty of points instead of fixed 3D coordinates
New representation introduced to enable the unified NLL objective.
kinematic pose graph (no independent evidence)
purpose: Replace fragile point tracks with flexible pose connections
Introduced to handle noise from dense matchers.

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages

[1] Agarwal, S., Snavely, N., Seitz, S.M., Szeliski, R.: Bundle adjustment in the large. In: European Conference on Computer Vision (ECCV). pp. 29–42. Springer (2010)
[2] Agarwal, S., Snavely, N., Simon, I., Seitz, S.M., Szeliski, R.: Building rome in a day. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). pp. 72–79 (2009)
[3] Arrigoni, F., Rossi, B., Fusiello, A.: Spectral synchronization of multiple views in SE(3). SIAM Journal on Imaging Sciences 9(4), 1963–1990 (2016)
[4] Barron, J.T., Mildenhall, B., Verbin, D., Srinivasan, P.P., Hedman, P.: Mip-NeRF 360: Unbounded anti-aliased neural radiance fields. In: CVPR (2022)
[5] Besl, P., McKay, N.D.: A method for registration of 3-d shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence 14(2), 239–256 (1992). https://doi.org/10.1109/34.121791
[6] Chatterjee, A., Govindu, V.M.: Efficient and robust large-scale rotation averaging. In: ICCV. pp. 521–528 (2013)
[7] Davison, A.J., Reid, I.D., Molton, N.D., Stasse, O.: MonoSLAM: Real-time single camera SLAM. IEEE TPAMI 29(6), 1052–1067 (2007). https://doi.org/10.1109/TPAMI.2007.1049
[8] Duisterhof, B., Žust, L., Weinzaepfel, P., Leroy, V., Cabon, Y., Revaud, J.: MASt3R-SfM: A fully-integrated solution for unconstrained structure-from-motion. In: International Conference on 3D Vision (3DV) (2025)
[9] Edstedt, J., Athanasiadis, I., Wadenbäck, M., Felsberg, M.: Dkm: Dense kernelized feature matching for geometry estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 17765–17775 (2023)
[10] Edstedt, J., Sun, Q., Bökman, G., Wadenbäck, M., Felsberg, M.: RoMa: Robust dense feature matching. In: CVPR. pp. 19790–19800 (2024)
[11] Fu, Y., Liu, S., Kulkarni, A., Kautz, J., Efros, A.A., Wang, X.: COLMAP-Free 3d gaussian splatting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024)
[12] Han, H., Yang, H.: Building rome with convex optimization. arXiv preprint arXiv:2502.04640 (2025)
[13] Harris, M., Sengupta, S., Owens, J.D.: Parallel prefix sum (scan) with cuda. In: GPU gems 3, vol. 3, pp. 851–876. Addison-Wesley Professional (2007)
[14] Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press (2003)
[15] He, X., Sun, J., Wang, Y., Peng, S., Huang, Q., Bao, H., Zhou, X.: Detector-Free Structure from Motion. In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 21594–21603 (2024)
[16] Heinly, J., Schonberger, J.L., Chapman, E., Frahm, J.M.: Reconstructing the world∗ in six days. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 3287–3295 (2015)
[17] Hong, J.H., Zach, C., Fitzgibbon, A., Cipolla, R.: Projective bundle adjustment from arbitrary initialization using the variable projection method. In: ECCV. vol. 9905, pp. 477–493 (2016). https://doi.org/10.1007/978-3-319-46448-0_29
[18] Iglesias, J.P., Nilsson, A., Olsson, C.: expOSE: Accurate initialization-free projective factorization using exponential regularization. In: CVPR. pp. 8959–8968 (2023). https://doi.org/10.1109/CVPR52729.2023.00865
[19] Jensen, R., Dahl, A., Vogiatzis, G., Tola, E., Aanaes, H.: Large scale multi-view stereopsis evaluation. In: CVPR (2014)
[20] Jin, Y., Mishkin, D., Mishchuk, A., Matas, J., Fua, P., Yi, K.M., Trulls, E.: Image matching across wide baselines: From paper to practice. International Journal of Computer Vision 129(10), 2671–2704 (2021)
[21] Kaess, M., Johannsson, H., Roberts, R., Ila, V., Leonard, J., Dellaert, F.: iSAM2: Incremental smoothing and mapping with fluid relinearization and incremental variable reordering. In: IEEE Int. Conf. Robot. Autom. pp. 3281–3288 (2011). https://doi.org/10.1109/ICRA.2011.5979641
[22] Kerbl, B., Kopanas, G., Leimkühler, T., Drettakis, G.: 3D gaussian splatting for real-time radiance field rendering. ACM TOG (2023)
[23] Klein, G., Murray, D.: Parallel tracking and mapping for small AR workspaces. In: IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR). pp. 225–234 (2007)
[24] Leroy, V., Cabon, Y., Revaud, J.: Grounding image matching in 3D with MASt3R. In: European Conference on Computer Vision (ECCV) (2024)
[25] Lourakis, M.I., Argyros, A.A.: Sba: A software package for generic sparse bundle adjustment. ACM Transactions on Mathematical Software (TOMS) 36(1), 1–30 (2009)
[26] Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM 65(1), 99–106 (2021)
[27] Mur-Artal, R., Montiel, J.M.M., Tardos, J.D.: ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Transactions on Robotics 31(5), 1147–1163 (2015)
[28] Nistér, D.: An efficient solution to the five-point relative pose problem. IEEE TPAMI 26(6), 756–770 (2004)
[29] Özyeşil, O., Voroninski, V., Basri, R., Singer, A.: A survey of structure from motion. Acta Numerica 26, 305–364 (2017)
[30] Schönberger, J.L., Frahm, J.M.: Structure-from-motion revisited. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 4104–4113 (2016)
[31] Schöps, T., Schönberger, J.L., Galliani, S., Sattler, T., Schindler, K., Pollefeys, M., Geiger, A.: A multi-view stereo benchmark with high-resolution images and multi-camera videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 3239–3248 (2017)
[32] Snavely, N., Seitz, S.M., Szeliski, R.: Photo tourism: Exploring photo collections in 3D. In: ACM SIGGRAPH 2006 Papers. pp. 835–846 (2006)
[33] Strasdat, H., Montiel, J.M.M., Davison, A.J.: Scale drift-aware large scale monocular slam. In: Robotics: Science and Systems (RSS) (2010)
[34] Sun, J., Shen, Z., Wang, Y., Bao, H., Zhou, X.: LoFTR: Detector-free local feature matching with transformers. In: CVPR (2021)
[35] Sünderhauf, N., Protzel, P.: Switchable constraints for robust pose graph slam. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp. 1879–1884. IEEE (2012)
[36] Triggs, B., McLauchlan, P.F., Hartley, R.I., Fitzgibbon, A.W.: Bundle adjustment — a modern synthesis. In: International Workshop on Vision Algorithms. pp. 298–
[37] Truong, P., Rakotosaona, M.J., Manhardt, F., Tombari, F.: SPARF: Neural radiance fields from sparse and noisy poses. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 4190–4200 (2023)
[38] Wang, J., Chen, M., Karaev, N., Vedaldi, A., Rupprecht, C., Novotny, D.: Vggt: Visual geometry grounded transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2025)
[39] Wang, J., Karaev, N., Rupprecht, C., Novotny, D.: Vggsfm: Visual geometry group structure-from-motion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024)
[40] Wang, S., Leroy, V., Cabon, Y., Chidlovskii, B., Revaud, J.: DUSt3R: Geometric 3D vision made easy. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 20697–20709 (2024)
[41] Weber, S., Hong, J.H., Cremers, D.: Power variable projection for initialization-free large-scale bundle adjustment. In: ECCV (2024)
[42] Wilson, K., Wehrwein, S.: Visualizing spectral bundle adjustment uncertainty. In: Int. Conf. 3D Vis. pp. 663–671 (2020). https://doi.org/10.1109/3DV50981.2020.00076
[43] Yang, H., Shi, J., Carlone, L.: TEASER: Fast and certifiable point cloud registration. IEEE Transactions on Robotics 37(2), 314–333 (2020). https://doi.org/10.1109/TRO.2020.3033695
[44] Zach, C., Hong, J.H.: pOSE: Pseudo object space error for initialization-free bundle adjustment. In: CVPR. pp. 1876–1885 (2018). https://doi.org/10.1109/CVPR.2018.00201

[1] [1]

In: European Conference on Computer Vision (ECCV)

Agarwal, S., Snavely, N., Seitz, S.M., Szeliski, R.: Bundle adjustment in the large. In: European Conference on Computer Vision (ECCV). pp. 29–42. Springer (2010)

work page 2010

[2] [2]

In: Proceedings of the IEEE International Conference on Computer Vision (ICCV)

Agarwal, S., Snavely, N., Simon, I., Seitz, S.M., Szeliski, R.: Building rome in a day. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). pp. 72–79 (2009)

work page 2009

[3] [3]

SIAM Journal on Imaging Sciences9(4), 1963–1990 (2016)

Arrigoni, F., Rossi, B., Fusiello, A.: Spectral synchronization of multiple views in SE(3). SIAM Journal on Imaging Sciences9(4), 1963–1990 (2016)

work page 1963

[4] [4]

In: CVPR (2022)

Barron, J.T., Mildenhall, B., Verbin, D., Srinivasan, P.P., Hedman, P.: Mip-NeRF 360: Unbounded anti-aliased neural radiance fields. In: CVPR (2022)

work page 2022

[5] [5]

Besl and Neil D

Besl, P., McKay, N.D.: A method for registration of 3-d shapes. IEEE Transactions onPatternAnalysisandMachineIntelligence14(2),239–256(1992).https://doi. org/10.1109/34.121791

work page doi:10.1109/34.121791 1992

[6] [6]

In: ICCV

Chatterjee, A., Govindu, V.M.: Efficient and robust large-scale rotation averaging. In: ICCV. pp. 521–528 (2013)

work page 2013

[7] Davison, A.J., Reid, I.D., Molton, N.D., Stasse, O.: MonoSLAM: Real-time single camera SLAM. IEEE TPAMI 29(6), 1052–1067 (2007). https://doi.org/10.1109/TPAMI.2007.1049

[8] Duisterhof, B., Žust, L., Weinzaepfel, P., Leroy, V., Cabon, Y., Revaud, J.: MASt3R-SfM: A fully-integrated solution for unconstrained structure-from-motion. In: International Conference on 3D Vision (3DV) (2025)

[9] Edstedt, J., Athanasiadis, I., Wadenbäck, M., Felsberg, M.: DKM: Dense kernelized feature matching for geometry estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 17765–17775 (2023)

[10] Edstedt, J., Sun, Q., Bökman, G., Wadenbäck, M., Felsberg, M.: RoMa: Robust dense feature matching. In: CVPR. pp. 19790–19800 (2024)

[11] Fu, Y., Liu, S., Kulkarni, A., Kautz, J., Efros, A.A., Wang, X.: COLMAP-free 3D Gaussian splatting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024)

[12] Han, H., Yang, H.: Building Rome with convex optimization. arXiv preprint arXiv:2502.04640 (2025)

[13] Harris, M., Sengupta, S., Owens, J.D.: Parallel prefix sum (scan) with CUDA. In: GPU Gems 3, vol. 3, pp. 851–876. Addison-Wesley Professional (2007)

[14] Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press (2003)

[15] He, X., Sun, J., Wang, Y., Peng, S., Huang, Q., Bao, H., Zhou, X.: Detector-free structure from motion. In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 21594–21603 (2024)

[16] Heinly, J., Schönberger, J.L., Chapman, E., Frahm, J.M.: Reconstructing the world∗ in six days. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 3287–3295 (2015)

[17] Hong, J.H., Zach, C., Fitzgibbon, A., Cipolla, R.: Projective bundle adjustment from arbitrary initialization using the variable projection method. In: ECCV. vol. 9905, pp. 477–493 (2016). https://doi.org/10.1007/978-3-319-46448-0_29

[18] Iglesias, J.P., Nilsson, A., Olsson, C.: expOSE: Accurate initialization-free projective factorization using exponential regularization. In: CVPR. pp. 8959–8968 (2023). https://doi.org/10.1109/CVPR52729.2023.00865

[19] Jensen, R., Dahl, A., Vogiatzis, G., Tola, E., Aanaes, H.: Large scale multi-view stereopsis evaluation. In: CVPR (2014)

[20] Jin, Y., Mishkin, D., Mishchuk, A., Matas, J., Fua, P., Yi, K.M., Trulls, E.: Image matching across wide baselines: From paper to practice. International Journal of Computer Vision 129(10), 2671–2704 (2021)

[21] Kaess, M., Johannsson, H., Roberts, R., Ila, V., Leonard, J., Dellaert, F.: iSAM2: Incremental smoothing and mapping with fluid relinearization and incremental variable reordering. In: IEEE Int. Conf. Robot. Autom. pp. 3281–3288 (2011). https://doi.org/10.1109/ICRA.2011.5979641

[22] Kerbl, B., Kopanas, G., Leimkühler, T., Drettakis, G.: 3D Gaussian splatting for real-time radiance field rendering. ACM TOG (2023)

[23] Klein, G., Murray, D.: Parallel tracking and mapping for small AR workspaces. In: IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR). pp. 225–234 (2007)

[24] Leroy, V., Cabon, Y., Revaud, J.: Grounding image matching in 3D with MASt3R. In: European Conference on Computer Vision (ECCV) (2024)

[25] Lourakis, M.I., Argyros, A.A.: SBA: A software package for generic sparse bundle adjustment. ACM Transactions on Mathematical Software (TOMS) 36(1), 1–30 (2009)

[26] Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM 65(1), 99–106 (2021)

[27] Mur-Artal, R., Montiel, J.M.M., Tardos, J.D.: ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Transactions on Robotics 31(5), 1147–1163 (2015)

[28] Nistér, D.: An efficient solution to the five-point relative pose problem. IEEE TPAMI 26(6), 756–770 (2004)

[29] Özyeşil, O., Voroninski, V., Basri, R., Singer, A.: A survey of structure from motion. Acta Numerica 26, 305–364 (2017)

[30] Schönberger, J.L., Frahm, J.M.: Structure-from-motion revisited. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 4104–4113 (2016)

[31] Schöps, T., Schönberger, J.L., Galliani, S., Sattler, T., Schindler, K., Pollefeys, M., Geiger, A.: A multi-view stereo benchmark with high-resolution images and multi-camera videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 3239–3248 (2017)

[32] Snavely, N., Seitz, S.M., Szeliski, R.: Photo tourism: Exploring photo collections in 3D. In: ACM SIGGRAPH 2006 Papers. pp. 835–846 (2006)

[33] Strasdat, H., Montiel, J.M.M., Davison, A.J.: Scale drift-aware large scale monocular SLAM. In: Robotics: Science and Systems (RSS) (2010)

[34] Sun, J., Shen, Z., Wang, Y., Bao, H., Zhou, X.: LoFTR: Detector-free local feature matching with transformers. In: CVPR (2021)

[35] Sünderhauf, N., Protzel, P.: Switchable constraints for robust pose graph SLAM. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp. 1879–1884. IEEE (2012)

[36] Triggs, B., McLauchlan, P.F., Hartley, R.I., Fitzgibbon, A.W.: Bundle adjustment — a modern synthesis. In: International Workshop on Vision Algorithms. pp. 298–

[37] Truong, P., Rakotosaona, M.J., Manhardt, F., Tombari, F.: SPARF: Neural radiance fields from sparse and noisy poses. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 4190–4200 (2023)

[38] Wang, J., Chen, M., Karaev, N., Vedaldi, A., Rupprecht, C., Novotny, D.: VGGT: Visual geometry grounded transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2025)

[39] Wang, J., Karaev, N., Rupprecht, C., Novotny, D.: VGGSfM: Visual geometry grounded deep structure from motion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024)

[40] Wang, S., Leroy, V., Cabon, Y., Chidlovskii, B., Revaud, J.: DUSt3R: Geometric 3D vision made easy. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 20697–20709 (2024)

[41] Weber, S., Hong, J.H., Cremers, D.: Power variable projection for initialization-free large-scale bundle adjustment. In: ECCV (2024)

[42] Wilson, K., Wehrwein, S.: Visualizing spectral bundle adjustment uncertainty. In: Int. Conf. 3D Vis. pp. 663–671 (2020). https://doi.org/10.1109/3DV50981.2020.00076

[43] Yang, H., Shi, J., Carlone, L.: TEASER: Fast and certifiable point cloud registration. IEEE Transactions on Robotics 37(2), 314–333 (2020). https://doi.org/10.1109/TRO.2020.3033695

[44] Zach, C., Hong, J.H.: pOSE: Pseudo object space error for initialization-free bundle adjustment. In: CVPR. pp. 1876–1885 (2018). https://doi.org/10.1109/CVPR.2018.00201

A Appendix

A.1 Minimum Spanning Tree and View Graph Construction

Algorithm 1 Minimum Spanning Tree Construction

Require: Set of uncalibrated i...