arxiv: 2604.23704 · v1 · submitted 2026-04-26 · 💻 cs.CV

A Pose-only Geometric Constraint for Multi-Camera Pose Adjustment

Pith reviewed 2026-05-08 06:31 UTC · model grok-4.3

classification 💻 cs.CV
keywords multi-camera, pose adjustment, geometric constraint, bundle adjustment, generalized camera model, visual navigation, 3D reconstruction, computational efficiency

The pith

A pose-only geometric constraint for multi-camera systems allows pose adjustment without optimizing 3D points.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a geometric constraint that lets multi-camera pose estimation skip explicit 3D point modeling. Using the generalized camera model, it expresses each scene point through two base observations and the camera poses alone. The resulting pose adjustment algorithm optimizes only the poses, cutting down on variables in the nonlinear solver. Tests on synthetic and real datasets show it runs faster than standard bundle adjustment while keeping or boosting accuracy. The reduction in computation addresses the feature redundancy problem in multi-camera setups for navigation and reconstruction.
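
As a rough illustration of the variable reduction described above, the following sketch counts optimization parameters under two assumptions that are not taken from the paper: each system pose has 6 degrees of freedom and each scene point has 3, with gauge fixing ignored. The function names and the problem size are hypothetical.

```python
def ba_parameter_count(num_poses: int, num_points: int) -> int:
    """Parameters in joint pose-point bundle adjustment (assumed 6 per pose, 3 per point)."""
    return 6 * num_poses + 3 * num_points


def pose_only_parameter_count(num_poses: int) -> int:
    """Parameters when 3D points are removed from the parameter space."""
    return 6 * num_poses


# Hypothetical problem size, chosen only for illustration.
num_poses, num_points = 100, 20_000
ba_count = ba_parameter_count(num_poses, num_points)
po_count = pose_only_parameter_count(num_poses)
print(f"bundle adjustment: {ba_count} parameters, pose-only: {po_count} parameters")
```

Parameter count is only a rough proxy for cost: bundle adjustment solvers commonly eliminate the point block with a Schur complement at each iteration, so measured runtime is the relevant comparison.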

Core claim

The central discovery is the formulation of a multi-camera pose-only constraint based on the generalized camera model. This constraint implicitly represents a 3D scene point using two base observations and their associated poses, providing a complete pose-only representation of the projection geometry. Consequently, the authors propose a pose adjustment algorithm that removes all 3D points from the optimization parameters, focusing computation on the system poses for greater efficiency.

What carries the argument

The multi-camera pose-only constraint that implicitly represents a 3D scene point using two base observations and their associated poses under the generalized camera model.
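
The idea can be made concrete with a minimal sketch, which is not the paper's formula. It assumes each observation is a ray given by a camera centre and a unit bearing, both expressed in the rig (body) frame, and that a pose (R, t) maps body coordinates to world coordinates. The depth along the first base ray is obtained here by a closest-point solve between the two base rays, a stand-in chosen for illustration; all function names and numbers are hypothetical.

```python
import math

def matvec(R, v): return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]
def transpose(R): return [[R[j][i] for j in range(3)] for i in range(3)]
def add(a, b): return [x + y for x, y in zip(a, b)]
def sub(a, b): return [x - y for x, y in zip(a, b)]
def scale(s, a): return [s * x for x in a]
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a): return math.sqrt(dot(a, a))
def unit(a): return scale(1.0 / norm(a), a)

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def world_ray(pose, cam_center, bearing):
    """Origin and direction of an observation ray in the world frame."""
    R, t = pose
    return add(matvec(R, cam_center), t), matvec(R, bearing)

def point_from_base_rays(pose_a, obs_a, pose_b, obs_b):
    """Scene point as a function of two poses and two base observations only."""
    oa, da = world_ray(pose_a, *obs_a)
    ob, db = world_ray(pose_b, *obs_b)
    w = sub(ob, oa)
    a, b, c = dot(da, da), dot(da, db), dot(db, db)
    d, e = dot(da, w), dot(db, w)
    det = a * c - b * b  # zero when the base rays are parallel
    depth_a = (d * c - b * e) / det
    return add(oa, scale(depth_a, da))

def pose_only_residual(pose_a, obs_a, pose_b, obs_b, pose_k, obs_k):
    """Bearing residual in view k; no 3D point appears among the arguments."""
    X = point_from_base_rays(pose_a, obs_a, pose_b, obs_b)
    Rk, tk = pose_k
    ck, fk = obs_k
    predicted = unit(sub(matvec(transpose(Rk), sub(X, tk)), ck))
    return sub(predicted, unit(fk))

def observe(pose, cam_center, X):
    """Synthetic noise-free observation (camera centre, bearing) of point X."""
    R, t = pose
    return cam_center, unit(sub(matvec(transpose(R), sub(X, t)), cam_center))

# Hypothetical two-camera rig observed at three poses.
X_true = [1.0, 2.0, 8.0]
cam_left, cam_right = [-0.3, 0.0, 0.0], [0.3, 0.0, 0.0]
pose_a = (rot_z(0.0), [0.0, 0.0, 0.0])
pose_b = (rot_z(0.1), [1.0, 0.0, 0.0])
pose_k = (rot_z(-0.2), [0.5, 1.0, 0.5])
obs_a = observe(pose_a, cam_left, X_true)
obs_b = observe(pose_b, cam_right, X_true)
obs_k = observe(pose_k, cam_left, X_true)

exact_norm = norm(pose_only_residual(pose_a, obs_a, pose_b, obs_b, pose_k, obs_k))
shifted_pose_k = (pose_k[0], add(pose_k[1], [0.2, 0.0, 0.0]))
perturbed_norm = norm(pose_only_residual(pose_a, obs_a, pose_b, obs_b, shifted_pose_k, obs_k))
print(exact_norm, perturbed_norm)
```

With noise-free data the residual vanishes at the true poses and grows when a pose is perturbed, which is what lets an optimizer adjust poses without carrying the point as a variable. How the constraint behaves with noisy base observations is the question the referee report raises.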

Load-bearing premise

The derived pose-only constraint captures the full projection geometry from the multi-camera observations without any approximation or loss of information relative to optimizing over explicit 3D points.

What would settle it

If applying the pose-only adjustment to a dataset yields pose estimates with substantially higher reprojection errors than full bundle adjustment on the same inputs, that would indicate the constraint does not fully capture the geometry.

Figures

Figures reproduced from arXiv: 2604.23704 by Banglei Guan, Bin Li, Qifeng Yu, Shunkun Liang, Yang Shang.

Figure 1. Schematic diagram of multi-camera pose-only con…
Figure 2. Schematic diagram of the uncertainty ellipsoid for 3D…
Figure 3. Two types of extrinsic configurations for the multi…
Figure 4. Multi-camera system motion trajectories and 3D…
Figure 6. Accuracy comparison of triangulation methods on the…
Figure 7. Accuracy comparison of four pose optimization algorithms on four synthetic datasets. The charts show the average…
Figure 3. fig. 3. The dataset provides camera intrinsics but not extrin…
Figure 8. Comparison of optimized trajectories for representative scenes Electro, Forest, Terrains, and Playground from the…
Figure 9. Trajectory optimization and scene reconstruction results for Seq.01 (top row) and Seq.07 (bottom row) of the KITTI…
Figure 10. Comparison of trajectory optimization results for sequences Seq.01 (top row) and Seq.07 (bottom row) of the KITTI…
Figure 11. Comparison of trajectory optimization results on KITTI sequences Seq.07 (top row) and Seq.09 (bottom row),…
Original abstract

Multi-camera systems offer rich observation capabilities for visual navigation and 3D scene reconstruction; however, the resulting feature redundancy often compromises computational efficiency. This challenge is particularly pronounced during bundle adjustment, where the non-linear optimization of both system poses and scene points incurs substantial computational overhead. To address this challenge, this paper introduces a pose-only geometric constraint for multi-camera systems and proposes a corresponding pose adjustment algorithm. Specifically, we use the generalized camera model to establish a unified representation of the multi-camera system. Building upon this model, we formulate the multi-camera pose-only constraint, which implicitly represents a 3D scene point using two base observations and their associated poses, thereby achieving a pose-only representation of the projection geometry. Subsequently, we introduce a multi-camera pose adjustment algorithm that eliminates 3D points from the parameter space, thereby achieving efficient and focused pose optimization. Experimental results on both synthetic and real-world datasets demonstrate that the proposed algorithm outperforms baseline bundle adjustment methods in computational efficiency, while maintaining or even improving pose estimation accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces a pose-only geometric constraint for multi-camera systems under the generalized camera model. By implicitly representing each 3D scene point via exactly two base observations and their poses, the method eliminates 3D points from the optimization, yielding a pose-only adjustment algorithm claimed to be more efficient than standard bundle adjustment while preserving or improving accuracy. Experiments on synthetic and real datasets are asserted to support these efficiency and accuracy gains.

Significance. A rigorously equivalent pose-only formulation would enable substantial speedups in multi-camera pose estimation for SLAM and reconstruction without the overhead of joint point-pose optimization. The approach builds on established generalized camera geometry and could be valuable if the constraint derivation avoids information loss.

major comments (2)
  1. [Method (pose-only constraint formulation)] The central derivation of the pose-only constraint (implicitly using two base observations to eliminate the 3D point parameter) must explicitly prove or demonstrate equivalence to the standard multi-view reprojection error after marginalization. The construction risks information loss or approximation when observations exceed two per point or contain noise, as the remaining observations are reduced to pose-only constraints without a shown guarantee of preserving the full geometry.
  2. [Experiments] Experimental claims of outperformance in efficiency and accuracy lack supporting details on baselines, metrics (e.g., RMSE, runtime), error bars, dataset selection criteria, or statistical significance. Without these, the results cannot substantiate the efficiency-accuracy tradeoff asserted in the abstract.
minor comments (1)
  1. Clarify notation for the generalized camera model and base observation selection process to ensure reproducibility.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. The comments highlight important aspects of the derivation and experimental validation that we will address to strengthen the paper. We respond to each major comment below.

Point-by-point responses
  1. Referee: [Method (pose-only constraint formulation)] The central derivation of the pose-only constraint (implicitly using two base observations to eliminate the 3D point parameter) must explicitly prove or demonstrate equivalence to the standard multi-view reprojection error after marginalization. The construction risks information loss or approximation when observations exceed two per point or contain noise, as the remaining observations are reduced to pose-only constraints without a shown guarantee of preserving the full geometry.

    Authors: We appreciate the referee's emphasis on rigorous equivalence. The pose-only constraint is obtained by algebraically solving for the 3D point from the two base observations under the generalized camera model and substituting the expression into the reprojection equations of all other views. This substitution yields an exact algebraic constraint on the poses alone; it is not an approximation. When a point has more than two observations, each additional view contributes an independent pose-only constraint derived in the same manner, so the overall objective remains equivalent to the original multi-view least-squares problem after exact elimination of the point parameters. Noise is handled naturally because the derivation operates directly on the reprojection error terms. To make this explicit, we will add a new subsection in the revised manuscript that (i) derives the constraint from the standard reprojection error, (ii) proves equivalence to marginalization by showing that the critical points of the pose-only objective coincide with those of the joint pose-point optimization, and (iii) discusses the case of redundant observations and noisy measurements. This addition will remove any ambiguity regarding information preservation.
    revision: partial

  2. Referee: [Experiments] Experimental claims of outperformance in efficiency and accuracy lack supporting details on baselines, metrics (e.g., RMSE, runtime), error bars, dataset selection criteria, or statistical significance. Without these, the results cannot substantiate the efficiency-accuracy tradeoff asserted in the abstract.

    Authors: We agree that the experimental section requires more comprehensive reporting to substantiate the claims. In the revised manuscript we will expand the experiments to include: explicit baselines (standard bundle adjustment implemented with Ceres and a generalized-camera BA variant); quantitative metrics (pose RMSE, absolute trajectory error, and wall-clock runtime measured on identical hardware); error bars and standard deviations obtained from repeated trials with varied random seeds; detailed dataset specifications (synthetic data generation parameters such as number of cameras, points, and noise levels, plus the exact real-world sequences employed); and statistical significance tests (e.g., paired t-tests) comparing the proposed method against the baselines. These additions will provide clear, reproducible evidence for the reported efficiency gains while confirming that accuracy is preserved or improved.
    revision: yes
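
The reporting promised in this response (error bars over repeated trials and a paired t-test) could be computed along the following lines. This is a minimal sketch using only the Python standard library; the function names are hypothetical and the error values are made-up placeholders, not results from the paper.

```python
import math
import statistics


def summarize(errors):
    """Mean and sample standard deviation over repeated trials (for error bars)."""
    return statistics.mean(errors), statistics.stdev(errors)


def paired_t_statistic(errors_a, errors_b):
    """Paired t statistic and degrees of freedom for per-trial differences a - b."""
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two equally long samples with at least 2 trials")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    sd = statistics.stdev(diffs)
    if sd == 0.0:
        raise ValueError("all paired differences are identical")
    n = len(diffs)
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1


# Placeholder pose RMSE values per random seed (made up for illustration only).
ba_rmse = [0.10, 0.12, 0.11, 0.13, 0.09]
pose_only_rmse = [0.09, 0.11, 0.11, 0.12, 0.08]

ba_mean, ba_std = summarize(ba_rmse)
po_mean, po_std = summarize(pose_only_rmse)
t_stat, dof = paired_t_statistic(pose_only_rmse, ba_rmse)
print(f"BA: {ba_mean:.3f} +/- {ba_std:.3f}, pose-only: {po_mean:.3f} +/- {po_std:.3f}")
print(f"paired t = {t_stat:.2f} with {dof} degrees of freedom")
```

Pairing by seed matters because both methods see the same inputs in each trial. Turning the statistic into a p-value needs the t-distribution, for which a library routine such as `scipy.stats.ttest_rel` can be used.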

Circularity Check

0 steps flagged

No circularity: geometric elimination is self-contained

full rationale

The derivation uses the generalized camera model to algebraically eliminate 3D points via two base observations, producing a pose-only constraint. This is a direct first-principles construction from projection equations rather than a fit, self-definition, or imported ansatz. No load-bearing self-citations, uniqueness theorems, or renamed empirical patterns appear in the abstract or method outline. The efficiency/accuracy claims rest on separate experiments, not on the constraint definition itself. The chain is independent of its outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim depends on the generalized camera model as a unifying representation and the validity of the implicit 3D-point encoding via two observations; no free parameters or invented entities with external evidence are described.

axioms (1)
  • domain assumption: the generalized camera model provides a unified representation for the multi-camera system
    Invoked to establish the base for the pose-only constraint formulation.
invented entities (1)
  • pose-only geometric constraint (no independent evidence)
    purpose: Implicitly represents a 3D scene point using two base observations and associated poses
    New formulation introduced to eliminate 3D points from the optimization parameter space.

pith-pipeline@v0.9.0 · 5480 in / 1221 out tokens · 29632 ms · 2026-05-08T06:31:31.988381+00:00 · methodology
