A linear method for camera pair self-calibration and multi-view reconstruction with geometrically verified correspondences

Nikos Melanitis; Petros Maragos

arxiv: 1906.12075 · v1 · pith:LQJNH6RCnew · submitted 2019-06-28 · 💻 cs.CV


Pith reviewed 2026-05-25 13:57 UTC · model grok-4.3

classification 💻 cs.CV

keywords self-calibration · metric reconstruction · camera pairs · cheirality condition · correspondence verification · multi-view reconstruction · focal length estimation · projective to metric


The pith

A linear method recovers metric reconstructions of camera pairs from projective ones when focal lengths are unknown and different.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a linear solver that converts a known projective reconstruction of two cameras into a metric one, assuming all internal parameters except the focal lengths are known. It produces two candidate camera placements that sit in mirror positions relative to each other and applies the cheirality condition to select the configuration in which all reconstructed 3D points lie in front of both cameras. The same framework adds a geometric test that discards erroneous point matches by checking whether the points preserve their order along the image x and y axes. These pair-wise metric reconstructions are then combined across an image collection using rotation averaging and focal-length averaging to produce a full scene model. The approach achieves rotation accuracy comparable to the established Kruppa-equation method.
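The cheirality selection described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each camera is given as a rotation `R` and translation `t` with the convention `x_cam = R X + t`, that the viewing direction is the camera's +z axis, and that each candidate solution comes with its own triangulated points. All function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]
Camera = Tuple[Mat3, Vec3]  # (R, t), assumed convention: x_cam = R X + t


def depth_in_camera(R: Mat3, t: Vec3, X: Vec3) -> float:
    """z coordinate of world point X expressed in the camera frame."""
    return sum(R[2][k] * X[k] for k in range(3)) + t[2]


def satisfies_cheirality(cameras: Sequence[Camera], points: Sequence[Vec3]) -> bool:
    """True when every point has positive depth in every camera."""
    return all(depth_in_camera(R, t, X) > 0.0 for R, t in cameras for X in points)


def pick_solution(candidates: List[Tuple[Sequence[Camera], Sequence[Vec3]]]) -> Optional[int]:
    """Index of the first candidate (cameras, points) passing the test, else None."""
    for index, (cameras, points) in enumerate(candidates):
        if satisfies_cheirality(cameras, points):
            return index
    return None
```

With noisy data a strict "all points" rule may reject both candidates, so a practical variant might instead pick the candidate with the larger fraction of points in front of both cameras; that relaxation is an assumption of this note, not something the page attributes to the paper.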

Core claim

Under the stated assumptions a linear method yields two mirror-image metric camera configurations from the projective pair; the viewing directions of the two solutions stand in explicit geometric relation; the correct configuration is identified by the cheirality condition; and a separate ordering test on image-axis coordinates removes inconsistent correspondences before reconstruction proceeds.

What carries the argument

Linear recovery of the two mirror-position camera configurations from the projective pair, followed by cheirality disambiguation and axis-order verification of correspondences.

If this is right

The two recovered solutions are always mirror images of each other with fixed relations between their viewing directions.
Point correspondences that violate order along both image axes can be rejected before metric reconstruction.
Multiple verified pair reconstructions can be fused into a consistent multi-view model by rotation averaging and focal-length averaging.
The linear solver reaches rotation accuracy comparable to the nonlinear pipeline of Kruppa self-calibration followed by the five-point algorithm (median ΔR of 3.49° versus 3.77°).
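The ordering check in the second point can be sketched as follows. The text reproduced with Figure 4 mentions finding the largest x-consistent and y-consistent subsets by solving a longest-increasing-subsequence (LIS) problem; how the paper combines the two subsets is cut off in this page, so the intersection used in `verify_matches` is an assumption of this sketch, and all names are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]  # (point in image 1, point in image 2)


def longest_increasing_subsequence(values: Sequence[float]) -> List[int]:
    """Indices of one longest strictly increasing subsequence, in O(n log n)."""
    tails: List[float] = []    # tails[k]: smallest tail of an increasing run of length k + 1
    tails_idx: List[int] = []  # position in `values` of that tail
    prev: List[int] = [-1] * len(values)
    for i, v in enumerate(values):
        k = bisect_left(tails, v)
        if k == len(tails):
            tails.append(v)
            tails_idx.append(i)
        else:
            tails[k] = v
            tails_idx[k] = i
        prev[i] = tails_idx[k - 1] if k > 0 else -1
    chain: List[int] = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:  # walk back through predecessors
        chain.append(i)
        i = prev[i]
    return chain[::-1]


def consistent_subset(matches: Sequence[Match], axis: int) -> Set[int]:
    """Largest set of matches whose order along `axis` agrees in both images."""
    order = sorted(range(len(matches)), key=lambda i: matches[i][0][axis])
    second = [matches[i][1][axis] for i in order]
    return {order[j] for j in longest_increasing_subsequence(second)}


def verify_matches(matches: Sequence[Match]) -> List[int]:
    """Assumed combination rule: keep matches that are consistent along x and y."""
    return sorted(consistent_subset(matches, 0) & consistent_subset(matches, 1))
```

Sorting by the image-1 coordinate and taking the LIS of the image-2 coordinate keeps the largest subset whose relative order is preserved along that axis, which is the sense of "consistent" used here.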

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Pair-wise results could serve as input to global bundle adjustment that refines all focal lengths simultaneously.
The axis-order test might improve match filtering in any two-view geometry task that assumes locally consistent image ordering.
The mirror-solution property could simplify initialization of larger camera graphs by propagating orientation constraints.

Load-bearing premise

All camera internal parameters except the two focal lengths are known in advance and a projective reconstruction of the pair is already given.
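To make the premise concrete, a common pinhole parameterisation leaves a single unknown per camera. The form below is an illustrative sketch rather than the paper's stated model: it assumes zero skew and unit aspect ratio, and the principal-point symbols $(u_i, v_i)$ are introduced here for illustration only.

$$K_i = \begin{pmatrix} f_i & 0 & u_i \\ 0 & f_i & v_i \\ 0 & 0 & 1 \end{pmatrix}, \qquad i \in \{1, 2\}$$

When $(u_i, v_i)$ are known and image coordinates are shifted so that the principal point is the origin, $K_i$ reduces to $\mathrm{diag}(f_i, f_i, 1)$, so $f_1$ and $f_2$ are the only calibration unknowns.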

What would settle it

On a collection of image pairs supplied with ground-truth metric poses, measure whether the linear solver selects the cheirality-correct solution and produces median rotation error no larger than that of the Kruppa baseline.
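The error measure in this test can be sketched in a few lines. The text reproduced with Figure 9 describes ΔR as the angle between the estimated and the true relative rotation; the sketch below computes that angle from the trace of the residual rotation and takes a median over pairs. The helper names, including `rot_z`, are hypothetical and exist only for the example.

```python
import math
from statistics import median
from typing import Iterable, List, Sequence, Tuple

Mat3 = Sequence[Sequence[float]]


def rotation_error_deg(R_est: Mat3, R_true: Mat3) -> float:
    """Angle, in degrees, of the residual rotation R_est^T R_true."""
    # trace(R_est^T R_true) equals the sum of element-wise products.
    trace = sum(R_est[i][j] * R_true[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp against round-off
    return math.degrees(math.acos(cos_angle))


def median_rotation_error(pairs: Iterable[Tuple[Mat3, Mat3]]) -> float:
    """Median of the per-pair rotation errors, in degrees."""
    return median(rotation_error_deg(est, true) for est, true in pairs)


def rot_z(deg: float) -> List[List[float]]:
    """Rotation about the z axis, used here only to build test inputs."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
```

Running `median_rotation_error` once on the linear solver's estimates and once on the baseline's estimates, over the same pairs, gives the two numbers the test above compares.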

Figures

Figures reproduced from arXiv: 1906.12075 by Nikos Melanitis, Petros Maragos.

**Figure 1.** We visualise the geometric relations of Theorems 1, 2. In the graph, we display the centers of projection ($C_1$, $C_2$) and viewing directions ($v_1$, $v_2$) for each of the two solutions of Eq. 17. In pink, we display solution 1 and in red solution 2. The common plane of $C_1$, $C_2$, $v_1$, $v_2$ is highlighted. Solutions: $x_0^1 = (b_1\; b_2\; b_3\; p_3\; f_1^2 p_1\; f_1^2 p_2)^T$, $x_0^2 = (b_1\; b_2\; b_3\; p'_3\; f_1^2 p'_1\; f_1^2 p'_2)^T$, $b_3 =$ …

**Figure 2.** An illustration of the intermediate results leading to Theorem 1.

**Figure 3.** An illustration of the intermediate results leading to Theorem 2. … denote the reconstructions derived from (17). Then, cameras $P^1_{m2}$, $P^2_{m2}$ are in mirror positions with respect to the origin (position of $P^1_{m1}$, $P^2_{m1}$). The centers of projection $C^1_{m2}$, $C^2_{m2}$ satisfy $C^1_{m2} = -C^2_{m2}$. Theorem 2. Let camera 1 be positioned on the origin of the world coordinate system, with a viewing direction aligned to z axi…

**Figure 4.** Effect of camera coordinate system rotation on the depiction of parallel lines. The figures were created by projecting a 3D structure at a constant depth ($z = z_{const}$) to the image plane, thus effects of different scene depth are not shown. We propose an approximate solution by the following method: 1. We find the largest Consistent-x subset, $S_x$, solving an LIS problem. 2. We find the largest Consistent-y sub…

**Figure 5.** Our pipeline to reconstruct a 3D scene from an unordered set of 2D photographs. In the first row, we display a flow diagram of the algorithm stages. Novel parts are displayed in green. The second row outlines the core methods we use. We highlight methods we introduced in preceding sections. The two final rows contain a visualization of data type and most important references per stage.

**Figure 6.** Distribution of $f$ estimates. The estimates were collected from all available camera pairs. We observe that for some cameras (right), focal length can be readily determined. The opposite holds for other cameras (left). Data from castle-P30 [42].

**Figure 7.** A demonstration of confidence count computation. Each disk represents an $f_i$ estimate. We compute the cc for the central value, depicted here with a bold border. This cc depends on the number of estimates within a β = 10% range, depicted with a red square in the picture. … that in each image pair that contains image $i$, we receive a number of correct and a number of erroneous estimates for $f_i$, and that erroneou…
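The confidence count in Figure 7 can be approximated by a short sketch. The caption states only that the count depends on the number of estimates within a β = 10% range of the central value; treating that range as a relative half-width around the central value, and picking the estimate with the highest count, are assumptions of this sketch. Function names are hypothetical.

```python
from typing import Sequence


def confidence_count(center: float, estimates: Sequence[float], beta: float = 0.10) -> int:
    """Number of estimates within a relative distance `beta` of `center` (assumed rule)."""
    return sum(1 for f in estimates if abs(f - center) <= beta * center)


def most_supported_estimate(estimates: Sequence[float], beta: float = 0.10) -> float:
    """Estimate with the highest confidence count; ties go to the earliest one."""
    return max(estimates, key=lambda c: confidence_count(c, estimates, beta))
```

For example, with estimates `[1000, 1020, 980, 1500, 400, 1010]` the four values near 1000 support each other, while the outliers 1500 and 400 are supported only by themselves.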

**Figure 8.** A demonstration of joint confidence count computation. Each disk represents an $f_1$ estimate. We compute the Jcc for the central value, depicted here with a bold border. This time, each disk is divided in half, to demonstrate that each $f_1$ estimate is paired with one $f_2$ estimate. We use different disk colors for different cameras. Jcc depends on the sum of elements within range (inside the red square). In cont…

**Figure 9.** Demonstration of tentative correspondences verification. Left: Initial correspondences. Right: Verified correspondences using the geometric verification method we introduced. To quantify the reconstruction error, we used the angle ($\Delta R$) between the relative rotation estimate and the true relative rotation, $R_{ij}$, between two paired views $i$, $j$. The initialization of BA is important, to improve convergence …

**Figure 10.** Joint confidence count distribution, which displays a clear peak near the correct $f$ value. Data from castle-P30 [42]. 6.4. Multi-view reconstruction in unordered image sets. In Tab. 6, we provide quantitative performance measures for multi-view reconstructions that were acquired applying the proposed pipeline (…

**Figure 11.** 3D reconstruction results, as dense point clouds. Top row: datasets [42], Herz-Jesu-P25 (left) and Fountain-P11. Middle row: dataset castle-P19 [42] (left) and photo set of the Monument of Lysicrates, Athens (right). Bottom row: photo sets of locations in Athens, Parthenon (left) and Karyatids (right). The scenes in Athens were photographed by the authors with a simple compact camera.

read the original abstract

We examine 3D reconstruction of architectural scenes in unordered sets of uncalibrated images. We introduce a linear method to self-calibrate and find the metric reconstruction of a camera pair. We assume unknown and different focal lengths but otherwise known internal camera parameters and a known projective reconstruction of the camera pair. We recover two possible camera configurations in space and use the Cheirality condition, that all 3D scene points are in front of both cameras, to disambiguate the solution. We show in two Theorems, first that the two solutions are in mirror positions and then the relations between their viewing directions. Our new method performs on par (median rotation error $\Delta R = 3.49^{\circ}$) with the standard approach of Kruppa equations ($\Delta R = 3.77^{\circ}$) for self-calibration and 5-Point algorithm for calibrated metric reconstruction of a camera pair. We reject erroneous image correspondences by introducing a method to examine whether point correspondences appear in the same order along $x, y$ image axes in image pairs. We evaluate this method by its precision and recall and show that it improves the robustness of point matches in architectural and general scenes. Finally, we integrate all the introduced methods to a 3D reconstruction pipeline. We utilize the numerous camera pair metric recontructions using rotation-averaging algorithms and a novel method to average focal length estimates.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper gives a linear upgrade from projective to metric for camera pairs with unknown focal lengths, plus a simple axis-ordering check for correspondences, and reports rotation error close to Kruppa.


The paper's main offering is a linear solver that takes a known projective reconstruction of a camera pair and produces the metric version when only the two focal lengths are unknown. The authors derive two mirror-image solutions, state theorems on their viewing-direction relations, and use cheirality to pick the correct one. They also add a correspondence filter that checks whether matched points keep the same order along the image x and y axes, and they fold the pair results into a pipeline using rotation averaging and focal-length averaging for unordered architectural image sets.

The reported median rotation error of 3.49° is slightly better than the Kruppa baseline at 3.77°, which is a reasonable outcome for a method that stays linear. The ordering test is evaluated on precision and recall and appears to clean matches in both architectural and general scenes. These pieces are presented as new relative to the cited baselines. The work stays within its stated assumptions and does not claim broader applicability.

The central limitation is that performance figures are given without dataset details, trial counts, or error bars, so the numbers are hard to assess from the abstract alone. The method also requires a good projective reconstruction up front and known values for all other intrinsics, which narrows where it can be dropped in.

This is aimed at people building practical reconstruction pipelines for scenes like buildings, where pair-wise linear steps and simple match verification could save time. A reader working on multi-view geometry tools might find the linear formulation or the ordering test worth trying. It deserves peer review because the claims are scoped, the comparisons are direct, and the ideas are concrete enough for referees to check the derivations and run their own tests.

Referee Report

2 major / 1 minor

Summary. The paper claims a linear method to upgrade a known projective reconstruction of a camera pair (unknown distinct focal lengths, other intrinsics known) to metric form by recovering two mirror configurations disambiguated via cheirality, supported by two theorems on their spatial relations and viewing directions. It also presents a geometric verification technique for rejecting erroneous correspondences based on consistent ordering along image axes, evaluates this via precision/recall, and integrates the pair-wise metric reconstructions into a multi-view pipeline using rotation averaging and focal-length averaging. Performance is reported as comparable to Kruppa equations plus 5-point algorithm (median rotation error 3.49° vs 3.77°).

Significance. If the linear derivation and theorems hold under the stated assumptions, the approach supplies an efficient, direct alternative to nonlinear self-calibration for camera pairs and could streamline robust reconstruction pipelines for architectural scenes; the ordering-based correspondence filter offers a simple geometric prior that may complement RANSAC-style methods.

major comments (2)

[Abstract] The central performance claim (median ΔR = 3.49° on par with 3.77°) is presented without any dataset description, number of image pairs, or error-bar statistics, rendering the numerical comparison unverifiable and load-bearing for the assertion that the method performs on par with Kruppa + 5-point.
[Abstract] The two theorems on mirror configurations and viewing-direction relations are announced but no derivation steps, intermediate equations, or proof outlines are supplied, so the soundness of the linear upgrade cannot be assessed from the given text.

minor comments (1)

[Abstract] The final pipeline sentence contains a typo ('recontructions').

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed review and constructive comments. We address each major comment below.


Referee: [Abstract] The central performance claim (median ΔR = 3.49° on par with 3.77°) is presented without any dataset description, number of image pairs, or error-bar statistics, rendering the numerical comparison unverifiable and load-bearing for the assertion that the method performs on par with Kruppa + 5-point.

Authors: We agree that the abstract would benefit from additional context to support the performance claim. The full experimental details, including the number of image pairs and dataset description from architectural scenes, along with error statistics, are provided in the Experiments section. We will revise the abstract to include a brief reference to the evaluation, such as the number of pairs tested, to make the comparison more verifiable.

Revision: yes

Referee: [Abstract] The two theorems on mirror configurations and viewing-direction relations are announced but no derivation steps, intermediate equations, or proof outlines are supplied, so the soundness of the linear upgrade cannot be assessed from the given text.

Authors: The abstract serves as a high-level summary of the contributions. The two theorems are fully stated and proven in Section 3, with all derivation steps, intermediate equations, and proof outlines included there. To improve the abstract, we will add a short note referring readers to the theorems' proofs in the main text. However, due to length constraints, detailed derivations cannot be included in the abstract itself.

Revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained algebraic upgrade


The paper's central derivation is a linear algebraic procedure that takes as explicit input a known projective reconstruction of a camera pair (plus known intrinsics except distinct focal lengths) and produces metric camera poses via direct solution of the stated equations, followed by Cheirality disambiguation and two theorems on mirror configurations that follow from those equations. No step reduces a fitted parameter to a prediction by construction, renames an empirical pattern, or relies on a load-bearing self-citation whose content is itself unverified. The reported median rotation error is an external empirical comparison against Kruppa/5-point baselines and does not participate in the derivation chain. The correspondence verification step is likewise an independent geometric test. The pipeline integration (rotation averaging, focal-length averaging) is downstream application, not part of the core self-calibration claim. The derivation is therefore self-contained against the stated assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption of known projective reconstruction and known intrinsics except focal length; no free parameters, invented entities, or additional axioms are stated in the abstract.

axioms (1)

domain assumption: Internal camera parameters except focal length are known; projective reconstruction of the camera pair is given.
Stated explicitly in the abstract as the starting point for the linear method.

pith-pipeline@v0.9.0 · 5783 in / 1148 out tokens · 32869 ms · 2026-05-25T13:57:32.685345+00:00 · methodology


Reference graph

Works this paper leans on

59 extracted references · 59 canonical work pages

[1] R. I. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, 2nd Edition, Cambridge University Press, ISBN: 0521540518, 2004.

[2] M. A. Fischler, R. C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM 24 (6) (1981) 381–395.

[3] R. I. Hartley, Kruppa's equations derived from the fundamental matrix, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (2) (1997) 133–135.

[4] D. Nistér, An efficient solution to the five-point relative pose problem, Pattern Analysis and Machine Intelligence, IEEE Transactions on 26 (6) (2004) 756–770.

[5] M. Pollefeys, R. Koch, L. Van Gool, Self-calibration and metric reconstruction inspite of varying and unknown intrinsic camera parameters, International Journal of Computer Vision 32 (1) (1999) 7–25.

[6] Y. Seo, A. Heyden, R. Cipolla, A linear iterative method for auto-calibration using the dac equation, in: Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on, Vol. 1, IEEE, 2001, pp. I–880.

[7] C. Olsson, O. Enqvist, Stable structure from motion for unordered image collections, in: Image Analysis, Springer, 2011, pp. 524–535.

[8] N. Snavely, et al., Bundler: Structure from motion (sfm) for unordered image collections, Available online: phototour.cs.washington.edu/bundler/ (accessed on 12 July 2013).

[9] D. Martinec, T. Pajdla, Robust rotation and translation estimation in multiview reconstruction, in: Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on, IEEE, 2007, pp. 1–8.

[10] H. Stewénius, D. Nistér, F. Kahl, F. Schaffalitzky, A minimal solution for relative pose with unknown focal length, in: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), Vol. 2, IEEE, 2005, pp. 789–794.
[11] S. N. Sinha, D. Steedly, R. Szeliski, A multi-stage linear approach to structure from motion, in: ECCV 2010 Workshop on Reconstruction and Modeling of Large-Scale 3D Virtual Environments, Vol. 3002, 2010, pp. 3003–3005.

[12] R. Hartley, K. Aftab, J. Trumpf, L1 rotation averaging using the weiszfeld algorithm, in: Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, IEEE, 2011, pp. 3041–3048.

[13] V. M. Govindu, Combining two-view constraints for motion estimation, in: Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on, Vol. 2, IEEE, 2001, pp. II–218.

[14] V. M. Govindu, Lie-algebraic averaging for globally consistent motion estimation, in: Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on, Vol. 1, IEEE, 2004, pp. I–684.

[15] R. Hartley, J. Trumpf, Y. Dai, H. Li, Rotation averaging, International Journal of Computer Vision (2013) 1–39.

[16] F. Kahl, R. Hartley, Multiple-view geometry under the $L_\infty$-norm, Pattern Analysis and Machine Intelligence, IEEE Transactions on 30 (9) (2008) 1603–1617.

[17] O. Chum, J. Matas, Homography estimation from correspondences of local elliptical features, in: Pattern Recognition (ICPR), 2012 21st International Conference on, IEEE, 2012, pp. 3236–3239.

[18] N. Snavely, S. M. Seitz, R. Szeliski, Photo tourism: exploring photo collections in 3d, in: ACM transactions on graphics (TOG), Vol. 25, ACM, 2006, pp. 835–846.

[19] M. L. Fredman, On computing the length of longest increasing subsequences, Discrete Mathematics 11 (1) (1975) 29–35.

[20] O. Faugeras, Q.-T. Luong, T. Papadopoulo, The geometry of multiple images: the laws that govern the formation of multiple images of a scene and some of their applications, MIT press, 2004.

[21] D. G. Lowe, Distinctive image features from scale-invariant keypoints, International journal of computer vision 60 (2) (2004) 91–110.

[22] P. Turcot, D. G. Lowe, Better matching with fewer features: The selection of useful features in large database recognition problems, in: Computer Vision Workshops (ICCV Workshops), 2009 IEEE 12th International Conference on, IEEE, 2009, pp. 2109–2116.
[23] J. Philbin, O. Chum, M. Isard, J. Sivic, A. Zisserman, Object retrieval with large vocabularies and fast spatial matching, in: Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on, IEEE, 2007, pp. 1–8.

[24] M. Perd'och, O. Chum, J. Matas, Efficient representation of local geometry for large scale object retrieval, in: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, IEEE, 2009, pp. 9–16.

[25] O. Chum, J. Matas, S. Obdrzalek, Enhancing ransac by generalized model optimization, in: Proc. of the ACCV, Vol. 2, 2004, pp. 812–817.

[26] O. Chum, J. Matas, Geometric hashing with local affine frames, in: Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, Vol. 1, IEEE, 2006, pp. 879–884.

[27] J. Sivic, A. Zisserman, Video google: A text retrieval approach to object matching in videos, in: Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on, IEEE, 2003, pp. 1470–1477.

[28] Z. Wu, Q. Ke, M. Isard, J. Sun, Bundling features for large scale partial-duplicate web image search, in: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, IEEE, 2009, pp. 25–32.

[29] M. I. A. Lourakis, A. A. Argyros, Sba: a software package for generic sparse bundle adjustment, ACM Transactions on Mathematical Software (2009) 1–30.

[30] C. Wu, S. Agarwal, B. Curless, S. M. Seitz, Multicore bundle adjustment, in: Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, IEEE, 2011, pp. 3057–3064.

[31] A. Dalalyan, R. Keriven, l1-penalized robust estimation for a class of inverse problems arising in multiview geometry, in: Advances in Neural Information Processing Systems, 2009, pp. 441–449.

[32] R. Hartley, F. Kahl, Optimal algorithms in multiview geometry, Computer Vision–ACCV 2007 (2007) 13–34.

[33] C. Olsson, F. Kahl, Generalized convexity in multiple view geometry, Journal of Mathematical Imaging and Vision 38 (1) (2010) 35–51.

[34] C. Zach, M. Pollefeys, Practical methods for convex multi-view reconstruction, in: Computer Vision–ECCV 2010, Springer, 2010, pp. 354–367.

[35] O. Enqvist, F. Kahl, C. Olsson, Non-sequential structure from motion, in: Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on, IEEE, 2011, pp. 264–271.
[36] Y. Furukawa, J. Ponce, Accurate, dense, and robust multiview stereopsis, Pattern Analysis and Machine Intelligence, IEEE Transactions on 32 (8) (2010) 1362–1376.

[37] M. Kazhdan, M. Bolitho, H. Hoppe, Poisson surface reconstruction, in: Proceedings of the fourth Eurographics symposium on Geometry processing, 2006.

[38] D. Aldous, P. Diaconis, Longest increasing subsequences: from patience sorting to the baik-deift-johansson theorem, Bulletin of the American Mathematical Society 36 (4) (1999) 413–432.

[39] J. W. Hunt, T. G. Szymanski, A fast algorithm for computing longest common subsequences, Communications of the ACM 20 (5) (1977) 350–353.

[40] P. van Emde Boas, Preserving order in a forest in less than logarithmic time, in: FOCS, 1975, pp. 75–84.

[41] A. C. Gallagher, Using vanishing points to correct camera rotation in images, in: Computer and Robot Vision, 2005. Proceedings. The 2nd Canadian Conference on, IEEE, 2005, pp. 460–467.

[42] C. Strecha, W. Von Hansen, L. Van Gool, P. Fua, U. Thoennessen, On benchmarking camera calibration and multi-view stereo for high resolution imagery, in: Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, IEEE, 2008, pp. 1–8.

[43] C. Strecha, W. Von Hansen, L. Van Gool, P. Fua, U. Thoennessen, On benchmarking camera calibration and multi-view stereo for high resolution imagery, in: Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, IEEE, 2008, pp. 1–8.

[44] Q.-T. Luong, O. D. Faugeras, The fundamental matrix: Theory, algorithms, and stability analysis, International Journal of Computer Vision 17 (1) (1996) 43–75.

[45] H. S. Wong, T.-J. Chin, J. Yu, D. Suter, Dynamic and hierarchical multi-structure geometric model fitting, in: International Conference on Computer Vision (ICCV), 2011.

[46] M. Chandraker, S. Agarwal, F. Kahl, D. Nistér, D. Kriegman, Autocalibration via rank-constrained estimation of the absolute quadric, in: Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on, IEEE, 2007, pp. 1–8.

[47] R. Gherardi, A. Fusiello, Practical autocalibration, in: Computer Vision–ECCV 2010, Springer, 2010, pp. 790–801.

[48] Z. Kukelova, M. Bujnak, T. Pajdla, Polynomial eigenvalue solutions to the 5-pt and 6-pt relative pose problems, in: BMVC, 2008, pp. 1–10.

Appendix A. Gaussian elimination in self-calibration and metric reconstruction equations. To simplify the expressions, we introduce the notation $P^{\gamma}_i$: row vector produced from the $i$-th row of $[a]_\times F$ and permute $x_0$ elements w...
[49] The form of homography (4)

[50] Eq. (2): $P^i_{m2} = P_{P2} H_i$, where $H_i$ denotes the homography obtained by substituting the $i$-th solution of Eq. (17); the lemma is readily deduced. Lemma 2. Let $P^1_{m2}$, $P^2_{m2}$ be as in Lemma 1. We have $K_2^1 R_2^1 - K_2^2 R_2^2 = a n^T$ (25), where $n$ is an appropriate vector. Proof. As in the proof of Lemma 1, by observing that $H_i$ for different $i$ values differ only in $v \triangleq -p^T K$...

[51] $P_1$ is a full-rank matrix (rank 3) for every projection matrix. The exception, referred to in the literature as "camera at infinity", is out of our scope. Remember we are handling a metric reconstruction.

[52] $P_2$ can be expressed in terms of $P_1$, $n$, $a$, thus permitting the application of Eq. (33) to determine $\det P_2$. Now applying the previous points, we have $\det P_2 = \det P_1 \iff 1 - n^T {R_2^1}^T K_2^{-1} a = 1 \iff n^T {R_2^1}^T K_2^{-1} a = 0 \iff -n^T {R_2^1}^T K_2^{-1} K_2 R_2^1 C_1 = 0$ (from (28): $a = -K_2 R_2^1 C_1$) $\iff n^T C_1 = 0$, as $R R^T = I$ for rotation matrices $R$. Lemma 8. For the reconstruct...

[53] The projective reconstruction $P_{P2}$ is in the canonical representation form $[\,[a]_\times F \mid a\,]$ (36) with $F^T a = 0$.

[54] $[a]_\times$ denotes the anti-symmetric matrix defined to compute outer product with vector $a$: $[a]_\times v = a \times v$.

[55] $e$ denotes the right null vector of $F$, $Fe = 0$ (37). Lemma 9. Let $P_{P2} = [\,A \mid a\,] = [\,[a]_\times F \mid a\,]$ denote the projection matrix for camera 2 in the projective reconstruction and $p$, $p'$ the solutions for $\pi_\infty$ acquired from Eq. (17), with $p^T = (p_1\; p_2\; p_3)$ and $p'^T = (p'_1\; p'_2\; p'_3)$. Then $p - p' = \psi e_f$ (38), where $e_f = (e_1/f_1^2,\; e_2/f_1^2,\; e_3)^T$. Proof. From Eq. (5) an...

[56] The definition of $n$ in Eq. (25)

[57] The relation between $P_{P2}$, $P_{M2}$, $H$ (Eqs. (2), (4)) and the notation for the $P$ matrix of Lemma 9, and have $P_1 = A K_1 - a p^T K_1$, $P_2 = A K_1 - a p'^T K_1 \iff P_2 = P_1 + a (p - p')^T K_1 \triangleq P_1 - a n^T$. Now, we can rewrite Eq. (47) as $(p - p')^T K_1 C_1 = 0$ (48). We next have $P^1_{M2} \begin{pmatrix} C_1 \\ 1 \end{pmatrix} = 0 \iff P_{P2} H_1 \begin{pmatrix} C_1 \\ 1 \end{pmatrix} = 0 \iff P_{P2} \begin{pmatrix} K_1 C_1 \\ -p^T K_1 C_1 + 1 \end{pmatrix} = 0$. From the assumption that $P_{P2}$ is in the c...

[58] We denote $v^1_{m2}$, $v^2_{m2}$ the vectors that point along the viewing directions of cameras $P^1_{m2}$, $P^2_{m2}$ respectively.

[59] For $P^1_{m2}$ we assume $\det P^1 > 0$, $C \triangleq C^1_{m2}$. Proof of Theorem 2. From Results 6, 7, Lemma 1, Theorem 1 we have for $P^1_{m2}$: $K_2 R^1 C = -a \iff \begin{pmatrix} f_2 R_1^T \\ f_2 R_2^T \\ R_3^T \end{pmatrix} C = -a \iff R_3^T C = -a_3$ (51). We have $\det P^1 = \det K_2 R^1 > 0$ and so $v^1_{m2} = R_3$. Consequently, from Eq. (51), we have ${v^1_{m2}}^T C = \|v^1_{m2}\| \|C\| \cos\angle(C, v^1_{m2}) = -a_3$ (52). In Eq. (52), $\|R_3^T\| = 1$, since ...

[1] [1]

R. I. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, 2nd Edition, Cambridge University Press, ISBN: 0521540518, 2004

work page 2004

[2] [2]

M. A. Fischler, R. C. Bolles, Random sample consensus: a paradigm for model ﬁtting with applications to image analysis and automated cartography, Communications of the ACM 24 (6) (1981) 381–395

work page 1981

[3] [3]

R. I. Hartley, Kruppa’s equations derived from the fundamental matrix, IEEE Transactions on pattern365 analysis and machine intelligence 19 (2) (1997) 133–135

work page 1997

[4] D. Nistér, An efficient solution to the five-point relative pose problem, Pattern Analysis and Machine Intelligence, IEEE Transactions on 26 (6) (2004) 756–770.

[5] M. Pollefeys, R. Koch, L. Van Gool, Self-calibration and metric reconstruction in spite of varying and unknown intrinsic camera parameters, International Journal of Computer Vision 32 (1) (1999) 7–25.

[6] Y. Seo, A. Heyden, R. Cipolla, A linear iterative method for auto-calibration using the DAC equation, in: Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on, Vol. 1, IEEE, 2001, pp. I–880.

[7] C. Olsson, O. Enqvist, Stable structure from motion for unordered image collections, in: Image Analysis, Springer, 2011, pp. 524–535.

[8] N. Snavely, et al., Bundler: Structure from motion (sfm) for unordered image collections, Available online: phototour.cs.washington.edu/bundler/ (accessed on 12 July 2013).

[9] D. Martinec, T. Pajdla, Robust rotation and translation estimation in multiview reconstruction, in: Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on, IEEE, 2007, pp. 1–8.

[10] H. Stewénius, D. Nistér, F. Kahl, F. Schaffalitzky, A minimal solution for relative pose with unknown focal length, in: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 2, IEEE, 2005, pp. 789–794.

[11] S. N. Sinha, D. Steedly, R. Szeliski, A multi-stage linear approach to structure from motion, in: ECCV 2010 Workshop on Reconstruction and Modeling of Large-Scale 3D Virtual Environments, Vol. 3002, 2010, pp. 3003–3005.

[12] R. Hartley, K. Aftab, J. Trumpf, L1 rotation averaging using the Weiszfeld algorithm, in: Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, IEEE, 2011, pp. 3041–3048.

[13] V. M. Govindu, Combining two-view constraints for motion estimation, in: Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on, Vol. 2, IEEE, 2001, pp. II–218.

[14] V. M. Govindu, Lie-algebraic averaging for globally consistent motion estimation, in: Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on, Vol. 1, IEEE, 2004, pp. I–684.

[15] R. Hartley, J. Trumpf, Y. Dai, H. Li, Rotation averaging, International Journal of Computer Vision (2013) 1–39.

[16] F. Kahl, R. Hartley, Multiple-view geometry under the L∞-norm, Pattern Analysis and Machine Intelligence, IEEE Transactions on 30 (9) (2008) 1603–1617.

[17] O. Chum, J. Matas, Homography estimation from correspondences of local elliptical features, in: Pattern Recognition (ICPR), 2012 21st International Conference on, IEEE, 2012, pp. 3236–3239.

[18] N. Snavely, S. M. Seitz, R. Szeliski, Photo tourism: exploring photo collections in 3d, in: ACM Transactions on Graphics (TOG), Vol. 25, ACM, 2006, pp. 835–846.

[19] M. L. Fredman, On computing the length of longest increasing subsequences, Discrete Mathematics 11 (1) (1975) 29–35.

[20] O. Faugeras, Q.-T. Luong, T. Papadopoulo, The geometry of multiple images: the laws that govern the formation of multiple images of a scene and some of their applications, MIT Press, 2004.

[21] D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision 60 (2) (2004) 91–110.

[22] P. Turcot, D. G. Lowe, Better matching with fewer features: The selection of useful features in large database recognition problems, in: Computer Vision Workshops (ICCV Workshops), 2009 IEEE 12th International Conference on, IEEE, 2009, pp. 2109–2116.

[23] J. Philbin, O. Chum, M. Isard, J. Sivic, A. Zisserman, Object retrieval with large vocabularies and fast spatial matching, in: Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on, IEEE, 2007, pp. 1–8.

[24] M. Perďoch, O. Chum, J. Matas, Efficient representation of local geometry for large scale object retrieval, in: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, IEEE, 2009, pp. 9–16.

[25] O. Chum, J. Matas, S. Obdrzalek, Enhancing ransac by generalized model optimization, in: Proc. of the ACCV, Vol. 2, 2004, pp. 812–817.

[26] O. Chum, J. Matas, Geometric hashing with local affine frames, in: Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, Vol. 1, IEEE, 2006, pp. 879–884.

[27] J. Sivic, A. Zisserman, Video google: A text retrieval approach to object matching in videos, in: Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on, IEEE, 2003, pp. 1470–1477.

[28] Z. Wu, Q. Ke, M. Isard, J. Sun, Bundling features for large scale partial-duplicate web image search, in: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, IEEE, 2009, pp. 25–32.

[29] M. I. A. Lourakis, A. A. Argyros, Sba: a software package for generic sparse bundle adjustment, ACM Transactions on Mathematical Software (2009) 1–30.

[30] C. Wu, S. Agarwal, B. Curless, S. M. Seitz, Multicore bundle adjustment, in: Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, IEEE, 2011, pp. 3057–3064.

[31] A. Dalalyan, R. Keriven, l1-penalized robust estimation for a class of inverse problems arising in multiview geometry, in: Advances in Neural Information Processing Systems, 2009, pp. 441–449.

[32] R. Hartley, F. Kahl, Optimal algorithms in multiview geometry, Computer Vision–ACCV 2007 (2007) 13–34.

[33] C. Olsson, F. Kahl, Generalized convexity in multiple view geometry, Journal of Mathematical Imaging and Vision 38 (1) (2010) 35–51.

[34] C. Zach, M. Pollefeys, Practical methods for convex multi-view reconstruction, in: Computer Vision–ECCV 2010, Springer, 2010, pp. 354–367.

[35] O. Enqvist, F. Kahl, C. Olsson, Non-sequential structure from motion, in: Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on, IEEE, 2011, pp. 264–271.

[36] Y. Furukawa, J. Ponce, Accurate, dense, and robust multiview stereopsis, Pattern Analysis and Machine Intelligence, IEEE Transactions on 32 (8) (2010) 1362–1376.

[37] M. Kazhdan, M. Bolitho, H. Hoppe, Poisson surface reconstruction, in: Proceedings of the fourth Eurographics symposium on Geometry processing, 2006.

[38] D. Aldous, P. Diaconis, Longest increasing subsequences: from patience sorting to the Baik-Deift-Johansson theorem, Bulletin of the American Mathematical Society 36 (4) (1999) 413–432.

[39] J. W. Hunt, T. G. Szymanski, A fast algorithm for computing longest common subsequences, Communications of the ACM 20 (5) (1977) 350–353.

[40] P. van Emde Boas, Preserving order in a forest in less than logarithmic time, in: FOCS, 1975, pp. 75–84.

[41] A. C. Gallagher, Using vanishing points to correct camera rotation in images, in: Computer and Robot Vision, 2005. Proceedings. The 2nd Canadian Conference on, IEEE, 2005, pp. 460–467.

[42] C. Strecha, W. Von Hansen, L. Van Gool, P. Fua, U. Thoennessen, On benchmarking camera calibration and multi-view stereo for high resolution imagery, in: Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, IEEE, 2008, pp. 1–8.

[43] C. Strecha, W. Von Hansen, L. Van Gool, P. Fua, U. Thoennessen, On benchmarking camera calibration and multi-view stereo for high resolution imagery, in: Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, IEEE, 2008, pp. 1–8.

[44] Q.-T. Luong, O. D. Faugeras, The fundamental matrix: Theory, algorithms, and stability analysis, International Journal of Computer Vision 17 (1) (1996) 43–75.

[45] H. S. Wong, T.-J. Chin, J. Yu, D. Suter, Dynamic and hierarchical multi-structure geometric model fitting, in: International Conference on Computer Vision (ICCV), 2011.

[46] M. Chandraker, S. Agarwal, F. Kahl, D. Nistér, D. Kriegman, Autocalibration via rank-constrained estimation of the absolute quadric, in: Computer Vision and Pattern Recognition, 2007. CVPR’07. IEEE Conference on, IEEE, 2007, pp. 1–8.

[47] R. Gherardi, A. Fusiello, Practical autocalibration, in: Computer Vision–ECCV 2010, Springer, 2010, pp. 790–801.

[48] Z. Kukelova, M. Bujnak, T. Pajdla, Polynomial eigenvalue solutions to the 5-pt and 6-pt relative pose problems, in: BMVC, 2008, pp. 1–10.

Appendix A Gaussian elimination in self-calibration and metric reconstruction equations

To simplify the expressions, we introduce the notation $P^{\gamma}_i$: the row vector produced from the i-th row of $[a]_\times F$ and permute x0 elements w...

[49] The form of homography (4)

[50] Eq. (2): $P^i_{m2} = P_{P2} H^i$, where $H^i$ denotes the homography obtained by substituting the i-th solution of Eq. (17); the lemma is readily deduced.

Lemma 2. Let $P^1_{m2}$, $P^2_{m2}$ be as in Lemma 1. We have

$K^1_2 R^1_2 - K^2_2 R^2_2 = a n^T$, (25)

where $n$ is an appropriate vector.

Proof. As in the proof of Lemma 1, by observing that the $H^i$ for different values of $i$ differ only in v ≜ −pTK...

[51] $P^1$ is a full-rank matrix (rank 3) for every projection matrix. The exception, referred to in the literature as “camera at infinity”, is out of our scope. Remember that we are handling a metric reconstruction.

[52] $P^2$ can be expressed in terms of $P^1$, $n$, $a$, thus permitting the application of Eq. (33) to determine $\det P^2$. Now applying the previous points, we have

$\det P^2 = \det P^1$
$\iff 1 - n^T (R^1_2)^T K_2^{-1} a = 1$
$\iff n^T (R^1_2)^T K_2^{-1} a = 0$
$\iff -n^T (R^1_2)^T K_2^{-1} K_2 R^1_2 C^1 = 0$, from (28): $a = -K_2 R^1_2 C^1$
$\iff n^T C^1 = 0$, as $R R^T = I$ for rotation matrices $R$.

Lemma 8. For the reconstruct...
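The determinant equivalence in this entry (the part preceding the Lemma 8 statement) can be checked numerically. The following is a minimal pure-Python sketch under assumed values, not the authors' code: K2 = diag(f2, f2, 1), a rotation about the z axis standing in for the rotation of solution 1, an arbitrary centre C1, and two test vectors n, one orthogonal to C1 and one not. All numbers and helper names (`det3`, `rank_one_update`, and so on) are hypothetical.

```python
import math


def matmul(x, y):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along row 0."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def rank_one_update(P, a, n):
    """Return P - a n^T."""
    return [[P[i][j] - a[i] * n[j] for j in range(3)] for i in range(3)]


# Assumed illustrative values (not taken from the paper).
f2 = 800.0
K2 = [[f2, 0.0, 0.0], [0.0, f2, 0.0], [0.0, 0.0, 1.0]]
theta = 0.3
R = [[math.cos(theta), -math.sin(theta), 0.0],
     [math.sin(theta), math.cos(theta), 0.0],
     [0.0, 0.0, 1.0]]
C1 = [0.5, -1.0, 2.0]

P1 = matmul(K2, R)                    # left 3x3 block K2 R
a = [-x for x in matvec(P1, C1)]      # a = -K2 R C1, as in Eq. (28)

n_orth = [2.0, 1.0, 0.0]              # n^T C1 = 0
n_other = [1.0, 0.0, 0.0]             # n^T C1 = 0.5

det_P1 = det3(P1)
det_P2_orth = det3(rank_one_update(P1, a, n_orth))
det_P2_other = det3(rank_one_update(P1, a, n_other))
```

With a = −K2 R C1 the update factor 1 − nᵀ Rᵀ K2⁻¹ a equals 1 + nᵀ C1, so the determinants should agree for `n_orth` and differ by a factor of 1.5 for `n_other`.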

[53] The projective reconstruction $P_{P2}$ is in the canonical representation form $[\,[a]_\times F \mid a\,]$ (36), with $F^T a = 0$.
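To make the canonical form concrete, here is a small pure-Python sketch that assembles the 3x4 matrix [[a]x F | a] from a given F and a, and evaluates Fᵀa. The matrix F below is synthetic, built as [t]x M so that it has rank 2 and left null vector t; it is an assumption for illustration only, and the helper names are hypothetical rather than taken from the paper.

```python
def skew(a):
    """3x3 anti-symmetric matrix [a]x with [a]x v = a x v."""
    a1, a2, a3 = a
    return [[0.0, -a3, a2], [a3, 0.0, -a1], [-a2, a1, 0.0]]


def matmul(x, y):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    """Transpose of a 3x3 matrix."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def matvec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def canonical_camera(F, a):
    """Return the 3x4 matrix [[a]x F | a] as a list of rows."""
    A = matmul(skew(a), F)
    return [A[i] + [a[i]] for i in range(3)]


# Synthetic example: F = [t]x M has rank 2 and satisfies F^T t = 0.
t = [1.0, -2.0, 0.5]
M = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 3.0]]
F = matmul(skew(t), M)

a = t
residual = matvec(transpose(F), a)    # should vanish: F^T a = 0
P_P2 = canonical_camera(F, a)
```

In practice a would be obtained as the left null vector of an estimated F; the construction above simply sidesteps that step so the check stays short.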

[54] $[a]_\times$ denotes the anti-symmetric matrix defined to compute the cross product with the vector $a$: $[a]_\times v = a \times v$.
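As a quick check of this definition, the following minimal Python sketch builds [a]x for a 3-vector and confirms that multiplying it with v reproduces a × v. The helper names `skew`, `matvec` and `cross` are illustrative and do not come from the paper.

```python
def skew(a):
    """Return the 3x3 anti-symmetric matrix [a]x as nested lists."""
    a1, a2, a3 = a
    return [
        [0.0, -a3, a2],
        [a3, 0.0, -a1],
        [-a2, a1, 0.0],
    ]


def matvec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def cross(a, v):
    """Cross product a x v, written out component by component."""
    return [
        a[1] * v[2] - a[2] * v[1],
        a[2] * v[0] - a[0] * v[2],
        a[0] * v[1] - a[1] * v[0],
    ]


a = [1.0, 2.0, 3.0]
v = [-4.0, 0.5, 2.0]

lhs = matvec(skew(a), v)      # [a]x v
rhs = cross(a, v)             # a x v
self_product = matvec(skew(a), a)   # a x a, expected to be the zero vector
```

Because a × a = 0, the vector a is always a null vector of [a]x, which is why [a]x has rank 2 for any nonzero a.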
