arxiv: 1906.11927 · v1 · pith:OLRSMSPVnew · submitted 2019-06-27 · 💻 cs.CV

Homography from two orientation- and scale-covariant features

Pith reviewed 2026-05-25 14:42 UTC · model grok-4.3

classification 💻 cs.CV
keywords homography estimation · covariant features · minimal solver · two-point algorithm · scale constraints · rotation constraints · SIFT · RANSAC

The pith

Two orientation- and scale-covariant features suffice to estimate a homography by deriving new constraints from their scales and rotations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives two new general constraints on the scales and rotations supplied by detectors such as SIFT. These constraints are specialized to the homography case, producing additional equations that reduce the minimal number of point correspondences from four to two. A solver is then built on the resulting equations. The approach also shows how normalizing the input points stabilizes the recovered scale and rotation values. A reader would care because robust estimators such as RANSAC would require far fewer random samples when only two pairs are needed instead of four.

Core claim

By providing a geometric interpretation of the angles and scales returned by orientation- and scale-covariant feature detectors, two new constraints on these quantities are obtained. When restricted to a homography, the constraints yield a minimal solver that recovers the homography from exactly two correspondences. Normalization of the correspondences is shown to keep the recovered rotation and scale parameters numerically stable.

What carries the argument

Two new constraints on scales and rotations derived from the geometric effect of a homography on covariant features.

If this is right

  • Homography estimation requires only two feature pairs instead of four.
  • RANSAC needs substantially fewer iterations when applied to the two-point solver.
  • The scale and rotation information already supplied by detectors such as SIFT is used at no extra cost.
  • The same scale-rotation constraints can be inserted into any other geometric estimation task that involves a homography.
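
The iteration savings can be made concrete with the standard RANSAC stopping criterion, k = log(1 - p) / log(1 - w^m), where w is the inlier ratio, m the sample size, and p the required confidence. The sketch below is illustrative only: the function name `ransac_iterations` and the chosen inlier ratios are assumptions, and the 0.95 confidence mirrors the value quoted in the Figure 6 caption.

```python
import math


def ransac_iterations(inlier_ratio: float, sample_size: int, confidence: float = 0.95) -> int:
    """Iterations needed so that, with probability `confidence`,
    at least one sample consisting only of inliers is drawn."""
    p_all_inliers = inlier_ratio ** sample_size
    if p_all_inliers <= 0.0:
        raise ValueError("inlier_ratio must be positive")
    if p_all_inliers >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_all_inliers))


# Compare a two-correspondence solver (m = 2) with the four-point solver (m = 4).
for w in (0.2, 0.5, 0.8):
    print(f"inlier ratio {w}: m=2 -> {ransac_iterations(w, 2)}, m=4 -> {ransac_iterations(w, 4)}")
```

At an inlier ratio of 0.5, for instance, this criterion asks for 11 iterations with two-correspondence samples against 47 with four-point samples; the gap widens as the inlier ratio drops.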

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be extended to other planar transformations whose action on local scale and orientation can be written in closed form.
  • In practice, the two-point solver could be combined with the four-point solver inside a hybrid RANSAC loop to handle both minimal and over-determined cases.
  • Numerical stability gains from normalization suggest that similar preprocessing steps may improve other minimal solvers that recover rotation or scale parameters.
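
How normalization acts on scale and rotation can be sketched numerically. The example below assumes Hartley-style normalizing transforms (centroid moved to the origin, mean distance scaled to sqrt(2)) and a feature affinity that is a pure rotation plus uniform scale; both are assumptions made for illustration, and the helper names are hypothetical. It only shows that a translate-and-uniformly-scale transform in each image leaves the rotation untouched and multiplies the scale by the ratio of the two normalization scales.

```python
import numpy as np


def normalizing_transform(points: np.ndarray) -> np.ndarray:
    """3x3 transform: translate the centroid to the origin, then scale
    uniformly so the mean distance from the origin is sqrt(2) (assumed convention)."""
    centroid = points.mean(axis=0)
    mean_dist = np.linalg.norm(points - centroid, axis=1).mean()
    s = np.sqrt(2.0) / mean_dist
    return np.array([[s, 0.0, -s * centroid[0]],
                     [0.0, s, -s * centroid[1]],
                     [0.0, 0.0, 1.0]])


def normalize_affinity(A: np.ndarray, T1: np.ndarray, T2: np.ndarray) -> np.ndarray:
    """Top-left 2x2 block of T2 * diag(A, 1) * T1^{-1}."""
    A_h = np.eye(3)
    A_h[:2, :2] = A
    return (T2 @ A_h @ np.linalg.inv(T1))[:2, :2]


rng = np.random.default_rng(0)
pts1 = rng.uniform(0.0, 640.0, size=(10, 2))  # hypothetical keypoints, image 1
pts2 = rng.uniform(0.0, 480.0, size=(10, 2))  # hypothetical keypoints, image 2
T1, T2 = normalizing_transform(pts1), normalizing_transform(pts2)

alpha, q = 0.3, 1.7  # relative rotation (rad) and scale of one feature pair
A = q * np.array([[np.cos(alpha), -np.sin(alpha)],
                  [np.sin(alpha), np.cos(alpha)]])
A_hat = normalize_affinity(A, T1, T2)

print("rotation after normalization:", np.arctan2(A_hat[1, 0], A_hat[0, 0]))
print("scale after normalization:   ", np.sqrt(np.linalg.det(A_hat)))
print("q * s2 / s1:                 ", q * T2[0, 0] / T1[0, 0])
```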

Load-bearing premise

The angles and scales reported by the detectors correspond directly to the geometric quantities transformed by the homography.
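
To make "the geometric quantities transformed by the homography" concrete, one can look at the first-order (affine) behaviour of a homography around a point, that is, the Jacobian of the projective mapping. The sketch below is an assumption-laden illustration, not the paper's formulation: it takes the local scale change to be sqrt(|det J|) and the mapped orientation to be the angle of J applied to a unit direction. The matrix `H` and the point `xy` are made-up values.

```python
import numpy as np


def apply_homography(H: np.ndarray, xy) -> np.ndarray:
    """Map a 2D point through H and dehomogenize."""
    p = H @ np.array([xy[0], xy[1], 1.0])
    return p[:2] / p[2]


def local_affine_frame(H: np.ndarray, xy) -> np.ndarray:
    """2x2 Jacobian of the mapping x -> H x at the point xy."""
    s = H[2] @ np.array([xy[0], xy[1], 1.0])  # projective depth
    u, v = apply_homography(H, xy)
    return np.array([[H[0, 0] - H[2, 0] * u, H[0, 1] - H[2, 1] * u],
                     [H[1, 0] - H[2, 0] * v, H[1, 1] - H[2, 1] * v]]) / s


def local_scale(J: np.ndarray) -> float:
    """Area-based scale change (assumed convention): sqrt(|det J|)."""
    return float(np.sqrt(abs(np.linalg.det(J))))


def mapped_angle(J: np.ndarray, angle: float) -> float:
    """Orientation of a unit direction after applying the local affine map."""
    d = J @ np.array([np.cos(angle), np.sin(angle)])
    return float(np.arctan2(d[1], d[0]))


H = np.array([[1.10, 0.05, 10.0],
              [-0.02, 0.95, -5.0],
              [1e-4, 2e-4, 1.0]])
xy = (120.0, 80.0)
A = local_affine_frame(H, xy)
print("local scale change:", local_scale(A))
print("orientation 0.0 rad maps to:", mapped_angle(A, 0.0))
```

Whether detector-reported scales and angles match these quantities closely enough is exactly the premise stated above; the code only defines what is being compared.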

What would settle it

Run the two-point solver on synthetic homographies with known ground-truth scales and rotations; if the recovered homography matrix deviates substantially from the ground truth while the four-point DLT succeeds, the derived constraints do not hold.
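
A minimal harness for such a test might look like the following. It covers only the ground-truth generation, the four-point DLT baseline, and a scale-invariant error measure; the two-point solver is not reproduced because its equations are not given on this page. The matrix `H_true`, the point layout, and all function names are illustrative assumptions.

```python
import numpy as np


def project(H: np.ndarray, pts: np.ndarray) -> np.ndarray:
    """Apply homography H to an (N, 2) array of points."""
    ph = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return ph[:, :2] / ph[:, 2:3]


def dlt_homography(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Direct linear transform: null vector of the 2N x 9 design matrix."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    return vt[-1].reshape(3, 3)


def homography_error(H_est: np.ndarray, H_true: np.ndarray) -> float:
    """Frobenius distance after removing the arbitrary scale and sign."""
    a = H_est / np.linalg.norm(H_est)
    b = H_true / np.linalg.norm(H_true)
    return float(min(np.linalg.norm(a - b), np.linalg.norm(a + b)))


H_true = np.array([[1.10, 0.05, 0.3],
                   [-0.02, 0.95, -0.2],
                   [0.10, 0.20, 1.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
dst = project(H_true, src)

H_4pt = dlt_homography(src, dst)
print("4PT error on noise-free data:", homography_error(H_4pt, H_true))
```

A candidate two-point solver would be scored with the same `homography_error` on the same scenes, with scales and rotations taken from the ground-truth homography, so that any gap relative to the baseline can be attributed to the constraints rather than to the data.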

Figures

Figures reproduced from arXiv: 1906.11927 by Daniel Barath, Zuzana Kukelova.

Figure 1: Visualization of the orientation- and scale …
Figure 2: Inliers of the estimated homographies (by 2SIFT) drawn to example image pairs. The numbers of iterations of …
Figure 3: Stability study. The frequencies (100 000 runs; vertical axis) of log10 errors (horizontal) in the estimated homographies by the proposed (red), 4PT (green) and 3ORI (blue) methods.

$$\hat{A} = T_2 \begin{bmatrix} A & 0 \\ 0 & 1 \end{bmatrix} T_1^{-1}, \qquad (17)$$

where $\hat{A}$ is the normalized affinity. Matrix $T_i$ transforms the points by translating them (last column) and applying a uniform scaling (diagonal). Due to the fact that the last column of $T_i$ has n…
Figure 4: The average (of 10 000 runs on each noise σ) re-projection error of homography fitting to synthesized data by the proposed (2SIFT), normalized 4PT [15] and 3ORI [3] methods. Each camera is located randomly on a center-aligned sphere. Ten points from the object are projected into the cameras, and zero-mean Gaussian noise is added to the coordinates. The affine parameters are calculated from the noisy coordi…
Figure 5: The average (10 000 runs on each noise σ) re-projection error of homography fitting to synthesized data by the 2SIFT, normalized 4PT [15] and 3ORI [3] methods. The same test scene is used as in …
Figure 6: The results on 15 sequences (9 064 image pairs) of the Malaga dataset using GC-RANSAC [7] as a robust estimator and different minimal solvers (2SIFT, 3ORI, 4PT). The confidence of RANSAC was set to 0.95 and the inlier-outlier threshold to 2.0 pixels. The re-projection error (left; in pixels), average processing time (middle; in seconds) and average iteration number (right) are reported.

Original abstract

This paper proposes a geometric interpretation of the angles and scales which the orientation- and scale-covariant feature detectors, e.g. SIFT, provide. Two new general constraints are derived on the scales and rotations which can be used in any geometric model estimation tasks. Using these formulas, two new constraints on homography estimation are introduced. Exploiting the derived equations, a solver for estimating the homography from the minimal number of two correspondences is proposed. Also, it is shown how the normalization of the point correspondences affects the rotation and scale parameters, thus achieving numerically stable results. Due to requiring merely two feature pairs, robust estimators, e.g. RANSAC, do significantly fewer iterations than by using the four-point algorithm. When using covariant features, e.g. SIFT, the information about the scale and orientation is given at no cost. The proposed homography estimation method is tested in a synthetic environment and on publicly available real-world datasets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper derives two new general constraints on the scales and rotations induced by a homography acting on orientation- and scale-covariant features (e.g., SIFT). These constraints, together with the standard point correspondences, enable a minimal solver for the eight degrees of freedom of a homography from only two feature pairs. The manuscript also shows how point normalization affects the rotation and scale parameters to improve numerical stability and demonstrates the approach on synthetic data and public real-world datasets.

Significance. If the geometric constraints are independent and correctly derived, the method reduces the minimal sample size for homography estimation from four to two points when scale and orientation are already available from the detector. This directly lowers the number of RANSAC iterations required for robust estimation and exploits information that is obtained at no extra cost, which is a practical advantage for vision pipelines that rely on covariant features.

major comments (1)
  1. [Derivation of constraints] The derivation section must explicitly verify that the two new scale/orientation constraints per correspondence are linearly independent of the two point equations and of each other; otherwise the eight-equation system for two correspondences may be rank-deficient. The abstract states that exactly eight independent equations are obtained, but no rank argument or degeneracy analysis is referenced in the provided description.
minor comments (2)
  1. [Normalization] The normalization procedure for maintaining numerical stability should include a brief statement of the condition number improvement or a small synthetic example quantifying the effect on the estimated homography.
  2. [Experiments] A short discussion of degenerate configurations (e.g., when the two features are collinear or have identical scales) would help readers assess practical applicability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review and positive assessment of the work. We address the single major comment below.

  1. Referee: [Derivation of constraints] The derivation section must explicitly verify that the two new scale/orientation constraints per correspondence are linearly independent of the two point equations and of each other; otherwise the eight-equation system for two correspondences may be rank-deficient. The abstract states that exactly eight independent equations are obtained, but no rank argument or degeneracy analysis is referenced in the provided description.

    Authors: We agree that an explicit verification strengthens the paper. The two point equations per correspondence arise solely from the positional mapping x' = Hx. The two new constraints are obtained by applying the homography to the local affine frame encoded by each covariant feature: one from the singular values (scale change) and one from the argument of the complex representation (orientation change). These act on distinct components of the 3x3 homography matrix and are therefore algebraically independent of the positional equations and of each other. Nevertheless, to satisfy the request we will add a short rank analysis (via the Jacobian of the eight-equation system evaluated at generic points) together with a brief degeneracy discussion in the revised derivation section. revision: yes
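
The counting behind this exchange can be checked numerically for the positional part alone. The sketch below stacks the standard DLT rows and reports their rank: two correspondences contribute rank 4, so four further independent equations are needed to pin down the eight degrees of freedom. It does not test the independence of the scale and orientation constraints themselves, which are not spelled out on this page; the homography `H_demo` and the function names are illustrative assumptions.

```python
import numpy as np


def point_constraint_rows(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Stack the two linear DLT equations contributed by each point pair."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    return np.asarray(rows)


def missing_equations(src: np.ndarray, dst: np.ndarray) -> int:
    """Independent equations still needed to fix the 8 degrees of freedom of H."""
    return 8 - int(np.linalg.matrix_rank(point_constraint_rows(src, dst)))


H_demo = np.array([[1.10, 0.05, 0.3],
                   [-0.02, 0.95, -0.2],
                   [0.10, 0.20, 1.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
ph = np.column_stack([src, np.ones(len(src))]) @ H_demo.T
dst = ph[:, :2] / ph[:, 2:3]

print("missing equations with 2 correspondences:", missing_equations(src[:2], dst[:2]))
print("missing equations with 4 correspondences:", missing_equations(src, dst))
```

A full version of the promised rank analysis would append the linearized scale and orientation constraints to the same matrix and confirm that the combined rank reaches 8 at generic points.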

Circularity Check

0 steps flagged

No significant circularity; derivation is geometrically self-contained

The paper presents a first-principles geometric derivation of two new constraints relating local scale and orientation changes under a homography, obtained directly from the transformation properties of covariant features. These constraints are combined with the standard two-point equations per correspondence to produce an eight-equation system for the eight degrees of freedom of H, enabling a two-correspondence solver. No fitted parameters are renamed as predictions, no self-citation chain supplies the central equations, and the normalization discussion addresses numerical stability rather than redefining the result. The derivation therefore stands independently of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based solely on the abstract, the central claim rests on the domain assumption that covariant features supply scale and orientation values that admit a direct geometric interpretation under homography; no free parameters or invented entities are mentioned.

axioms (1)
  • domain assumption The scale and orientation reported by covariant feature detectors admit a geometric interpretation that yields usable constraints under a homography transformation.
    Invoked in the first sentence of the abstract as the foundation for deriving the two new constraints.

pith-pipeline@v0.9.0 · 5689 in / 1186 out tokens · 30379 ms · 2026-05-25T14:42:13.435711+00:00 · methodology


Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages · 1 internal anchor

  [1] D. Barath. P-HAF: Homography estimation using partial local affine frames. In International Conference on Computer Vision Theory and Applications, 2017.
  [2] D. Baráth. Approximate epipolar geometry from six rotation invariant correspondences. 2018.
  [3] D. Barath. Five-point fundamental matrix estimation for uncalibrated cameras. Conference on Computer Vision and Pattern Recognition, 2018.
  [4] D. Baráth. Recovering affine features from orientation- and scale-invariant ones. 2018.
  [5] D. Barath and L. Hajder. A theory of point-wise homography estimation. Pattern Recognition Letters, 94:7–14, 2017.
  [6] D. Barath and L. Hajder. Efficient recovery of essential matrix from two affine correspondences. IEEE Transactions on Image Processing, 27(11):5328–5337, 2018.
  [7] D. Barath and J. Matas. Graph-Cut RANSAC. Conference on Computer Vision and Pattern Recognition, 2018.
  [8] D. Baráth, J. Molnár, and L. Hajder. Optimal surface normal from affine transformation. 2015.
  [9] D. Barath, T. Toth, and L. Hajder. A minimal solution for two-view focal-length estimation using two affine correspondences. In Conference on Computer Vision and Pattern Recognition, 2017.
  [10] H. Bay, T. Tuytelaars, and L. Van Gool. SURF: Speeded up robust features. European Conference on Computer Vision,
  [11] J. Bentolila and J. M. Francos. Conic epipolar constraints from affine correspondences. Computer Vision and Image Understanding, 2014.
  [12] O. Chum and J. Matas. Matching with PROSAC - progressive sample consensus. In Computer Vision and Pattern Recognition, 2005.
  [13] D. Cox, J. Little, and D. O'Shea. Using Algebraic Geometry. 2nd edition, 2005.
  [14] D. Grayson and M. Stillman. Macaulay2, a software system for research in algebraic geometry. Available at www.math.uiuc.edu/Macaulay2/.
  [15] R. Hartley and A. Zisserman. Multiple view geometry in computer vision. Cambridge University Press, 2003.
  [16] R. I. Hartley. In defense of the eight-point algorithm. Pattern Analysis and Machine Intelligence, 1997.
  [17] K. Köser. Geometric Estimation with Local Affine Frames and Free-form Surfaces. Shaker, 2009.
  [18] E. Kreyszig. Introduction to differential geometry and Riemannian geometry, volume 16. University of Toronto Press,
  [19] Z. Kukelova, M. Bujnak, and T. Pajdla. Automatic generator of minimal problem solvers. In European Conference on Computer Vision, volume 5304 of Lecture Notes in Computer Science, 2008.
  [20] Z. Kukelova, J. Heller, and A. Fitzgibbon. Efficient intersection of three quadrics and applications in computer vision. In Conference on Computer Vision and Pattern Recognition, pages 1799–1808, 2016.
  [21] Z. Kukelova, J. Kileel, B. Sturmfels, and T. Pajdla. A clever elimination strategy for efficient minimal solvers. In Conference on Computer Vision and Pattern Recognition, 2017. http://arxiv.org/abs/1703.05289.
  [22] D. G. Lowe. Object recognition from local scale-invariant features. In International Conference on Computer Vision,
  [23] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 2004.
  [24] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool. A comparison of affine region detectors. International Journal of Computer Vision, 65(1-2):43–72, 2005.
  [25] S. Mills. Four- and seven-point relative camera pose from oriented features. In International Conference on 3D Vision, pages 218–227. IEEE, 2018.
  [26] D. Mishkin, J. Matas, and M. Perdoch. MODS: Fast and robust method for two-view matching. Computer Vision and Image Understanding, 2015.
  [27] J. Molnár and D. Chetverikov. Quadratic transformation for planar mapping of implicit surfaces. Journal of Mathematical Imaging and Vision, 2014.
  [28] J.-M. Morel and G. Yu. ASIFT: A new framework for fully affine invariant image comparison. SIAM Journal on Imaging Sciences, 2(2):438–469, 2009.
  [29] M. Perdoch, J. Matas, and O. Chum. Epipolar geometry from two correspondences. In International Conference on Pattern Recognition, 2006.
  [30] J. Pritts, Z. Kukelova, V. Larsson, and O. Chum. Radially-distorted conjugate translations. Conference on Computer Vision and Pattern Recognition, 2018.
  [31] C. Raposo and J. P. Barreto. Theory and practice of structure-from-motion using affine correspondences. In Computer Vision and Pattern Recognition, 2016.
  [32] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski. ORB: An efficient alternative to SIFT or SURF. In International Conference on Computer Vision, pages 2564–2571. IEEE, 2011.