
Alignment of the plane coordinate system with the square is a similarity transformation

I'm reading the book "Multiple View Geometry in Computer Vision" by Hartley and Zisserman, and there is something I was unable to understand. Here is the example, where K denotes the calibration matrix and ω a conic in the image plane:

Example 8.18. A simple calibration device. The image of three squares (on planes which are not parallel, but which need not be orthogonal) provides sufficiently many constraints to compute K. Consider one of the squares. The correspondences between its four corner points and their images define the homography H between the plane π of the square and the image. Applying this homography to circular points on π determines their images as $H(1, \pm i, 0)^T$. Thus we have two points on the (as yet unknown) ω. A similar procedure applied to the other squares generates a total of six points on ω, from which it may be computed (since five points are required to determine a conic). In outline the algorithm has the following steps:

  1. For each square compute the homography H that maps its corner points, $(0, 0)^T$, $(1, 0)^T$, $(0, 1)^T$, $(1, 1)^T$, to their imaged points. (The alignment of the plane coordinate system with the square is a similarity transformation and does not affect the position of the circular points on the plane)
  2. Compute the imaged circular points for the plane of that square as $H(1, \pm i, 0)^T$. Writing $H = [h_1, h_2, h_3]$, the imaged circular points are $h_1 \pm i h_2$.
  3. Fit a conic ω to the six imaged circular points.
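
To make the three steps concrete, here is a rough NumPy sketch of how I read them. It is only an illustration under assumptions that are not in the quoted text: the function names are my own, the homography is estimated with a plain, unnormalized DLT, the conic is written as a symmetric matrix with six parameters, and K is recovered from the standard relation ω = (K Kᵀ)⁻¹ with a Cholesky factorization, which assumes noise-free data so that the fitted ω is positive definite.

```python
import numpy as np

# Corners of the square in its own plane coordinates, as in step 1.
UNIT_SQUARE = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def homography_from_points(src, dst):
    """Step 1 (plain DLT): H with dst ~ H @ src, from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)  # null vector, defined only up to scale

def imaged_circular_point(H):
    """Step 2: H (1, i, 0)^T = h1 + i*h2 (the other point is its conjugate)."""
    return H[:, 0] + 1j * H[:, 1]

def _conic_row(p, q):
    """Coefficients of p^T w q for w = [[a, b, d], [b, c, e], [d, e, f]]."""
    return np.array([
        p[0] * q[0],
        p[0] * q[1] + p[1] * q[0],
        p[1] * q[1],
        p[0] * q[2] + p[2] * q[0],
        p[1] * q[2] + p[2] * q[1],
        p[2] * q[2],
    ])

def fit_omega(homographies):
    """Step 3: conic through all h1 +/- i*h2, two real constraints per square."""
    rows = []
    for H in homographies:
        h1, h2 = H[:, 0], H[:, 1]
        rows.append(_conic_row(h1, h2))                       # imaginary part
        rows.append(_conic_row(h1, h1) - _conic_row(h2, h2))  # real part
    _, _, vt = np.linalg.svd(np.asarray(rows))
    a, b, c, d, e, f = vt[-1]
    omega = np.array([[a, b, d], [b, c, e], [d, e, f]])
    # The null vector has an arbitrary sign; pick the positive-definite one.
    return omega if np.trace(omega) > 0 else -omega

def calibration_from_omega(omega):
    """Recover K from omega = (K K^T)^-1 (assumes omega is positive definite)."""
    L = np.linalg.cholesky(omega)  # omega = L L^T, so L^T is K^-1 up to scale
    K = np.linalg.inv(L.T)
    return K / K[2, 2]
```

Given the imaged corners of three squares, `calibration_from_omega(fit_omega([H1, H2, H3]))` with each `Hk = homography_from_points(UNIT_SQUARE, corners_k)` should return K on clean synthetic data. Re-expressing the corners of one square in a rotated, scaled and shifted plane coordinate frame changes H but leaves `imaged_circular_point(H)` the same up to a complex scale factor.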

Why is the transformation in the parenthetical remark of step 1, "The alignment of the plane coordinate system with the square is a similarity transformation and does not affect the position of the circular points on the plane", (or H) supposed to be a similarity transformation? I thought that imaging the real square is a projective transformation, which is the more general case, not a similarity transformation. What am I missing?