Veverka B., Potuchkova M. Image matching

Image matching

Author: Veverka B., Potuchkova M.

Source: http://www.plan.aau.dk/GetAsset.action?contentId=3592304&assetId=3614911

The first solution to the problem of image matching, although analogue in its nature, was given by Hobrough already in late 1950s. A correlator, the first system dealing with automatic finding conjugate points was presented by Wild Heerbrugg company in 1964. The system did not find a wide practical application. Nevertheless, Hobrough’s idea of applying cross-correlation found a lot of successors. From early 1970s the development focused on matching digital images and digital correlation was successfully implemented into photogrammetric systems (Helava, 1978). Nowadays, image matching techniques are incorporated into commercial photogrammetric software packages as standard measuring and calculation tools (e.g. ImageStation™ of Z/I Imaging, Match-T and Match-AT of Inpho, Phodis of Zeiss, etc.). In comparison with a manual measurement, automatic methods are faster (especially in aerotriangulation of large image blocks) and the achieved accuracy is in general higher or at least comparable with the accuracy obtained from analytical instruments. On the other hand, due to relatively high amount of mismatches that usually appear a high number of observations (redundancy principle) and implementing techniques for detection and elimination of outliers have an essential importance in order to achieve a high accuracy (Ackerman, 1996a).

Image matching is conventionally performed in the image space. This approach is also an issue of the thesis. The concept of object space matching was also developed (Helava, 1988) but it has not found practical applications so far.

A lot of research has been done with respect to automatic finding tie points connecting two overlapping images (a stereo pair) or connecting images within a block (e.g. Tang and Heipke, 1996, Ackermann, 1996). The search for corresponding points can be done in 2D, e.g. within a rectangle orientated along an approximate epipolar line (see chapter 1.2.1) in case of relative orientation of a stereo pair. In case of known orientation parameters, the search can be restricted only to one dimension i.e. directly on an epipolar line as it is used in the derivation of digital elevation models. Matching between an aerial image and orthoimage of different dates or two orthoimages is also possible and is the issue of the practical tasks described in chapters 2 and 3.

The key issue connected with image matching is a choice of a matching entity (a primitive that is compared with a primitive in other images) and a similarity measure (a quantitative measure evaluating the match of entities). Matching ‘pixel by pixel’ over the whole overlapping area of images means an enormous amount of calculations. Moreover, it leads to ambiguity due to a repetitive occurrence of grey values and due to noise in images. Thus, in general image matching belongs to the group of ill-posed problems. It does not fulfil criteria of an existing, unique solution that is stable with respect to small variations in the input data. In order to change image matching into a well-posed problem, such matching entities, similarity measures, geometric constraints, and assumptions must be defined that the space of all the possible solutions will be restricted. Table 1.2 gives an overview of three basic methods of image matching that have been developed and are used in photogrammetry and computer vision

Tab. 1.2 Image matching methods

Matching method Similarity measure Matching entity

Area-based correlation, least-squares matching grey values

Feature-based cost function interest points, edges, regions

Relational cost function symbolic description of an image

In the following sections the matching methods are described in detail. Attention is especially paid to area and feature based techniques. Possibilities of using similarity measures as mutual information and image distance are also discussed.

1.2.1 Area based methods

Grey values are the matching entities in area based matching. Matching one pixel brings an ambiguity problem. Grey values of several neighbouring pixels are therefore used. An image patch cut from one image, so called template, is searched in the second image. The template consists of m x n pixels, mostly m=n. The position of the template is referred to the central pixel that is why m and n are odd numbers. The template is compared with patches of the same size in the second image. An approximate position of a corresponding point in the second image can usually be derived (e.g. when approximations of orientation parameters of two overlapping images are known). The comparison is then restricted to an area called a search area or window (Schenk, 1999). A value of a similarity measure is calculated at each position of the template within the search area. Depending on the character of the similarity measure, a corresponding point to the centre of the template is assumed to be in the position where maximal or minimal value of the similarity measure is obtained. In photogrammetry, cross-correlation and least squares matching are the mostly used techniques for area based matching. Mutual information and image distance were applied e.g. for registering MR (magnetic resonance) or CT (computed tomography) images (Maes et al., 1997), in genome engineering (Yu and Jiang, 2001) but also in photogrammetry (Paszotta, 1999). Regardless which of similarity measures is chosen, there are several issues that have to be considered.

Size and location of the template

Larger the template, more the requirement of uniqueness of the matching entity is fulfilled. On the other hand, geometric distortions caused by relief and different orientation of images will influence matching of large templates. The requirement of uniqueness cannot be fulfilled in areas with a repetitive pattern or low contrast and structure. Water or sand areas are typical examples where image matching techniques fail. The areas hidden by high objects should also be excluded. Area based matching as a low level process finds conjugate points in spite of one of them is hidden in one of the images. In similar way, corresponding points on moving vehicles or in shadows lead to incorrect determination of parallaxes. In steep slope areas the corresponding image patches are not geometrically alike. In order to get acceptable results, a size of the template has to be small or its shape adapted to geometric distortion (e.g. a trapezoidal window). One of the possibilities of excluding undesirable areas or finding areas where image matching must be carried out with extra care is using GIS databases. This approach can be easily applied e.g. in DTM generation.

Size and location of the search window

In order to avoid mismatches, the position of the search window must be determined quite accurate in area based matching. Approximations of calculated parameters (e.g. orientation parameters, DTM) and hierarchical approach are usually used for this purpose. Hierarchical approach or coarse-to-fine strategy means that the matching process starts at higher levels of an image pyramid (reduced pixel size) where small details are suppressed. Parameters calculated from the measurements in a higher level of the image pyramid are then used as staring point for matching in a lower level. In the level with the finest geometric resolution the approximations of calculated parameters are good enough for positioning a search window with such accuracy that image matching methods resulting in subpixel accuracy can be applied.

When working with a stereo pair, additional geometric constraints can be applied such as epipolar lines. Fig. 1.8 shows a concept of an epipolar line constraint. Epipolar lines are intersections of an epipolar plane and image planes. The epipolar plane is defined by projection centres O1, O2 and an object point P. Therefore conjugate points P’ and P’’ must lie on corresponding epipolar lines e’ and e’’. In order to make matching along epipolar lines easier, images can be transformed to so called normalised images i.e. all epipolar lines in the image are parallel.

Fig. 1.8 Principle of epipolar geometry. An epipolar plane is defined by projection centres O1 and O2 and a object point P.
Epipolar lines e’ and e’’ are intersections of the epipolar plane with the image planes. (adapted from Schenk, 1999)

The size of the search window depends on how precise it is located and on geometric deformations due to relief and image orientations.

Acceptance criteria for the similarity measure

Excluding mismatches is one of the tasks connected to image matching. One possibility how to avoid some of outliers in matching is by setting thresholds for similarity measures. The thresholds can seldom be set for all cases. Although some default values are presented in literature, it happens that the match is successful in spite of exceeding the threshold and vice versa. After the position of ‘the best fit’ is found, assessment of accuracy and reliability of a found match must be carried out. In addition to thresholding similarity measures, geometrical constrains or robust adjustment techniques are used in further calculations in order to exclude false matches.

1.2.1.1 Correlation

The normalised cross-correlation coefficient r is one of common similarity measures used in photogrammetry. It is calculated by the formula 1.4:

(1.4)

r -normalised cross-correlation coefficient

- standard deviations of grey values in the template and search image patches

- covariance of grey values in the image patches

gT,gS - серые значения для шаблона и искомой области изображения

- mean grey values

R,C - number of rows and columns of image patches

If the template and search image patches are represented by vectors vT, vS of 1xRC grey values reduced of their means gT and gS, the correlation coefficient can be interpreted as r = cos , where - is an angle between the vectors, as shown in Fig. 1.9 (Kasser and Egels, 2002).

Fig. 1.9 Geometric interpretation of a correlation coefficient r = cos = vTvS /( |vT|.|vS| )/p>
The normalised correlation coefficient has values within the range –1 <= r <= 1. The value 1 is reached only if mage patches gT and gS are linked by a linear relation gT = rsgS+rt, rs>0, where rs corresponds to a scale factor and rt to a shift between grey values in gT and gS. Values close to 0 indicate no similarity and the value of –1 is obtained when the positive and the negative of an image are matched. Thus, in the image matching process positive values close to 1 are demanded.

A template moves pixel by pixel over the search window and a correlation coefficient is calculated in each position. The position where the correlation coefficient reaches its highest value is selected as a position of the best match/fit (see Fig. 1.10).

Fig. 1.10 Principle of image matching based on finding maximum of correlation coefficient r. The graph in the middle shows values of the correlation coefficient calculated in 13 x 13 positions of the template within the search area. The correlation coefficient reaches its maximum 0.79 at the position row=30 and col=32. The search area comes from a photo taken by a reseau camera Rollei 6006 metric that was scanned with pixel size of 21 mkm, the template was created manually.

The correlation coefficient itself does not inform about the accuracy of the found position of the best fit. Some investigations showing a connection between the variance of the determined shift from the centre of the search window, signal to noise ratio, and a size of the template can be found in (Rosenholm, 1986). This theoretical result has not been implemented into practical calculations so far. Nevertheless, it clearly shows that the reliability of a determined position of the best fit depends on radiometric properties of the patches that vary due to different illumination and viewing angle, temporal changes, or projection of matched images. A type of land cover and terrain also plays an important role.

A standard deviation of grey values (formula 1.1) and entropy are measures of contrast and quantity of information in an image patch and they can be used for an evaluation of suitability of the chosen template for matching. Autocorrelation can be used for the same purpose as well (see chapter 1.2.1.5, test A). In automated procedures interest or edge operators (see chapter 1.2.2) are applied. In the following matching only those results are accepted where a maximal correlation coefficient exceeds the given threshold. In processes with well determined objects of measurement like fiducial marks or artificial targets, the thresholding is a successful method for eliminating or at least considerable reducing the number of outliers. E.g. a threshold value of 0.7 proved to be suitable for automatic measurement of fiducial marks. In case of grid crosses or signalized control points when the background is not homogenous, the threshold must be somewhat reduced, e.g. to 0.5 (Kraus, 1997). A similar situation is with natural control points and tie points although in practical applications a value threshold of 0.7 is often a standard. In general, setting a threshold for a correlation coefficient does not mean that all the mismatches are eliminated. When working with natural targets, some good matches have low and some false matches have a high correlation coefficient. By setting a threshold, a number of successful matches are excluded from further calculations while some of mismatches remain. Therefore algorithms for calculating orientation parameters or for DTM generation from matched points must contain routines for eliminating outliers (see chapter 1.3).

If the position of the best fit should be determined with subpixel accuracy, the values of correlation coefficient around its maximum are approximated by a continuous function, e.g. polynomial which parameters are solved in least squares adjustment (Kraus, 1997). The position of the maximum of the polynomial corresponds to the position of the best fit in the subpixel range. Based on standard deviations of polynomial’s coefficients derived in least squares adjustment, a standard deviation of the improved position of the best fit can be calculated. In case of searching along an epipolar line a solution is restricted to a correlation curve. The method of approximation of a correlation surface by a 2nd order polynomial including formulas for calculating standard deviations of the derived position is described in detail in Appendix B.1.