Yevstyunicheva Anastasiya
Faculty of Computer Information Technologies and Automatics
Specialty: Automated Control Systems
Automated Access Control System Based on Fingerprint Identification
main  |  biography  |  abstract  |  library  |  links  |  search results  |  individual task


Abstract

The theme: "Automated Access Control System Based on Fingerprint Identification".

Table of contents

Introduction. Importance. Novelty
1. Object characteristic
1.1 Object description
1.2 Image acquisition methods
1.3 Field of application
2. Method review
2.1 Problems of image processing
2.2 Recognition method algorithm
2.3 Recognition method analysis
2.3.1 Classes of the fingerprint comparison algorithm
2.3.2 Fingerprint classification
2.3.3 Several approaches for automatic fingerprint classification
2.3.4 Classifiers
2.3.5 A Multichannel Approach to Fingerprint Classification
2.3.6 Composite method of the fingerprint recognition
Conclusion
Literature


2.3.2 Fingerprint classification
Fingerprint classification is a technique for assigning a fingerprint to one of several prespecified types established in the literature, which can provide an indexing mechanism. Fingerprint classification can be viewed as coarse-level matching of fingerprints: an input fingerprint is first matched at a coarse level to one of the prespecified types and then, at a finer level, compared only to the subset of the database corresponding to that type. In this study, we classify fingerprints into five classes, namely, whorl (W), right loop (R), left loop (L), arch (A), and tented arch (T) (Fig. 2.5).

Fig. 2.5. Major fingerprint classes. Twin loop images are labeled as whorl in the NIST-4 database. (a) Twin loop (W), (b) Whorl (W), (c) Right loop (R), (d) Left loop (L), (e) Arch (A), (f) Tented arch (T).
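The indexing mechanism described above can be sketched as follows; classify and fine_match are hypothetical stand-ins for a real coarse classifier and fine-level matcher:

```python
# Sketch of class-based indexing: a query print is first assigned a coarse
# class, then matched only against database entries of that class.

def classify(print_features):
    # Placeholder coarse classifier: here we just read a precomputed label.
    return print_features["class"]

def fine_match(query, candidate):
    # Placeholder fine-level matcher: compare dummy minutiae counts.
    return query["minutiae"] == candidate["minutiae"]

def search(query, database):
    """Match the query only within the partition of its coarse class."""
    partition = database.get(classify(query), [])
    return [c["id"] for c in partition if fine_match(query, c)]

database = {
    "W": [{"id": 1, "class": "W", "minutiae": 40}],
    "R": [{"id": 2, "class": "R", "minutiae": 35},
          {"id": 3, "class": "R", "minutiae": 42}],
}
query = {"class": "R", "minutiae": 42}
print(search(query, database))  # only partition "R" is examined -> [3]
```

With five classes, a correct coarse decision cuts the fine-matching workload to roughly the size of one partition.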

2.3.3 Several approaches for automatic fingerprint classification
Several approaches have been developed for automatic fingerprint classification. They can be broadly grouped into five main categories:

1) model-based.

The model-based fingerprint classification technique uses the locations of singular points (the core and delta) to classify a fingerprint into the five classes mentioned above. A model-based approach tries to capture the knowledge of a human expert by hand-constructing the models and deriving rules for each category, and therefore does not require training. Accuracies of 85 percent and 87.5 percent have been reported on the NIST-4 database using these approaches.
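As an illustration only (the hand-constructed rules in the cited work are more elaborate, and the delta-position convention below is an assumption), a singular-point rule set might look like:

```python
# A minimal, hypothetical rule set in the spirit of model-based classifiers:
# the number of core/delta points and the core-delta geometry determine
# the class. Real systems use more elaborate, hand-constructed rules.

def classify_by_singularities(num_cores, num_deltas, delta_side=None):
    """delta_side: 'left'/'right' of the core, used to disambiguate loops."""
    if num_cores == 2:                      # two cores -> whorl
        return "W"
    if num_cores == 1 and num_deltas == 1:  # one core, one delta -> loop
        return "R" if delta_side == "left" else "L"
    if num_cores == 1 and num_deltas == 0:  # core but no delta (assumption)
        return "T"
    return "A"                              # no singular points -> plain arch

print(classify_by_singularities(2, 2))           # W
print(classify_by_singularities(1, 1, "left"))   # R
print(classify_by_singularities(0, 0))           # A
```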

2) structure-based.

A structure-based approach uses the estimated orientation field in a fingerprint image to classify the fingerprint into one of the five classes. An accuracy of 90.2 percent with 10 percent rejection is reported on NIST-4. The neural network used in this approach was trained on images from 2,000 fingers (one image per finger) and then tested on an independent set of 2,000 images taken from the same fingers; the reported error is, thus, optimistically biased. A later version of this algorithm was tested on the NIST-14 database, which is a naturally distributed database, resulting in better performance. However, this improvement should be expected, since the NIST-14 database contains only a small percentage of arch-type fingerprints, which pose the most difficulty for fingerprint classifiers, and the neural network used in the algorithm implicitly takes advantage of this fact. A similar structure-based approach, which uses hidden Markov models for classification, depends on a reliable estimation of ridge locations, which is difficult in noisy images. In another structure-based approach, B-spline curves are used to represent and classify fingerprints.

3) frequency-based.

Frequency-based approaches use the frequency spectrum of the fingerprints for classification.

4) syntactic.

A syntactic approach uses a formal grammar to represent and classify fingerprints.

5) hybrid approaches.

Hybrid approaches combine two or more approaches for classification. These approaches show some promise but have not been tested on large databases. For example, Chong et al. report results on 89 fingerprints, Fitz and Green on 40 fingerprints, and Kawagoe and Tojo on 94 fingerprints.

2.3.4 Classifiers
1. K-Nearest Neighbor Classifier

The K-nearest neighbor classifier results in an accuracy of 85.4 percent for the five-class classification task when 10 nearest neighbors (K = 10) are considered. Classification accuracy does not always increase with increasing K; there exists an optimal value of K for finite training sample size classification problems. For the four-class classification task (where classes A and T were collapsed into one class), an accuracy of 91.5 percent is achieved. The confusion matrix for the K-nearest neighbor classification is shown in table 2.1.

              Assigned class
True class     W     R     L     A     T
    W        320    38    31     6     0
    R          1   368     2    10    21
    L          0     1   359    13     8
    A          1     3     7   422    20
    T          0    15    16    95   208

Table 2.1. Confusion matrix for the K-nearest neighbor classification; K = 10

The diagonal entries in this matrix show the number of test patterns from each class which are correctly classified. Since a number of fingerprints in the NIST-4 database are labeled as belonging to two different classes, the row sums of the confusion matrices in Tables 2.1, 2.2, and 2.3 are not identical.
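The reported accuracies can be checked directly from Table 2.1. Recomputing gives about 85.4 percent for five classes and about 91.2 percent for four; the small gap to the reported 91.5 percent four-class figure is consistent with the dual-labeled prints noted above:

```python
# Recomputing the accuracies from the confusion matrix in Table 2.1.
# Rows are true classes W, R, L, A, T; columns are assigned classes.
import numpy as np

cm = np.array([[320,  38,  31,   6,   0],
               [  1, 368,   2,  10,  21],
               [  0,   1, 359,  13,   8],
               [  1,   3,   7, 422,  20],
               [  0,  15,  16,  95, 208]])

five_class_acc = cm.trace() / cm.sum()

# Four-class task: collapse A and T (last two rows/columns) into one class.
cm4 = cm.copy()
cm4[3] += cm4[4]                  # merge T row into A row
cm4 = np.delete(cm4, 4, axis=0)
cm4[:, 3] += cm4[:, 4]            # merge T column into A column
cm4 = np.delete(cm4, 4, axis=1)
four_class_acc = cm4.trace() / cm4.sum()

print(round(five_class_acc, 3), round(four_class_acc, 3))  # 0.853 0.912
```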

2. Neural Network Classifier

We trained a multilayer feed-forward neural network using a quick propagation training algorithm. The neural network has one hidden layer with 20 neurons, 192 input neurons, and five output neurons corresponding to the five classes. We obtain an accuracy of 86.4 percent for the five-class classification task. For the four-class classification task, an accuracy of 92.1 percent is achieved. The confusion matrix for the neural network classification is shown in table 2.2.

              Assigned class
True class     W     R     L     A     T
    W        352    29    10     2     2
    R          6   374     1     9    17
    L         10     2   353    10     7
    A          0     6     8   384    48
    T          1    16    19    64   235

Table 2.2. Confusion matrix for the Neural Network Classifier

3. Two-Stage Classifier

The objective here is to perform a "simple" classification task using a K-NN classifier and then use a bank of two-class neural network classifiers to handle more subtle discriminations. The first stage uses the K-nearest neighbor (K = 10) classifier to yield the two most probable classes. We observed that 85.4 percent of the time, the class with the maximum frequency among the K nearest neighbors is the correct class and 12.6 percent of the time, the class with the second highest frequency is the correct class. In other words, the K-nearest neighbor classifier yields the top two classes with an accuracy of 98 percent. This result itself can be used to accurately classify fingerprints into two out of the five classes. Each fingerprint will have an entry in two of the five partitions of the database and the matching is required to be performed only in the corresponding two partitions of the database. The second stage uses 10 different neural networks for 10 different pairwise classifications. These neural networks have 192 input neurons, 20-40 hidden neurons in one hidden layer, and 2 output neurons. Each neural network is trained using the patterns from only the two corresponding classes in the training set. For example, the neural network which distinguishes between R and W is trained using only the patterns labeled R and W in the training set.
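The two-stage scheme can be sketched as follows; the pairwise experts stand in for the trained two-class neural networks:

```python
# Sketch of the two-stage scheme: K-NN proposes the top two classes, then a
# classifier dedicated to that pair of classes makes the final decision.
from collections import Counter

def top_two_classes(neighbor_labels):
    """Return the (up to) two most frequent classes among the K neighbors."""
    return [c for c, _ in Counter(neighbor_labels).most_common(2)]

def two_stage_classify(neighbor_labels, pairwise):
    c1, c2 = (top_two_classes(neighbor_labels) + [None, None])[:2]
    if c2 is None:            # unanimous neighbors: no second stage needed
        return c1
    key = tuple(sorted((c1, c2)))
    return pairwise[key]()    # dedicated two-class classifier for this pair

# Hypothetical pairwise experts (the real ones are trained neural networks).
experts = {("R", "W"): lambda: "R", ("A", "T"): lambda: "T"}
neighbors = ["R", "R", "W", "R", "W", "R", "W", "R", "R", "W"]
print(two_stage_classify(neighbors, experts))  # R-vs-W expert decides -> R
```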

This two-stage classifier yields an accuracy of 90 percent for the five-class classification task and an accuracy of 94.8 percent is achieved for the four-class classification task. The confusion matrix for the two-stage classification is shown in table 2.3.

              Assigned class
True class     W     R     L     A     T
    W        366    16     8     4     1
    R          3   372     1     8    17
    L          6     0   364     6     7
    A          2     1     3   405    39
    T          0     6    14    55   261

Table 2.3. Confusion matrix for the Two-Stage Classifier

Although our classifier is robust to noise and is able to correctly classify most of the poor quality fingerprints in the NIST-4 database, it fails on some very bad quality fingerprint images where no ridge information is present in the central part of the fingerprint. In poor quality fingerprints, it is very difficult to detect the center point correctly. Our classifier also fails to correctly classify twin loop images, which are labeled as whorl in the NIST-4 database. For these images, our center location algorithm picks the upper core and, with that point taken as the center, the image looks like a loop in the region of interest, which leads to misclassification of W as L or R. About 3 percent of the errors result from loop-arch misclassification because of the subtle difference between the loop and arch types. The A-T misclassification accounts for about 5 percent of the errors.

4. Reject Option

Classification accuracies can be further increased by incorporating a reject option. We use the (K, K')-nearest neighbor classifier for rejection and the proposed two-stage classifier for classification. If the number of training samples from the majority class among the K nearest neighbors of a test pattern is less than K' (K' < K), we reject the test pattern and do not attempt to classify it. Most of the images rejected by this scheme are of poor quality. The other rejected images are those which "appear" to belong to several different classes; for one such fingerprint image, three of its nearest neighbors belong to class R, three to class A, and four to class T. By rejecting 19.5 percent of the images for the five-class problem, the classification accuracy can be increased to 93.5 percent, and for the four-class problem, to 96.6 percent (Table 2.4).
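The (K, K') rejection rule itself is simple to state in code; knn_with_reject is our name for it:

```python
# The (K, K')-rule: if fewer than K' of the K nearest neighbors agree on the
# majority class, the print is rejected rather than classified.
from collections import Counter

def knn_with_reject(neighbor_labels, k_prime):
    label, count = Counter(neighbor_labels).most_common(1)[0]
    return label if count >= k_prime else None   # None means "reject"

ambiguous = ["R"] * 3 + ["A"] * 3 + ["T"] * 4    # the example from the text
print(knn_with_reject(ambiguous, 6))             # rejected -> None
print(knn_with_reject(["W"] * 8 + ["R"] * 2, 6)) # accepted -> W
```

Raising K' trades a higher rejection rate for a lower error on the accepted prints, which is exactly the trade-off tabulated in Table 2.4.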

Classifier        (10,0)-NN   (10,5)-NN   (10,6)-NN   (10,7)-NN
Rejection rate      1.8 %       8.5 %      19.5 %      32.5 %
5-class error        10 %       8.8 %       6.5 %         1 %
4-class error       5.2 %       4.5 %       3.4 %       2.2 %

Table 2.4. Error-reject trade-off

5. Hidden Markov Model Classifier

Hidden Markov models are a form of stochastic finite state automaton well-suited to pattern recognition and successfully applied to speech recognition [15], [16] and other problems. They are appropriate to the problem posed here because of their ability to classify patterns based on a large quantity of features whose number is variable and which have certain types of underlying structure, especially if that structure results in stationarity of the feature distributions over some spatial or temporal period. In a fingerprint, the basic class information can be inferred from syntactic analysis of singular points, but it can also be seen in the general pattern of the ridges, the way a nonexpert human would classify prints. The HMM is able to statistically model the different structures of the ridge patterns by accumulating evidence across the whole print, without relying on singular point extraction.

The ridge image is thinned so that each ridge is represented by an eight-connected, one-pixel-wide line termed the skeleton. Given the skeleton image of the ridges, parallel fiducial lines are laid across the image at an angle θ, and each one is followed in turn. For each intersection of a fiducial line with a ridge, a feature is generated. Each feature consists of a number of measurements chosen to characterize the ridge behavior and its development at the intersection point.

The measurements of each feature are termed a frame, and the frames Rik for the ith fiducial line are collectively termed a row, Ri, whose ordering is preserved. For each orientation θ of the fiducial lines, a separate representation Rθ = {Ri, for all i} of the print is obtained. In this research, only horizontal and vertical lines have been used, giving features Rh and Rv, respectively, but other angles may allow further information to be captured.

Typically, HMMs are one-dimensional structures suitable for analyzing temporal data. Here, the data are two-dimensional, but the process of feature extraction can be described as a one-dimensional array of one-dimensional row processes.

For classification, a model Mc is constructed for each class c, and the maximum-likelihood class is chosen after calculating the probability of the data R given each model: argmax_c P(R | Mc).

The frame emission probabilities are modeled with mixtures of diagonal-covariance, multivariate Gaussian distributions. Thus, for any frame Rik, it is possible to calculate the likelihood P(Rik | Sijk) of it occurring in any state Sijk. With these likelihoods, for any row model, the likelihood of a row can be approximated by the maximum, over all state sequences aligning frames to states, of the product of frame likelihoods and transition probabilities:

P(Ri | M) ~ max over state sequences (j1, ..., jn) of the product over k of P(Rik | Sijk) * P(Sijk | Sij(k-1))
The models are initialized by using an equal-length alignment with the frames evenly distributed across the states of the model. After estimating the initial parameter values, using smooth equal-length alignment, Viterbi alignment is used to find the maximum-likelihood alignment of frames with states, which is used for retraining. Around two iterations of training are necessary to achieve good classification performance.
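The Viterbi alignment used for scoring and retraining can be sketched on a toy left-to-right row model; the emission and transition probabilities below are illustrative stand-ins for the trained Gaussian-mixture parameters:

```python
# Minimal Viterbi over a left-to-right row model: finds the most likely
# alignment of frames to states and returns its log-likelihood.
import math

def viterbi_score(frames, emit, trans, n_states):
    """Log-likelihood of the best state sequence for one row of frames."""
    # delta[s] = best log-prob of emitting frames[:t+1] and ending in state s
    delta = [-math.inf] * n_states
    delta[0] = math.log(emit(frames[0], 0))
    for frame in frames[1:]:
        new = [-math.inf] * n_states
        for s in range(n_states):
            for p in (s - 1, s):            # left-to-right: stay or advance
                if p >= 0 and delta[p] > -math.inf:
                    cand = (delta[p] + math.log(trans[p][s])
                            + math.log(emit(frame, s)))
                    new[s] = max(new[s], cand)
        delta = new
    return delta[n_states - 1]              # must end in the final state

# Toy 2-state model: state 0 prefers low values, state 1 prefers high ones.
emit = lambda x, s: 0.9 if (x > 0.5) == (s == 1) else 0.1
trans = [[0.6, 0.4], [0.0, 1.0]]
print(viterbi_score([0.1, 0.2, 0.8, 0.9], emit, trans, 2))
```

The best alignment here assigns the two low frames to state 0 and the two high frames to state 1; retraining would re-estimate the parameters from such alignments.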

For each orientation of fiducial lines, a separate classifier can be made. Since the errors of the different classifiers will be different, a combination of their scores may yield better accuracy. Denoting by Mhc and Mvc the class c models trained with horizontal and vertical features, respectively, and assuming independence, the likelihood of the data is written as:

P(R | c) = P(Rh | Mhc) * P(Rv | Mvc)

6. Decision Tree Classifier

Such decision trees are built using techniques based upon those of Amit et al. These authors tackled a number of problems, including digit recognition: classifying images of the digits '0' to '9'.

The technique used by Amit et al. for constructing decision trees involves the generation of a large number of simple features. Each feature in isolation provides little information about the classification decision, for example, the existence of an edge at a particular location in an image may give little clue as to the digit's identity. However, combinations of such features can represent much important information needed to make an accurate classification decision. Amit et al. describe a procedure for making decision trees by growing questions based upon such combinations of simple features.

The procedure has been adopted here for fingerprint classification and involves an initial feature extraction phase, followed by question building which creates informative questions assisting in classification. These complex questions are combined in a hierarchical manner to form decision trees which are used for classification. Because the trees are constructed stochastically, trees constructed for the same problem have different performances and, as is common with decision tree classifiers, multiple trees are combined to give the final classification.

2.3.5 A Multichannel Approach to Fingerprint Classification
We propose a fingerprint classification algorithm (fig. 2.6) based on a novel representation scheme which is directly derived from local ridge structures. The representation does not explicitly use the core, delta, or orientation field. It is better able to tolerate poor image quality, which is a major difficulty in fingerprint classification.
Fig. 2.6. Flow diagram of our fingerprint classification algorithm.

The category of a fingerprint is determined by its global ridge and furrow structures. A valid feature set for fingerprint classification should be able to capture this global information effectively. We have developed a novel representation scheme (FingerCode) which is able to represent both the minute details and the global ridge and furrow structures of a fingerprint. For the purpose of classification, we use a low-level representation which is very effective in representing the global ridge and furrow structures and which is invariant to individual minute details.

The main steps of our feature extraction algorithm are as follows:

1) Find a registration point (center point) and define a spatial tessellation of the image space around the registration point (represented by a collection of sectors).

2) Decompose the input image into a set of component images, which preserve global ridge and furrow structures.

3) Compute the standard deviation of gray level values in each sector to form the feature vector or the FingerCode.

Let I(x, y) denote the gray level at pixel (x, y) in an M x N fingerprint image and let (xc, yc) denote the center point. The spatial tessellation of the image space which constitutes the region of interest is defined by a collection of sectors Si, where the ith sector Si is computed in terms of parameters (r, θ) as follows:

Si = { (x, y) | b(Ti + 1) <= r < b(Ti + 2), θi <= θ < θ(i+1), 1 <= x <= M, 1 <= y <= N },

where Ti = i div k, θi = (i mod k) * (2π / k), r = sqrt((x - xc)^2 + (y - yc)^2), and θ = arctan((y - yc) / (x - xc)).
Here, b is the width of each band and k is the number of sectors in each band. We use six concentric bands around the center point. Each band is 20 pixels wide (b = 20) and is segmented into eight sectors (k = 8) (Fig. 2.7). The innermost band is not used for feature extraction because the sectors in the region near the center contain very few pixels. Thus, a total of 8 x 6 = 48 sectors (S0 through S47) are defined.
Fig. 2.7. Core (x), center (o), the region of interest and 48 sectors.
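The tessellation above can be sketched as a function mapping a pixel to its sector index; the geometry (b = 20, k = 8, six usable bands, skipped innermost band) follows the text, while the function name is ours:

```python
# Computing the sector index S_i for a pixel, following the tessellation
# described above: 6 usable bands of width b = 20 px, k = 8 sectors per band.
import math

B, K, BAND_WIDTH = 6, 8, 20

def sector_index(x, y, xc, yc):
    """Return i in [0, 47] for pixels in the region of interest, else None."""
    r = math.hypot(x - xc, y - yc)
    band = int(r // BAND_WIDTH) - 1      # innermost band (r < 20) is skipped
    if band < 0 or band >= B:
        return None                      # outside the region of interest
    theta = math.atan2(y - yc, x - xc) % (2 * math.pi)
    sector = int(theta // (2 * math.pi / K))
    return band * K + sector

print(sector_index(30, 0, 0, 0))   # r = 30 -> band 0; theta = 0 -> sector 0
print(sector_index(0, 150, 0, 0))  # r = 150 -> outside the tessellation
```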

The main steps of our classification algorithm are as follows (Fig.2.6):

1) Locate a registration point in the input image and define a spatial tessellation of the region around the registration point (sectors).

Any point that can be consistently detected in a fingerprint image can be used as a registration point (we call it the center point because we prefer it to be positioned at the center of the image). In a fingerprint image, the core point provides such a consistent point. Therefore, in our algorithm, we define the core point as the center point (xc, yc). We used the core point detection algorithm presented below.

a) Estimate the orientation field O using the least square orientation estimation algorithm. Orientation field O is defined as an N x N image, where O(i, j) represents the local ridge orientation at pixel (i, j). An image is divided into a set of w x w nonoverlapping windows and a single local orientation is defined for each window.

b) Smooth the orientation field in a local neighborhood. Let the smoothed orientation field be represented as O'.

c) Initialize A, a label image used to indicate the core point.

d) For each pixel (i, j) in O', compute the Poincare index and assign the corresponding pixels in A a value of one if the Poincare index is 1/2. The Poincare index at pixel (i, j), enclosed by a digital curve which consists of a sequence of pixels that are on or within a distance of one pixel from the corresponding curve, is computed as follows:

Poincare(i, j) = (1 / 2π) * sum over k = 0 .. NΨ - 1 of Δ(k),

where

Δ(k) = δ(k),       if |δ(k)| < π/2,
Δ(k) = π + δ(k),   if δ(k) <= -π/2,
Δ(k) = π - δ(k),   otherwise,

δ(k) = O'(Ψx(k'), Ψy(k')) - O'(Ψx(k), Ψy(k)), with k' = (k + 1) mod NΨ,

and Ψx(·) and Ψy(·) are the x and y coordinates of the closed digital curve with NΨ pixels.

e) Find the connected components in A. If the area of a connected component is larger than seven, a core is detected at the centroid of the connected component. If the area of a connected component is larger than 20, two cores are detected at the centroid of the connected component.

f) If more than two cores are detected, go back to step (b).

g) If two cores are detected, the center is assigned the coordinates of the core point with the lower y value (the upper core). If only one core is detected, the center is assigned the coordinates of the core point.

h) If no core point is detected, compute the covariance matrix of the vector field in a local neighborhood (q x q) of each point in the orientation field. Define a feature image F whose value at each element of the orientation image is the largest eigenvalue of the covariance matrix. A core is detected at the centroid of the largest connected component in the thresholded image of F, and the center is assigned the coordinates of this core.

The center found above is shifted 40 pixels down for further processing based on the fact that most of the category information in a fingerprint lies in the lower part of the fingerprint. This value was empirically determined.
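Step (d) above can be sketched as follows, assuming the smoothed orientations O' have already been sampled along the closed digital curve (the sampling itself is omitted):

```python
# Poincare index of a closed curve in a smoothed orientation field, as a
# normalized sum of wrapped orientation differences around the curve.
# Ridge orientations are defined modulo pi, hence the wrapping into
# (-pi/2, pi/2].
import math

def poincare_index(orientations):
    """orientations: O' sampled along a closed digital curve, in radians."""
    total = 0.0
    n = len(orientations)
    for k in range(n):
        d = orientations[(k + 1) % n] - orientations[k]
        if d > math.pi / 2:
            d -= math.pi
        elif d <= -math.pi / 2:
            d += math.pi
        total += d
    return total / (2 * math.pi)

# Around a core, the orientation rotates by pi along the curve -> index 1/2.
core_like = [k * math.pi / 8 for k in range(8)]
print(round(poincare_index(core_like), 2))   # 0.5
```

A uniform orientation field yields index 0, so only curves enclosing a singular point are marked in the label image A.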

2) Decompose the input image into a set of component images, each of which preserves certain ridge structures; compute the standard deviation of the component images in each sector to generate the feature vector (called FingerCode).

Gabor filters are band-pass filters which have both orientation-selective and frequency-selective properties and have optimal joint resolution in both the spatial and frequency domains. By applying properly tuned Gabor filters to a fingerprint image, the true ridge and furrow structures can be greatly accentuated. These accentuated ridge and furrow structures constitute an efficient representation of a fingerprint image.

An even-symmetric Gabor filter has the following general form in the spatial domain:

G(x, y; f, θ) = exp{ -(1/2) * [ x'^2 / δx^2 + y'^2 / δy^2 ] } * cos(2π f x'),

x' = x sin θ + y cos θ,
y' = x cos θ - y sin θ,

where f is the frequency of the sinusoidal plane wave along the direction θ from the x-axis, and δx and δy specify the Gaussian envelope along the x and y axes, respectively, which determine the bandwidth of the Gabor filter.

In our algorithm, the filter frequency f is set to the average ridge frequency (1/K), where K is the interridge distance. The average interridge distance is approximately 10 pixels in a 500 dpi fingerprint image.

The values of δx and δy were empirically determined, and both were set to 4.0.

A fingerprint image is decomposed into four component images corresponding to four different values of θ (0°, 45°, 90°, and 135°) with respect to the x-axis (Fig. 2.8). The fingerprint image is convolved with each of the four Gabor filters to produce the four component images.
Fig. 2.8. Gabor filters (size = 33 x 33, f = 0.1, δx = 4.0, δy = 4.0). (a) 0° orientation. (b) 45° orientation. (c) 90° orientation. (d) 135° orientation.
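The filter bank itself can be generated directly from the expression above (size 33 x 33, f = 0.1, δx = δy = 4.0, as in Fig. 2.8); the rotated-coordinate convention here is one of several equivalent choices:

```python
# Building an even-symmetric Gabor kernel with the parameters from the text.
import numpy as np

def gabor_kernel(size=33, f=0.1, theta=0.0, dx=4.0, dy=4.0):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xt = x * np.sin(theta) + y * np.cos(theta)   # rotated coordinates
    yt = x * np.cos(theta) - y * np.sin(theta)
    envelope = np.exp(-0.5 * (xt**2 / dx**2 + yt**2 / dy**2))
    return envelope * np.cos(2 * np.pi * f * xt)

# One kernel per decomposition direction: 0, 45, 90, and 135 degrees.
kernels = [gabor_kernel(theta=np.deg2rad(a)) for a in (0, 45, 90, 135)]
print(kernels[0].shape, float(kernels[0][16, 16]))  # (33, 33) 1.0
```

Convolving the normalized image with each kernel yields the four component images of Fig. 2.9.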

Before decomposing the fingerprint image, we normalize the region of interest in each sector separately to a constant mean and variance. Normalization is done to remove the effects of sensor noise and finger pressure differences. Let I(x, y) denote the gray value at pixel (x, y), Mi and Vi, the estimated mean and variance of sector Si, respectively, and Ni(x, y), the normalized gray-level value at pixel (x, y). For all the pixels in sector Si, the normalized image is defined as:
Ni(x, y) = M0 + sqrt( V0 * (I(x, y) - Mi)^2 / Vi ),   if I(x, y) > Mi,
Ni(x, y) = M0 - sqrt( V0 * (I(x, y) - Mi)^2 / Vi ),   otherwise,
where M0 and V0 are the desired mean and variance values, respectively.

Normalization is a pixel-wise operation which does not change the clarity of the ridge and furrow structures. For our experiments, we set both M0 and V0 to a value of 100. Normalized, filtered, and reconstructed images for the fingerprint are shown in Fig. 2.9.
Fig. 2.9. Normalized, filtered, and reconstructed fingerprint images. (a) Normalized image. (b) Component image, 0°. (c) Component image, 45°. (d) Component image, 90°. (e) Component image, 135°. (f) Reconstructed image.
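The normalization step can be sketched per sector as follows (M0 = V0 = 100 as in the text); normalize_sector is a hypothetical helper operating on the pixels of one sector:

```python
# Sector-wise normalization to a desired mean M0 and variance V0, following
# the piecewise expression above.
import numpy as np

def normalize_sector(pixels, m0=100.0, v0=100.0):
    mi, vi = pixels.mean(), pixels.var()
    shift = np.sqrt(v0 * (pixels - mi) ** 2 / vi)
    return np.where(pixels > mi, m0 + shift, m0 - shift)

sector = np.array([50.0, 60.0, 70.0, 80.0])   # toy gray values of one sector
out = normalize_sector(sector)
print(round(out.mean(), 1), round(out.var(), 1))  # 100.0 100.0
```

Because the operation is pixel-wise and monotone on each side of the mean, the relative ordering of gray values (and hence ridge clarity) is preserved.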

3) Feed the feature vector into a classifier; in our algorithm, a two-stage classifier is used.

This two-stage classifier uses a K-nearest neighbor classifier in its first stage and a set of neural network classifiers in its second stage to classify a feature vector into one of the five fingerprint classes. In each component image, a local neighborhood with ridges and furrows that are parallel to the corresponding filter direction exhibits a higher variation, whereas a local neighborhood with ridges and furrows that are not parallel to the corresponding filter tends to be diminished by the filter which results in a lower variation. The spatial distribution of the variations in local neighborhoods of the component images thus constitutes a characterization of the global ridge structures and is well captured by the standard deviation of grayscale values. In our algorithm, the standard deviation within the sectors defines the feature vector.

Let Fθ(x, y) be the component image corresponding to direction θ. For each sector Si, i = 0, 1, ..., 47, and each θ in {0°, 45°, 90°, 135°}, a feature is the standard deviation Fiθ, defined as:

Fiθ = sqrt( sum over (x, y) in Si of (Fθ(x, y) - Miθ)^2 / Ki ),

where Ki is the number of pixels in Si and Miθ is the mean of the pixel values of Fθ(x, y) in sector Si. The 192-dimensional feature vectors, also called FingerCodes (similar to the IrisCode introduced by Daugman), for typical fingerprint images from different classes are shown as gray-level images with four disks, each disk corresponding to one filtered image (Fig. 2.10). The gray level in each sector of a disk represents the feature value for that sector in the corresponding filtered image. One can see that, visually, this representation appears to discriminate the five fingerprint classes reasonably well.
Fig. 2.10. Fingerprint representation using 192-dimensional feature vectors (in each representation, the top left disk represents the 0° component, the top right disk represents the 45° component, the bottom left disk represents the 90° component, and the bottom right disk represents the 135° component). The test pattern is a right loop. Each disk corresponds to one particular filter and there are 48 features (shown as gray values) in each disk (8 x 6 = 48 sectors) for a total of 192 (48 x 4) features. (a) Test, (b) whorl, (c) right loop, (d) left loop, (e) arch, (f) tented arch.
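Assembling the FingerCode from the component images and the tessellation can be sketched as follows; component_images and sector_map are toy stand-ins for the four filtered images and the sector tessellation:

```python
# Assembling the 192-dimensional FingerCode: the standard deviation of each
# of the 4 component images over each of the 48 sectors.
import numpy as np

def fingercode(component_images, sector_map, n_sectors=48):
    """component_images: list of 4 same-shape 2-D arrays; sector_map: int
    array of the same shape, -1 outside the region of interest, else the
    sector index of that pixel."""
    features = []
    for comp in component_images:
        for i in range(n_sectors):
            # population std over the sector's pixels, as in the formula
            features.append(comp[sector_map == i].std())
    return np.array(features)

rng = np.random.default_rng(0)
comps = [rng.normal(size=(64, 64)) for _ in range(4)]      # toy images
sectors = rng.integers(-1, 48, size=(64, 64))              # toy tessellation
code = fingercode(comps, sectors)
print(code.shape)  # (192,)
```

The resulting 192-dimensional vector is exactly the input fed to the two-stage classifier described above.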

