EMPIRICAL STUDY OF CAR LICENSE PLATES RECOGNITION

The number of vehicles on the road has increased drastically in recent years. The license plate is an identity card for a vehicle: it maps to the owner and to further information about the vehicle. License plate information is useful to traffic management systems. For example, such systems can check for vehicles exceeding the legal speed limit, and they can be installed in parking areas to secure vehicle entrances and exits. Many researchers have proposed license plate recognition algorithms. License plate recognition requires license plate detection, segmentation, and character recognition: the algorithm detects the position of a license plate and extracts its characters. Various license plate recognition algorithms have been implemented, and each has its strengths and weaknesses. In this research, I implement three algorithms for detecting license plates, three algorithms for segmenting license plates, and two algorithms for recognizing license plate characters. I evaluate each of these algorithms on the same two datasets, one from Greece and one from Thailand. For detecting license plates, the best result is obtained by a Haar cascade algorithm. Using the best detection result as input, a Laplacian-based method has the highest segmentation accuracy. Last, the recognition experiment shows that a neural network has better accuracy than template matching. I summarize and analyze the overall performance of each method for comparison.


I. INTRODUCTION
The number of vehicles on the road is always increasing. According to the International Organization of Motor Vehicle Manufacturers (OICA), worldwide vehicle production reached 47,772,598 in 2009 and over 60,000,000 in 2012. With so many vehicles, it is necessary to distinguish each car by its license number. Vehicles in each country have unique license numbers. The license number differentiates one vehicle from another, even when many people buy the same model and brand. The license plate number is like an identity card for the vehicle. Furthermore, from the license number we can determine who the owner is and obtain further information about the vehicle. The license plate can also be used to help traffic management systems. For example, such systems can check for vehicles exceeding the legal speed limit, and they can be installed in parking areas to secure vehicle entrances and exits. Various license plate recognition algorithms have been implemented to detect and read the license plates of vehicles. Each algorithm has its strengths and weaknesses.
Many researchers have proposed algorithms for license plate recognition. They distinguish between license plate detection and license plate recognition. Anagnostopoulos et al. proposed a license plate recognition algorithm for intelligent transportation systems [1], based on a novel adaptive image segmentation technique. Their survey covered license plate recognition only. Lisheng et al. (2012) performed license plate recognition for passenger cars in Chinese residential areas [6]. They proposed an algorithm specific to license plates in China, since these plates have unique characters. They extracted the license plate, segmented the plate characters, and recognized each character.
Lekhana & Srikantaswamy (2012), Gohil (2010), and Jain et al. (2012) studied license plate detection. They attempted to identify the location of license plates with their respective algorithms. Each algorithm has its strengths and weaknesses. In this study, license plate detection and license plate recognition are developed in one project, so the system should be able both to detect license plates and to recognize their numbers. It is important to compare different algorithms on the same data. The goal is to identify the best algorithm for traffic images in Thailand. The same approach could be used to find the best method for another dataset.
II. LITERATURE REVIEW
License plate recognition is an important component of intelligent transportation systems. It also helps traffic management systems in cities. Traffic management systems are installed on freeways to check for vehicles exceeding the legal speed limit, and in parking areas to secure vehicle entrances and exits. Every vehicle has a license plate as its identity card. Several approaches to license plate recognition exist, but they all require detection, so detection and recognition should be handled in the same project. Zheng et al. (2005) proposed real-time license plate detection using Sobel vertical edge detection and image enhancement [9]. The success rate is 97%. Anagnostopoulos et al. developed a license plate recognition approach that introduces a new image segmentation procedure using sliding concentric windows [1]. Statistical measures such as the standard deviation and the mean are used to identify where the license plate is. Anagnostopoulos et al. reported 96.5% plate recognition accuracy [1]. There are also several approaches to license plate detection. Lekhana & Srikantaswamy (2012) proposed a feature-based license plate localization algorithm [5] that can handle multiple objects under different image capturing conditions. It extracts the license plate using edge statistics and morphological operations. The result is 96.5%. Gohil (2010) developed license plate detection using a histogram-based approach, which has the advantage of being simple and fast [3]. Jain et al. (2012) performed license plate detection through a fusion of spectral analysis and connected component analysis [4].
Today, various detection and recognition techniques have been developed, and they are used in traffic management and security applications such as parking, access and border control, and tracking of stolen cars. For example, at an entrance gate, license plates are used to identify vehicles. When a vehicle enters through the gate, its number plate is automatically recognized and stored in a database. When the vehicle later exits through the gate, the license plate is recognized again and checked against the entry stored in the database. License plate detection and recognition systems can also be used in access control. For example, many companies use such a system to grant access only to vehicles of authorized personnel. In some countries, automatic number plate recognition (ANPR) systems installed at country borders automatically detect and monitor border crossings. Each vehicle can be registered in a central database and compared to a blacklist of stolen vehicles.
License plate recognition algorithms are generally composed of the following three steps: 1) license plate detection, to locate the license plate region; 2) license plate segmentation, to segment the plate characters; and 3) license plate recognition, to recognize each character.

A. License plate detection
In this step, I compare three algorithms for license plate detection: the histogram approach, sliding concentric windows, and Haar-like wavelet cascade based detection.

Histogram approach
The flowchart of the histogram approach is shown in Figure 1. According to Gohil (2010), the histogram approach has the advantages of being simple and fast [3].

Convert the RGB Image into Grayscale
The license plate detection algorithm in this study operates on grayscale images, so the RGB image is converted to grayscale before further processing.

Dilate the Image
Dilation is one of the fundamental morphological operations. Dilation adds pixels to the boundaries of objects in an image. The number of pixels added to the objects depends on the size and shape of the structuring element used to process the image. Dilation can also remove noise within an image. By making the edges sharper, it increases the difference in gray value between neighboring pixels at the edge of an object, which enhances edge detection. In license plate detection, the image of a car plate does not always have the same brightness and shading, and it is sometimes broken or noisy. The image also has to be converted from RGB to grayscale, and during this conversion certain important information, such as differences in color and the lighter edges of objects, may be lost. Dilation helps to compensate for these losses.
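The two preprocessing steps above can be sketched in a few lines. This is a minimal pure-Python illustration on nested lists, not the MATLAB implementation used in the experiments; the function names, the luminance weights (the common ITU-R BT.601 values), and the square 3x3 structuring element are assumptions made for the example.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to integer gray levels.

    The weights are the common BT.601 luminance weights (an assumption;
    the source does not state which conversion was used).
    """
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def dilate(gray, radius=1):
    """Grayscale dilation with a square structuring element of side 2*radius + 1."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Each output pixel is the maximum over its (clipped) neighborhood.
            out[y][x] = max(
                gray[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            )
    return out


# A single bright pixel grows into a 3x3 block after one dilation.
image = [[(0, 0, 0)] * 5 for _ in range(5)]
image[2][2] = (255, 255, 255)
dilated = dilate(to_grayscale(image))
```

The example shows the defining property of dilation: bright objects grow by the radius of the structuring element, which is what closes small gaps in plate edges.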

Horizontal and Vertical Edge Processing
A histogram is a graph representing the values of a quantity over a range. The license plate detection algorithm in this study uses horizontal and vertical histograms, which represent the column-wise and row-wise histograms of the image, as shown in Figure 2. Each histogram value is the sum of the differences in gray value between neighboring pixels of the image. To prevent loss of important information in the next steps, it is better to smooth out drastic changes in the histogram values. To do so, the histogram is passed through a low-pass digital filter: each histogram value is averaged with the values on its left-hand and right-hand sides. This step is performed on both the horizontal and the vertical histogram.
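A small sketch of the edge histograms and the low-pass filter described above, in pure Python. The function names and the moving-average width are hypothetical; the source does not give the filter length.

```python
def column_histogram(gray):
    """For each column, sum |difference| between vertically adjacent pixels."""
    h, w = len(gray), len(gray[0])
    return [sum(abs(gray[y][x] - gray[y - 1][x]) for y in range(1, h))
            for x in range(w)]


def row_histogram(gray):
    """For each row, sum |difference| between horizontally adjacent pixels."""
    return [sum(abs(row[x] - row[x - 1]) for x in range(1, len(row)))
            for row in gray]


def smooth(hist, half_width=2):
    """Moving-average low-pass filter using neighbors on both sides.

    half_width is an assumed parameter; windows are clipped at the ends.
    """
    n = len(hist)
    out = []
    for i in range(n):
        window = hist[max(0, i - half_width):min(n, i + half_width + 1)]
        out.append(sum(window) / len(window))
    return out


# A row with alternating dark and bright pixels produces a large row value.
gray = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 10, 200, 10, 10],
    [10, 10, 10, 10, 10, 10],
]
rows = row_histogram(gray)
cols = column_histogram(gray)
smoothed_rows = smooth(rows, half_width=1)
```

Rows and columns that cross character strokes accumulate many large differences, which is why plate regions stand out in both histograms.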

Filtering out Unwanted Regions in the Image
After the histograms have been passed through the low-pass digital filter, a second filter is applied to remove unwanted areas from the image. In this case, the unwanted areas are the rows and columns with low histogram values. A low histogram value indicates that the corresponding part of the image contains very little variation among neighboring pixels. Since a region with a license plate contains a plain background with alphanumeric characters on it, the differences between neighboring pixels, especially at the edges of the characters and of the number plate, are very high. This results in a high histogram value for that part of the image. Therefore, a region that probably contains a license plate has high horizontal and vertical histogram values, and areas with lower values are no longer required. Such areas are removed from the image by applying a dynamic threshold. In this algorithm, the dynamic threshold is equal to the average value of the histogram. Both the horizontal and vertical histograms are passed through a filter with this dynamic threshold. The output of this process is a histogram showing the regions with a high probability of containing a number plate. The next step is to find all the regions in the image that have a high probability of containing a license plate. The coordinates of all such probable regions are stored in an array. Of these regions, the one with the maximum histogram value is considered the most probable candidate for the number plate. All the regions are processed row-wise and column-wise to find a common region having the maximum horizontal and vertical histogram values.
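The dynamic threshold and the selection of the most probable region can be illustrated on a single histogram. This is a minimal sketch with hypothetical function names; it treats one axis only, whereas the algorithm described above combines the row-wise and column-wise results.

```python
def candidate_bands(hist):
    """Return (start, end, peak) for each run of values above the mean of hist."""
    threshold = sum(hist) / len(hist)  # dynamic threshold = average value
    bands, start = [], None
    for i, value in enumerate(hist):
        if value > threshold and start is None:
            start = i                  # a new band begins
        elif value <= threshold and start is not None:
            bands.append((start, i - 1, max(hist[start:i])))
            start = None
    if start is not None:              # band runs to the end of the histogram
        bands.append((start, len(hist) - 1, max(hist[start:])))
    return bands


def best_band(hist):
    """Pick the band with the highest peak, or None if no value exceeds the mean."""
    return max(candidate_bands(hist), key=lambda band: band[2], default=None)


row_hist = [2, 3, 40, 55, 42, 4, 3, 20, 25, 2]
bands = candidate_bands(row_hist)
best = best_band(row_hist)
```

Here the mean is 19.6, so two bands survive (indices 2 to 4 and 7 to 8), and the first is chosen because it contains the maximum histogram value.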

Sliding concentric window
In the sliding concentric window (SCW) algorithm [1], the license plate is detected through irregularities in the texture of the image. The SCW method was developed to identify such local irregularities, using two statistical measurements: the standard deviation and the mean value. The algorithm is based on two steps.
1. Two concentric windows A and B, of size (2X1) x (2Y1) pixels and (2X2) x (2Y2) pixels respectively, are created; the windows are presented in Figure 3.
2. Statistical measurements in windows A and B are calculated.

If the ratio of the statistical measurements in the two concentric windows exceeds a threshold set by the user, the central pixel of the windows is considered to belong to a region of interest. After SCW is applied, there are six more steps: the complete license plate detection algorithm based on Anagnostopoulos et al. consists of seven steps. The main operations are as follows. SCW is used to indicate the region where the license plate is located. Image masking is used to isolate that region of the image. Sauvola binarization is used to threshold the isolated region so that ambient illumination, such as the sun or the camera flash, is accounted for. Connected component analysis (CCA) is used to identify which of the isolated regions is actually the license plate region; 8-connectivity is adopted in this step to connect the pixels of the same object together. Binary measurements are used to restrict the result by three parameters: aspect ratio, orientation, and Euler number.
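The SCW segmentation rule can be sketched as follows. This is an illustrative pure-Python version under several assumptions: the mean is used as the statistical measurement, the ratio is taken as outer window over inner window, windows are odd-sized so that they have a central pixel, and the window sizes and threshold are arbitrary example values rather than those of [1].

```python
def window_mean(gray, cx, cy, half_w, half_h):
    """Mean gray value in the window centered on (cx, cy), clipped to the image."""
    h, w = len(gray), len(gray[0])
    values = [gray[y][x]
              for y in range(max(0, cy - half_h), min(h, cy + half_h + 1))
              for x in range(max(0, cx - half_w), min(w, cx + half_w + 1))]
    return sum(values) / len(values)


def scw_mask(gray, inner=(1, 1), outer=(3, 3), threshold=1.5):
    """Mark pixels whose outer/inner mean ratio exceeds the threshold.

    inner and outer are (half_width, half_height) pairs; which window goes
    in the numerator is an assumption of this sketch.
    """
    h, w = len(gray), len(gray[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            m_a = window_mean(gray, x, y, *inner)
            m_b = window_mean(gray, x, y, *outer)
            if m_a > 0 and m_b / m_a > threshold:
                mask[y][x] = 1
    return mask


# A uniform image with one dark pixel: only its neighborhood is irregular.
gray = [[200] * 7 for _ in range(7)]
gray[3][3] = 0
mask = scw_mask(gray, inner=(1, 1), outer=(3, 3), threshold=1.05)
```

In uniform areas both windows have the same mean and the ratio stays near 1, so only pixels near a local irregularity are marked as belonging to a region of interest.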

Haar-like wavelet cascade based detection
Viola & Jones (2004) adapted the idea of using wavelets and developed Haar-like features [8]. Haar-like features are similar to convolution kernels, which are used to detect the presence of a feature in the image. Each feature yields a single value, calculated as the difference between the sum of the pixels under the white rectangle and the sum of the pixels under the black rectangle. The Haar features are shown in Figure 4. Viola and Jones used a base window size of 24 x 24 [8]. To simplify the calculation, they introduced the integral image. With an integral image, the cost of summing a rectangle does not depend on the number of pixels in it: the operation involves only four values of the integral image. Most of the computed features are irrelevant, and to deal with them Viola and Jones used AdaBoost. Training the classifier requires positive and negative images, and each feature is applied to all training images. Inevitably some images are misclassified, so the features with the minimum error rate are selected, that is, the features that best separate positives from negatives. In an image, most regions are negative, so the idea is to have a simple method for checking a window: if it is negative, it is not processed again. To this end, Viola and Jones introduced a cascade of classifiers. Instead of applying all features to a window, they grouped the features into different stages of classifiers and applied them one stage at a time [8].
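The integral image and a two-rectangle Haar-like feature can be illustrated with a short pure-Python sketch. The function names are hypothetical, and the sign convention (white minus black, with the white rectangle on the left) is an assumption chosen for the example.

```python
def integral_image(gray):
    """Return ii where ii[y][x] is the sum of gray over rows < y and columns < x.

    The result has one extra leading row and column of zeros.
    """
    h, w = len(gray), len(gray[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left (white) half minus right (black) half of a w x h window; w must be even."""
    half = w // 2
    white = rect_sum(ii, x, y, half, h)
    black = rect_sum(ii, x + half, y, half, h)
    return white - black


# A vertical edge: bright on the left, dark on the right.
gray = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(gray)
total = rect_sum(ii, 0, 0, 4, 4)
feature = two_rect_feature(ii, 0, 0, 4, 4)
```

Whatever the rectangle size, `rect_sum` reads exactly four entries of the integral image, which is what makes evaluating many features per window affordable.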

B. License plate segmentation
In this step, I compared three algorithms for license plate segmentation: morphological segmentation, Otsu segmentation, and Laplacian segmentation.

Morphological segmentation
The image is processed in several steps, shown in Figure 5. First, the license plate image is filtered with a median filter to reduce noise. Second, the output undergoes morphological processing: dilation and erosion are applied using a disk structuring element, and the difference between the dilated and eroded images is taken. Dilation adds pixels to the boundaries of objects in an image, and erosion removes pixels from the boundaries of objects. The number of pixels added or removed depends on the size and shape of the structuring element used to process the image. In this step, disk and line structuring elements are used. Dilation and erosion can also remove noise within the image, which makes the output cleaner. After morphological processing, the next step is to fill image regions and holes using a flood-fill operation on the grayscale image. For grayscale images, this brings the intensity values of dark areas that are surrounded by lighter areas up to the same intensity level as the surrounding pixels. In effect, it removes regional minima that are not connected to the image border. After this cleanup, the objects are found and a rectangular border is drawn around every character.
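The final step, finding the objects and drawing a rectangle around each character, can be sketched with a breadth-first search over a binary image. The source does not say how objects are found in this method; 8-connected component labeling is assumed here, and the function name is hypothetical.

```python
from collections import deque


def character_boxes(binary):
    """Return (left, top, right, bottom) for each 8-connected foreground blob."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first search over this blob, tracking its extent.
            left = right = x0
            top = bottom = y0
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)  # sorted by left edge, i.e. reading order


# Two toy "characters": a C-like shape and a vertical bar.
plate = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
boxes = character_boxes(plate)
```

If two characters touch after binarization they merge into one blob and one box, which is why the cleanup steps before this stage matter.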

Otsu segmentation
After the license plate is detected, the next step is character segmentation of the detected plate. Noise reduction and histogram equalization are performed first. I used a median filter to remove the effect of unwanted noise. Binarization is then applied, which requires an appropriate threshold. To binarize the plate image, I used the Otsu method, which selects the threshold automatically. According to Otsu (1979), the method converts a grayscale image into a binary image by calculating a threshold that splits the histogram into two classes [7]. The Otsu method iterates through all possible threshold values and selects the one that best separates the two classes, foreground and background. The last part of this step is segmentation, in which morphological operators are used to eliminate the plate boundary.
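The threshold search described above can be written compactly. This is a minimal pure-Python sketch of Otsu's criterion (maximizing the between-class variance, which is equivalent to minimizing the within-class variance) for 8-bit images; the function name and the convention that pixels above the threshold are foreground are choices made for the example.

```python
def otsu_threshold(gray):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class and pixels > t the other.
    """
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]            # number of pixels with value <= t
        sum_bg += t * hist[t]      # sum of their gray values
        w_fg = total - w_bg
        if w_bg == 0 or w_fg == 0:
            continue               # one class is empty: not a valid split
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Dark background around 20 with a few bright strokes around 200.
gray = [[20, 22, 21, 200],
        [19, 210, 205, 20]]
t = otsu_threshold(gray)
binary = [[1 if v > t else 0 for v in row] for row in gray]
```

Because the threshold is computed from the histogram of the whole plate, it is a single global value; uneven illumination across the plate can therefore merge or break characters.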

Laplacian segmentation
The Laplacian is an isotropic image processing operator: it responds equally in all directions of an image. It highlights regions of rapid intensity change and is therefore used for edge detection. Commonly, a Gaussian filter is applied before the Laplacian to reduce noise in the image. The operator normally takes a grayscale image as input and produces another grayscale image as output. It can be calculated using a convolution filter. Convolution is a process for multiplying two arrays of numbers, generally of different sizes but of the same dimensionality, to produce a third array of numbers of the same dimensionality. Since the input image is represented as a set of discrete pixels, a discrete convolution kernel is needed. Two commonly used small kernels are shown in Figure 6. Using one of these kernels, the Laplacian can be calculated using standard convolution methods.

Figure 7. When the template matches, it gives a high cross-correlation.
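The Laplacian convolution described in this subsection can be sketched as follows. Figure 6 is not reproduced here, so the kernel below is an assumption: it is one widely used 3x3 discrete approximation of the Laplacian. The function name is hypothetical and border pixels are simply left at zero.

```python
# A common 4-neighbor discrete Laplacian kernel (assumed, not taken from Figure 6).
LAPLACIAN_4 = [[0, -1, 0],
               [-1, 4, -1],
               [0, -1, 0]]


def convolve3x3(gray, kernel):
    """Convolve a 2D list with a 3x3 kernel; border pixels are left at zero."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # True convolution flips the kernel; for a symmetric kernel this
            # gives the same result as correlation.
            out[y][x] = sum(
                kernel[j][i] * gray[y + 1 - j][x + 1 - i]
                for j in range(3) for i in range(3)
            )
    return out


# A vertical step edge between gray levels 10 and 50.
gray = [[10, 10, 10, 50, 50] for _ in range(5)]
edges = convolve3x3(gray, LAPLACIAN_4)
```

The response is zero in flat regions and changes sign across the edge, so character boundaries appear as adjacent negative and positive values.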

Neural network
A neural network (NN) is an algorithm modeled on how the human brain interacts and learns new things. An NN consists of a number of simple units that work in parallel through weighted connections. Learning algorithms adjust these weights as the network processes information. The basic structure of a neural network has three layers: an input layer, a hidden layer, and an output layer. The layers are illustrated in Figure 8. Feature extraction is the process of transforming the input data into a reduced representation. In OCR using a neural network, feature extraction refers to extracting each character from the image, and it consists of segmentation and scaling. Segmentation isolates each character from the others and draws a bounding box around it. Scaling resizes the characters to a fixed size so that all characters have the same size.
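The scaling step can be illustrated with nearest-neighbor resampling. This is a sketch only: the interpolation method and function names are assumptions, and the 16x16 target size is the one reported later for the recognition experiment.

```python
def scale_nearest(char_img, size=16):
    """Resize a 2D list to size x size using nearest-neighbor sampling."""
    h, w = len(char_img), len(char_img[0])
    return [[char_img[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def to_feature_vector(char_img, size=16):
    """Scale a character image and flatten it into size * size values."""
    scaled = scale_nearest(char_img, size)
    return [pixel for row in scaled for pixel in row]


# A 2x2 checkerboard "glyph" becomes a 256-element input vector.
glyph = [[0, 1],
         [1, 0]]
vector = to_feature_vector(glyph)
```

Scaling every segmented character to the same size gives the network a fixed-length input regardless of how large the plate appears in the photograph.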
III. METHODOLOGY
This chapter describes the proposed methodology, including the system design and the implementation of license plate recognition.

A. Overview of the system
The main objective of my thesis is to recognize car license plates. Recognizing a car license plate means detecting the location of the license plate and extracting its characters. My approach is divided into eight experiments on three problems using two datasets, Greek license plates and Thailand license plates, as shown in Table 1. I divided each dataset into a training set (70% of the total images) and a testing set (30% of the total images). The work has three parts: license plate detection, license plate segmentation, and license plate recognition. In each part, I identify the algorithm with the best accuracy before proceeding. That is, after I find the most accurate of the three detection algorithms, I use its output as the input to license plate segmentation, and likewise the best segmentation output is the input to license plate recognition. In this way I obtain the combination of algorithms with the best accuracy for license plate recognition.
To perform my experiments, I collected training data from the two datasets as inputs for the detection and extraction of license plates. I performed two main evaluations: license plate detection and license plate extraction. For license plate detection, there are three approaches. For license plate extraction, there are two approaches. They are trained and tested on the two datasets. There are two evaluation criteria. The first is the license plate region, including its digits and letters, produced by the license plate detection process. The second is the characters extracted from the license plate by the license plate extraction process. Both are compared with the ground truth.
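The 70%/30% division can be sketched as below. The shuffling, the seed, and the function name are assumptions made for illustration; the source does not describe how images were assigned to each set. Integer arithmetic is used so that the set sizes are exact.

```python
import random


def split_dataset(items, train_percent=70, seed=0):
    """Shuffle items and split them into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)       # reproducible shuffle
    n_train = len(shuffled) * train_percent // 100
    return shuffled[:n_train], shuffled[n_train:]


# Image indices stand in for image files.
greek_train, greek_test = split_dataset(range(1040))
thai_train, thai_test = split_dataset(range(1023))
```

With 1040 and 1023 images this yields 728/312 and 716/307 images respectively, which agrees with the set sizes reported for the two datasets.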

B. Dataset
The datasets are a Greek license plate dataset and a Thailand license plate dataset. The Greek dataset can be downloaded at http://www.medialab.ntua.gr/research/LPRdatabase.html, and the Thailand dataset was collected by the AIT Vision and Graphics Laboratory. Samples of each dataset are shown in Figures 9 and 10.

No.  Dataset           Total images for training set  Total images for testing set  Total images
1    Greek dataset     728                            312                           1040
2    Thailand dataset  716                            307                           1023

C. License Plate Recognition
License plate recognition algorithms are generally composed of the following three steps.
1. License plate detection: to detect the location of the license plate region.
2. License plate segmentation: to obtain the segmentation of the plate characters.
3. License plate recognition: to recognize each character.
The three steps depend on each other sequentially. I obtain the result of the detection algorithm and then use it in the following segmentation and recognition steps. I compare the three detection algorithms and use the best result among them. For example, if the histogram approach had the best accuracy among the license plate detection algorithms, I would use it as the detection algorithm and combine it with each segmentation algorithm to find the most accurate combination.

D. Creating the label file YML
The YML file is an output file used to store the details of the ground truth images, training images, and test images. The information consists of the width, height, plate coordinates, and plate attributes. The YML file is created so that results can be compared with the ground truth: if a result does not match the ground truth, it is counted as false. An example of the YML file is shown as follows.

E. Evaluation design
The AIT image cropper and detection analyzer are used to evaluate the results. First, I use the image cropper to create the ground truth; its output is a YML file that stores each plate's coordinates, width, height, and attributes. Further YML files are created as experiment output. I then compare the ground truth and experiment YML files with the AIT detection analyzer. The analyzer reports the accuracy of both plate detection and attributes.
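The comparison between ground truth and experiment output can be illustrated with a small sketch. The matching rule used by the AIT detection analyzer is not stated in the source; the example below assumes the common convention that a detection counts as a hit when its intersection-over-union (IoU) with the ground truth box is at least 0.5. The box format, file names, and function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def hit_rate(ground_truth, detections, min_iou=0.5):
    """Fraction of ground truth images whose detection overlaps enough.

    min_iou = 0.5 is an assumed criterion, not the analyzer's documented rule.
    """
    hits = sum(
        1 for name, truth in ground_truth.items()
        if name in detections and iou(truth, detections[name]) >= min_iou
    )
    return hits / len(ground_truth) if ground_truth else 0.0


truth = {"a.jpg": (10, 10, 100, 40), "b.jpg": (50, 60, 80, 30)}
found = {"a.jpg": (12, 11, 100, 40), "b.jpg": (200, 200, 80, 30)}
rate = hit_rate(truth, found)
```

In this toy case the first detection nearly coincides with the ground truth and the second misses entirely, giving a hit rate of 0.5.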

A. License plate detection
The first experiment I performed is on license plate detection. In this experiment I use three algorithms: the histogram approach, sliding concentric windows, and Haar-like wavelet cascade based detection, for both Greek and Thailand license plates. For the Haar cascade, the Thailand training set consists of 716 positive images and 1432 negative images, and the Greek training set consists of 728 positive images and 1456 negative images. For the histogram approach and sliding concentric windows, I used the same 728 Greek and 716 Thailand positive images, but these methods do not use negative samples. More details about the allocation of positive training and test images for both datasets can be seen in Table 2.
Both the histogram approach and sliding concentric windows are implemented in MATLAB, while the Haar cascade is implemented in OpenCV. As shown in Table 3, each algorithm gives different results. The best algorithm in this experiment is Haar cascade based detection. Using the Haar cascade, I obtained an 84% hit rate on the Greek dataset and an 86.5% hit rate on the Thailand dataset. Examples of positive testing samples are shown in Figures 11(a) and 12(a), and examples of negative training samples are shown in Figures 11(b) and 12(b). Since the Haar cascade had the best detection results, I use its output as input to the segmentation algorithms in the next step.
The Haar cascade has the highest hit rate because Haar features work like convolution kernels: each feature is compared against the image to find matching patterns. With the 24x24 base window for Haar features, I applied the features to 716 positive images and 1432 negative images. The response is high where the feature's pixel pattern matches the input image well. The classifier is thus trained on positive and negative images and then detects the presence of the features in the testing data. This differs from the histogram approach and sliding concentric windows, which are not trained on the data before testing. For that reason, the hit rates of the histogram approach and sliding concentric windows are not higher than that of the Haar cascade.

B. License plate segmentation
The second experiment I performed is on license plate segmentation. In this experiment, I tested three segmentation methods: morphological, Otsu, and Laplacian. To evaluate each method, I manually defined a rectangular ground truth bounding box for each character in each license plate and recorded it. I then compared the ground truth with the segmentation results of each method. I calculate two hit rates: the per-character hit rate and the per-plate hit rate. When the segmentation of a license plate is not correct for all of its characters, it is still useful to know the per-character hit rate. Based on my experiment, the Laplacian method has the highest accuracy. A summary of the segmentation results is shown in Table 4. Both the Otsu method and the Laplacian method use a Gaussian filter as preprocessing. Both are able to segment the plates; however, with Otsu the characters are connected to each other, whereas with the Laplacian they are not. The Laplacian uses a convolution kernel, while Otsu only calculates a threshold that splits the histogram into two classes. A global threshold treats every pixel the same way based only on its value, whereas the Laplacian works on the local pixel pattern.
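The difference between the two hit rates can be made concrete with a short sketch. The input format (one list of booleans per plate, one flag per ground truth character) and the function name are hypothetical; how a segmented character is judged correct against its ground truth box is outside this example.

```python
def segmentation_hit_rates(plates):
    """Return (per_character_rate, per_plate_rate).

    plates is a list of lists of booleans: one flag per ground truth
    character, True if that character was segmented correctly.
    """
    total_chars = sum(len(flags) for flags in plates)
    correct_chars = sum(sum(flags) for flags in plates)
    # A plate counts as a hit only if every one of its characters is correct.
    perfect_plates = sum(1 for flags in plates if all(flags))
    return correct_chars / total_chars, perfect_plates / len(plates)


results = [
    [True, True, True, True, True, True, True],
    [True, True, False, True, True, True, True],
    [True, True, True, True, True, True],
    [False, True, True, True, True, False],
]
per_char, per_plate = segmentation_hit_rates(results)
```

In this made-up example 23 of 26 characters are correct (about 88%), but only 2 of 4 plates are fully correct (50%), which shows how a few bad characters pull the per-plate rate well below the per-character rate.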

C. License plate recognition
As shown in Table 5, the last experiment I performed is license plate recognition. The recognition task identifies the characters inside the license plate. I use two algorithms, template matching and a neural network. As in the segmentation step, I report per-plate accuracy. The neural network has the best result in this step. Both the neural network and template matching use training data to recognize each character, but their accuracy differs in the experiments. The neural network tries every possibility because it uses three layers to learn the templates. Template matching uses the same templates, but it stops recognizing when the pixel values do not match. I created a neural network with three layers (256 → 16 → 36). I scaled each template to a 16x16 pixel image before converting it into a string, and I used 36 templates (10 for digits and 26 for letters).
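A forward pass through a three-layer network of this shape can be sketched in pure Python. Only the layer sizes (256 inputs, 16 hidden units, 36 outputs for 10 digits plus 26 letters) come from the text; the sigmoid and softmax activations, the random untrained weights, the Latin class labels, and all names are assumptions for illustration.

```python
import math
import random

CLASSES = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # 10 digits + 26 letters = 36


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for a fully connected layer (untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, layer):
    """Weighted sum plus bias for every unit of the layer."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
hidden_layer = make_layer(256, 16, rng)
output_layer = make_layer(16, 36, rng)


def classify(pixels):
    """Forward pass: 256 inputs -> 16 hidden units -> 36 class scores."""
    hidden = sigmoid(dense(pixels, hidden_layer))
    scores = softmax(dense(hidden, output_layer))
    best = max(range(len(scores)), key=scores.__getitem__)
    return CLASSES[best], scores


# A blank 16x16 character, flattened to 256 values.
label, scores = classify([0.0] * 256)
```

With untrained weights the predicted label is meaningless; training would adjust the weights so that the highest of the 36 scores corresponds to the correct character.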

V. CONCLUSION AND RECOMMENDATION
The Haar-like wavelet cascade performs better for license plate detection than the histogram approach and sliding concentric window algorithms. It has the highest accuracy on both the Greek and Thailand datasets: according to Table 3, it reaches 86.5% on the Thailand dataset and 84% on the Greek dataset. The Haar-like wavelet cascade takes a long time to train, but it gives better results than the other algorithms, because the Haar features are used to detect the presence of the relevant features in the given image. Using the best detection result as input, Laplacian segmentation has higher accuracy than morphological and Otsu segmentation on both datasets. The hit rate is 93.7% for the Greek dataset and 95.94% for the Thailand dataset. On the Thailand dataset, segmentation is not yet accurate enough for good recognition on a per-image basis, because the Thailand plates contain special characters, not only alphanumeric ones. The last part is license plate recognition. The neural network has better accuracy than template matching: 83% for the Greek dataset and 79% for the Thailand dataset. The neural network uses three layers to perform the recognition, while template matching only matches pixels. Good segmentation is critical and needs to be improved, especially for noisy data such as the Thailand dataset.