queXF ICR Process
Since version 1.12.0, queXF has included an Intelligent Character Recognition (ICR) process to detect isolated handwritten characters. The system is available for testing but is not yet optimised, so it may run slowly.
The ICR process is broken into the following 7 steps:
- Character isolation
- Noise reduction
- Boundary removal
- Resizing
- Thinning
- Feature extraction
- Training or Recognition
queXF already provides character isolation, because it treats each “box” on a form as an independent entity. Where text is entered into an individual character or number field, queXF knows the coordinates of the box on the page. Using the page edges, queXF determines the rotation, zoom (scale) and offset of the scanned page, applies these transformations to the box coordinates, and then extracts the character image from the page image.
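The coordinate transformation described above can be sketched as follows. This is a minimal illustration, not queXF's code: the function names, the order of operations (rotate about the origin, then scale, then offset) and the list-of-rows image layout are all assumptions made for the example.

```python
import math

def transform_point(x, y, angle_rad, scale, offset_x, offset_y):
    """Rotate (x, y) about the origin, scale it, then shift it by the offset.

    The order rotate -> scale -> offset is an assumption for this sketch.
    """
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    rx = x * cos_a - y * sin_a
    ry = x * sin_a + y * cos_a
    return rx * scale + offset_x, ry * scale + offset_y

def transform_box(box, angle_rad, scale, offset_x, offset_y):
    """Map a template box (x1, y1, x2, y2) to an axis-aligned box on the scan."""
    x1, y1, x2, y2 = box
    corners = [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
    moved = [transform_point(x, y, angle_rad, scale, offset_x, offset_y)
             for x, y in corners]
    xs = [p[0] for p in moved]
    ys = [p[1] for p in moved]
    # Round outwards so the whole transformed box stays inside the crop.
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))

def crop(image, box):
    """Extract the box from an image stored as a list of pixel rows."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]
```

For example, `transform_box((10, 20, 30, 40), 0.0, 1.0, 5, -5)` shifts the box by the offset only, and the result can be passed straight to `crop`.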
Original box location overlay on scanned form:
Box location after rotation, zoom and offset:
Examples of extracted original images of the handwritten letter A:
Given the isolated character image, queXF then removes “salt and pepper” noise (usually introduced via scanning) using the kFill algorithm.
See: kfill_modified function in functions/functions.ocr.php (with a k value of 5)
The kFill algorithm was proposed in: K. Chinnasarn, Y. Rangsanseri, P. Thitimajshima: Removing Salt-and-Pepper Noise in Text/Graphics Images. Proceedings of The Asia-Pacific Conference on Circuits and Systems (APCCAS'98), pp. 459-462, 1998
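To illustrate what removing “salt and pepper” noise means, here is a deliberately simplified stand-in. It is not the kFill algorithm and not queXF's kfill_modified function: it only flips single pixels whose eight neighbours all disagree with them, whereas kFill uses a k x k window with connectivity rules. The function name and the 1 = filled, 0 = empty image layout are assumptions for the example.

```python
def despeckle(image):
    """Return a copy of a binary image with isolated single pixels flipped.

    image is a list of rows; 1 = filled (ink), 0 = empty (paper).
    Simplified illustration only; this is NOT kFill.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            # Collect the values of the in-bounds 8-neighbours.
            neighbours = [
                image[ny][nx]
                for ny in range(y - 1, y + 2)
                for nx in range(x - 1, x + 2)
                if (ny, nx) != (y, x) and 0 <= ny < height and 0 <= nx < width
            ]
            if image[y][x] == 1 and sum(neighbours) == 0:
                result[y][x] = 0  # lone filled pixel on empty background
            elif image[y][x] == 0 and len(neighbours) == 8 and sum(neighbours) == 8:
                result[y][x] = 1  # lone hole fully surrounded by ink
    return result
```

Decisions are made against the original image and written to a copy, so the result does not depend on the scan order.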
Examples of the characters after noise reduction. Notice the effect on the dots on the left hand side of the third character:
The BOX_EDGE value in functions/functions.ocr.php
The boundary removal function is an implementation of part 4, "Noise Cleaning along Character Image Boundaries", from: Dipti Deodhare, NNR Ranga Suri and R Amit, Preprocessing and Image Enhancement Algorithms for a Form-based Intelligent Character Recognition System. International Journal of Computer Science & Applications, Vol. II, No. II, pp. 131-144
Examples of boundary removal. Notice the bottom line on the second character has been removed, along with the dots on the left hand edge of the third character:
The whitespace around the character is then discarded by detecting a bounding box, and the character image within the bounding box is resized to a standard size (queXF uses 44x34 pixels).
See: resize_bounding in functions/functions.ocr.php
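The bounding box and resize step can be sketched as below. This is an illustration under assumptions, not the resize_bounding function itself: the function names are hypothetical, nearest-neighbour sampling is an assumed choice, and the source does not say whether 44x34 means width x height or the reverse, so both dimensions are parameters.

```python
def bounding_box(image):
    """Return (x1, y1, x2, y2) enclosing all filled pixels, or None if empty."""
    rows = [y for y, row in enumerate(image) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1

def resize_to_standard(image, out_width=34, out_height=44):
    """Crop to the bounding box and resize with nearest-neighbour sampling.

    The default of 34 wide by 44 high is an assumption about the orientation.
    """
    box = bounding_box(image)
    if box is None:
        return [[0] * out_width for _ in range(out_height)]
    x1, y1, x2, y2 = box
    src_w, src_h = x2 - x1, y2 - y1
    return [
        [image[y1 + (oy * src_h) // out_height][x1 + (ox * src_w) // out_width]
         for ox in range(out_width)]
        for oy in range(out_height)
    ]
```

Because the crop happens before the resize, stray pixels far from the character would shrink it in the output, which is why the earlier noise reduction and boundary removal steps matter.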
Examples of resizing. Notice that, thanks to the noise reduction and boundary removal, the resized image can include the entire character in the box:
Thinning is the process of reducing an image to a skeleton that is 1 pixel wide. This removes the effect of the “thickness” of pen strokes on character recognition, so that only the shape is left. The Zhang-Suen algorithm has been implemented in queXF to thin the images.
See: thinzs_np in functions/functions.ocr.php
The thinning function was ported from analysis.c. The algorithm is described in: T. Y. Zhang, C. Y. Suen, A fast parallel algorithm for thinning digital patterns, Communications of the ACM, v.27 n.3, p.236-239, March 1984
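For readers unfamiliar with Zhang-Suen thinning, the following is a minimal sketch of the two-pass algorithm as described in the 1984 paper. It is not a port of thinzs_np; the function name and the list-of-rows binary image (1 = filled) are assumptions, and border pixels are simply left untouched.

```python
def zhang_suen_thin(image):
    """Thin a binary image (list of rows, 1 = filled) to a 1-pixel skeleton."""
    img = [row[:] for row in image]
    height, width = len(img), len(img[0])

    def neighbours(y, x):
        # P2..P9: clockwise, starting from the pixel directly above.
        return [img[y - 1][x], img[y - 1][x + 1], img[y][x + 1], img[y + 1][x + 1],
                img[y + 1][x], img[y + 1][x - 1], img[y][x - 1], img[y - 1][x - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for y in range(1, height - 1):
                for x in range(1, width - 1):
                    if img[y][x] != 1:
                        continue
                    p = neighbours(y, x)
                    p2, p3, p4, p5, p6, p7, p8, p9 = p
                    filled = sum(p)  # B(P1): number of filled neighbours
                    # A(P1): 0 -> 1 transitions around the ring P2..P9, P2
                    transitions = sum(
                        1 for i in range(8) if p[i] == 0 and p[(i + 1) % 8] == 1
                    )
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= filled <= 6 and transitions == 1 and ok:
                        to_clear.append((y, x))
            # Clear after the scan so each sub-iteration acts in parallel.
            for y, x in to_clear:
                img[y][x] = 0
            if to_clear:
                changed = True
    return img
```

Applied to a bar three pixels thick, this leaves a single-pixel line along the middle row, slightly shorter than the original bar.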
Examples of thinning applied to the resized characters:
Overview of image manipulation before feature extraction
Feature extraction is the process of identifying pertinent information in the image that can be used for training or recognition.
The process used by queXF is to calculate 16 features for an image:
- Split the image into 30 degree sectors around the centroid (12 sectors)
- For each sector, calculate the normalised vector distance: the sum of the distances between each filled pixel in the sector and the centroid, divided by the number of filled pixels in the sector. These become 12 “features” of the character image
- Split the image into 90 degree sectors around the centroid (4 sectors)
- For each sector, calculate the proportion of filled pixels in that sector compared to the entire image. These become 4 more “features” of the image
See: sector_distance in functions/functions.ocr.php
The algorithm for feature extraction is described in Section 2 of "Hand printed Character Recognition using Neural Networks" by Vamsi K. Madasu, Brian C. Lovell and M. Hanmandlu
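The 16 features listed above can be sketched as follows. This is an illustration of the description only, not the sector_distance function: the function name, the choice of where angle zero lies, and returning 0.0 for an empty sector are assumptions, and any further normalisation queXF may apply is not shown.

```python
import math

def sector_features(image):
    """Return 16 features: 12 mean centroid distances (30 degree sectors)
    followed by 4 filled-pixel proportions (90 degree sectors).

    image is a list of rows; any truthy value counts as a filled pixel.
    """
    filled = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    if not filled:
        return [0.0] * 16
    cx = sum(x for x, _ in filled) / len(filled)
    cy = sum(y for _, y in filled) / len(filled)

    dist_sum = [0.0] * 12
    dist_count = [0] * 12
    quad_count = [0] * 4
    for x, y in filled:
        # Angle measured from the positive x axis; image y grows downwards.
        angle = math.degrees(math.atan2(y - cy, x - cx)) % 360.0
        sector = min(int(angle // 30), 11)    # clamp guards against 360.0
        quadrant = min(int(angle // 90), 3)
        dist_sum[sector] += math.hypot(x - cx, y - cy)
        dist_count[sector] += 1
        quad_count[quadrant] += 1

    distances = [dist_sum[i] / dist_count[i] if dist_count[i] else 0.0
                 for i in range(12)]
    proportions = [count / len(filled) for count in quad_count]
    return distances + proportions
```

The four proportions always sum to 1 for a non-empty image, while the 12 distance features depend on the image size, which is one reason the characters are resized to a standard size first.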
Training or Recognition
Once the features have been extracted from a character image, the data can either form part of the training set, or be compared against an existing training set for recognition.
queXF implements training based on fuzzy logic. The 16 features extracted from each character image form 16 fuzzy sets. The mean and variance of each of the 16 fuzzy sets are calculated for each character. These form the knowledge base (KB).
See: generate_kb in functions/functions.ocr.php
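Building the knowledge base amounts to computing a per-feature mean and variance for each character. The sketch below shows one way to do that; it is not the generate_kb function. The function name, the dictionary layout and the use of the population variance (dividing by n) are assumptions.

```python
def build_knowledge_base(samples):
    """Compute per-feature mean and variance for each character.

    samples maps a character to a list of equal-length feature vectors,
    e.g. {"A": [[...16 floats...], [...], ...]}.
    """
    kb = {}
    for char, vectors in samples.items():
        n = len(vectors)
        columns = list(zip(*vectors))  # one tuple of values per feature
        means = [sum(col) / n for col in columns]
        # Population variance (divide by n); an assumption for this sketch.
        variances = [sum((v - m) ** 2 for v in col) / n
                     for col, m in zip(columns, means)]
        kb[char] = {"mean": means, "variance": variances}
    return kb
```

With real data each vector would hold the 16 extracted features, and more training samples per character give more stable variances.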
The “fuzzy distance” between the features of the character image to be recognised and each character in the knowledge base is calculated. The character with the minimum “fuzzy distance” is taken as the recognised character.
See: ocr_guess in functions/functions.ocr.php
The algorithms for training and recognition are described in Section 4.2 of "Hand printed Character Recognition using Neural Networks" by Vamsi K. Madasu, Brian C. Lovell and M. Hanmandlu
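Recognition by minimum distance can be sketched as follows. The distance formula here is an assumption chosen for illustration (one minus a Gaussian membership value, averaged over the features); the exact “fuzzy distance” used by ocr_guess and by the cited paper may differ. The function names, the knowledge base layout and the min_variance floor are likewise hypothetical.

```python
import math

def fuzzy_distance(features, entry, min_variance=1e-6):
    """Average of (1 - Gaussian membership) over all features.

    entry is {"mean": [...], "variance": [...]}. The formula is an assumed
    stand-in, not necessarily the one queXF uses.
    """
    total = 0.0
    for x, mean, var in zip(features, entry["mean"], entry["variance"]):
        # Floor the variance so a zero variance does not divide by zero.
        spread = 2.0 * max(var, min_variance)
        membership = math.exp(-((x - mean) ** 2) / spread)
        total += 1.0 - membership
    return total / len(features)

def guess_character(features, kb):
    """Return the character in kb with the smallest fuzzy distance."""
    return min(kb, key=lambda char: fuzzy_distance(features, kb[char]))
```

For example, with a knowledge base holding "A" centred on (1, 1) and "B" centred on (5, 5), the feature vector (1.2, 0.9) is classified as "A".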