An Approach to Extract Features from Document Image for Character Recognition

2013 
In this paper we present a technique for extracting features from a document image that machine learning algorithms can use to recognize the characters in that image. The proposed method takes a scanned image of a handwritten character from a paper document as input and processes it through several stages to extract effective features. The image is converted to binary, and the character object is segmented from the background and resized to a common resolution. A morphological thinning operation is applied to the resized object, which is then scanned for features. Feature values are estimated by counting how often each of a set of predefined shapes occurs in the character object. These frequencies are stored in a vector, and every element of that vector is treated as a single feature value, or attribute, of the corresponding image. The feature vectors of individual character objects can then be used to train a suitable machine learning algorithm to classify a test object. In this paper a k-nearest neighbor classifier is used in simulation to assign each handwritten character to one of the recognized character classes. The proposed technique is fast to compute, has low complexity, and improves the performance of classifiers in matching handwritten characters to their machine-readable form.
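
The feature extraction and classification stages described above can be illustrated with a minimal sketch. It assumes the input is already a binarized and thinned character image held in a NumPy array; the resizing and thinning stages are left out. The four 3x3 templates in `SHAPES`, the function names, and the choice of `k` are hypothetical, since the abstract does not list the actual predefined shapes or classifier settings.

```python
import numpy as np

# Hypothetical 3x3 shape templates (assumption: the abstract does not
# specify which predefined shapes the method counts).
SHAPES = {
    "horizontal": np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
    "vertical":   np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    "diag_down":  np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]),
    "diag_up":    np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]]),
}


def crop_to_object(binary):
    """Segment the object from the background by cropping to its bounding box."""
    rows = np.flatnonzero(binary.any(axis=1))
    cols = np.flatnonzero(binary.any(axis=0))
    return binary[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


def shape_frequencies(skeleton):
    """Return a feature vector: exact-match counts of each template."""
    padded = np.pad(skeleton, 1)  # zero border so every pixel has a 3x3 window
    height, width = skeleton.shape
    counts = np.zeros(len(SHAPES), dtype=int)
    for r in range(height):
        for c in range(width):
            window = padded[r:r + 3, c:c + 3]
            for i, template in enumerate(SHAPES.values()):
                if np.array_equal(window, template):
                    counts[i] += 1
    return counts


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training vectors."""
    distances = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(distances)[:k]
    labels, votes = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(votes)]


# Toy usage: a thinned horizontal stroke on a 5x5 canvas.
stroke = np.zeros((5, 5), dtype=int)
stroke[2, :] = 1
features = shape_frequencies(crop_to_object(stroke))

# Toy training set of feature vectors with made-up class labels.
train_X = np.array([[3, 0, 0, 0], [4, 0, 0, 0], [0, 3, 0, 0], [0, 4, 0, 0]])
train_y = np.array(["-", "-", "|", "|"])
predicted = knn_predict(train_X, train_y, features, k=3)
```

Exact template matching is used here only to keep the sketch short; a real system would likely need a richer shape set and a tolerance for stroke noise.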