Please use this identifier to cite or link to this item: http://hdl.handle.net/10266/903
Title: Hybrid Approach to Classify Gurmukhi Script
Authors: Kaur, Antarpreet
Supervisor: Kumar, Rajiv
Keywords: OCR;Gurmukhi Script;Language Processing
Issue Date: 25-Aug-2009
Abstract: Extensive research has been done on optical character recognition (OCR) in the last few decades. Most of that effort has gone into OCR systems for foreign languages, which are now available in the market. For Indian languages, the majority of the reported work is on Hindi and Bangla, and very few reports are available on Gurmukhi script, which is used to write Punjabi, one of the popular languages of northern India. In an OCR system, segmentation, or more precisely character segmentation, is an important preprocessing step for text recognition. Segmentation can be done in various ways; in this work, we explore an approach that has not been applied to Gurmukhi script so far. The first stage, segmentation, takes an image of a document as input and separates its logical parts: the lines of a paragraph, the words of a line, and the characters of a word. Lines and words are segmented according to the horizontal and vertical projection profiles, respectively. The water reservoir concept is then used to segment the text into individual characters. We classify Gurmukhi characters into subclasses so that each segmented part can be tested to determine whether it represents a correct character. For this purpose a hybrid approach is used, which combines the water reservoir approach with a feature-based approach. The significant point of the scheme is that a character image is tested against only certain subsets of classes at each stage, which increases computational efficiency.
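The line and word segmentation described in the abstract can be illustrated with a minimal sketch. It assumes a binarized image stored as a list of rows (1 = ink, 0 = background). The function names, the `min_gap` parameter and the toy image are hypothetical and are not taken from the thesis; the water reservoir step for character segmentation is not covered.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background (assumed format)


def runs_of_ink(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of non-empty runs in a profile.

    Runs separated by fewer than `min_gap` empty positions are merged.
    """
    runs: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the run ended at the previous index
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], run[1])  # gap too small: same unit
        else:
            merged.append(run)
    return merged


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Split a page into text lines using the horizontal projection profile."""
    profile = [sum(row) for row in image]  # ink count per row
    return runs_of_ink(profile)


def segment_words(image: Image, line: Tuple[int, int],
                  min_gap: int = 2) -> List[Tuple[int, int]]:
    """Split one text line into words using the vertical projection profile."""
    top, bottom = line
    width = len(image[0]) if image else 0
    # ink count per column, restricted to the rows of this line
    profile = [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs_of_ink(profile, min_gap=min_gap)


# Toy page: two text lines, the first containing two words.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
]
lines = segment_lines(page)
words = [segment_words(page, line) for line in lines]
```

Here `lines` holds row ranges and `words` holds column ranges per line. In practice the gap threshold would have to be tuned to the scan resolution and font.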
URI: http://hdl.handle.net/10266/903
Appears in Collections: Masters Theses@SOM

Files in This Item:
File: 903.pdf
Size: 1.07 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.