A Novel Approach Towards Devanagari Transliteration Using Statistical and Structural Feature Extraction

Kaur, Jasmine

dc.contributor.author	Kaur, Jasmine
dc.contributor.supervisor	Kumar, Vinay
dc.date.accessioned	2016-08-29T10:26:48Z
dc.date.available	2016-08-29T10:26:48Z
dc.date.issued	2016-08-29
dc.description.abstract	The majority of ancient Indian literature, such as the Bhagavad Gita, Vedas, Mahabharata, and Ramayana, is written in Devanagari script. Devanagari is popular in India but is known by only a small fraction of the population, whereas Roman script is widely adopted all over the world. To make the rich, voluminous Indian literature readily available to people who are unfamiliar with Devanagari, the documents can be transliterated into the more familiar Roman script. This dissertation attempts the Romanization of Devanagari documents using character recognition based on the underlying statistical and structural properties of the characters. The character recognition process interprets the document images and converts the text into an editable format. Moreover, automating this process will greatly reduce human intervention in converting Devanagari text documents to the more familiar, editable Roman script. However, it is a challenging task because of the complex structure and large size of the Devanagari character set compared to the limited size of the Roman alphabet. One of the first tasks performed to isolate the constituent characters is segmentation. The line segmentation methodology in this dissertation addresses the cases of overlapping and skewed lines. Overlapping line segmentation is based on the number of connected components, which is made equal to the number of individual lines in the image. Mathematical morphological operations, specifically closing and dilation, are used to limit the range of skew angle variation, thereby expediting the projection profile method of skew correction. The presented skew correction method works for the full range of angles. The proposed character segmentation algorithm is designed to segment conjuncts and separate shadow characters. The presented shadow character segmentation scheme employs the connected component method to isolate each character while keeping the constituent characters intact.
Statistical features, namely moments of different orders such as area, variance, skewness, and kurtosis, along with structural features of the characters, are employed in a two-phase recognition process. After recognition, the constituent Devanagari characters are mapped to the corresponding Roman letters such that the resulting Roman text has a pronunciation similar to that of the source characters. The algorithm is evaluated comprehensively on various Devanagari documents with positive results.	en_US
dc.identifier.uri	http://hdl.handle.net/10266/4194
dc.language.iso	en	en_US
dc.subject	Image processing	en_US
dc.subject	Devanagari	en_US
dc.subject	Sanskrit	en_US
dc.subject	Character Recognition	en_US
dc.subject	Feature Extraction	en_US
dc.title	A Novel Approach Towards Devanagari Transliteration Using Statistical and Structural Feature Extraction	en_US
dc.type	Thesis	en_US
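
The statistical features named in the abstract (area, variance, skewness, and kurtosis) can be illustrated with a minimal sketch. The abstract does not give the thesis's exact definitions, so everything below is an assumption: the function name `moment_features`, the input format (a binary image as a list of rows of 0/1 values), and the choice to compute the moments over the row and column coordinates of foreground pixels are all hypothetical and are not taken from the dissertation.

```python
def moment_features(image):
    """Moment-based features of a binary character image.

    `image` is a list of rows, each a list of 0/1 values (1 = ink).
    Hypothetical helper; the thesis may define its features differently.
    """
    # Row and column coordinates of every foreground pixel.
    rows = [r for r, line in enumerate(image) for v in line if v]
    cols = [c for line in image for c, v in enumerate(line) if v]

    # Zeroth-order moment: the number of foreground pixels.
    area = len(rows)
    if area == 0:
        raise ValueError("image has no foreground pixels")

    def stats(values):
        mean = sum(values) / area
        # Second, third, and fourth central moments.
        m2 = sum((v - mean) ** 2 for v in values) / area
        m3 = sum((v - mean) ** 3 for v in values) / area
        m4 = sum((v - mean) ** 4 for v in values) / area
        if m2 == 0:
            # All pixels share one coordinate; report 0.0 by convention
            # (an assumption made here to avoid dividing by zero).
            return {"variance": 0.0, "skewness": 0.0, "kurtosis": 0.0}
        return {
            "variance": m2,
            "skewness": m3 / m2 ** 1.5,   # standardized third moment
            "kurtosis": m4 / m2 ** 2,     # standardized fourth moment
        }

    return {"area": area, "rows": stats(rows), "cols": stats(cols)}


# A small plus-shaped glyph as a usage example.
glyph = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
features = moment_features(glyph)
```

A feature vector of this kind could be compared against stored vectors for known characters; how the dissertation combines it with structural features in its two-phase recognition is not described in the abstract.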

Files

Original bundle

Name: 4194.pdf
Size: 5.33 MB
Format: Adobe Portable Document Format

Collections

Masters Theses@ECED