Offline Segmentation of Machine Printed Gurmukhi Script with Emphasis on Touching Characters

Abstract

Character segmentation is an important step in optical character recognition (OCR), because recognition accuracy depends heavily on the success rate of segmentation. Touching characters are a major source of segmentation errors. A lot of work has been done on segmentation for scripts like Roman (for English), Kanji (for Chinese) and Kana (for Japanese), but none of it is fully applicable to Gurmukhi script. Developing an OCR system for Gurmukhi script is difficult for several reasons: the characters in a word are topologically connected, two or more characters in a word may have intersecting minimum bounding rectangles, and some characters consist of multiple components. The presence of touching characters makes the task even harder. In the proposed work, the document image captured by a flatbed scanner is subjected to thinning (skeletonization), line segmentation, zone detection, word segmentation and character segmentation. An attempt is made to segment the touching characters in Gurmukhi script.

Keywords: OCR, Gurmukhi Script, Thinning (Skeletonization), Segmentation, Touching Characters.
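
The abstract names line and word segmentation as pipeline stages but does not say how they are carried out. As an illustration only, the following sketch uses projection profiles, a common technique for printed text that is assumed here and not taken from the source. The function names (runs_of_ink, segment_lines, segment_words) and the tiny binary image are hypothetical; 1 marks an ink pixel and 0 marks background.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def runs_of_ink(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for maximal runs of non-zero values."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # a blank gap ends the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Split the page into text lines using the horizontal projection profile."""
    profile = [sum(row) for row in image]  # ink pixels per row
    return runs_of_ink(profile)


def segment_words(image: Image, top: int, bottom: int) -> List[Tuple[int, int]]:
    """Split one text line (rows top..bottom-1) into words using the vertical profile."""
    width = len(image[0]) if image else 0
    profile = [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs_of_ink(profile)


# Hypothetical 5x7 page with two text lines; the first line holds two words.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0],
]

lines = segment_lines(page)
words = [segment_words(page, top, bottom) for top, bottom in lines]
```

Blank columns separate words in this sketch because the characters within a word are connected, as the abstract notes for Gurmukhi. Splitting a word into characters, and touching characters in particular, would need more than blank-gap detection and is not attempted here.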
