Algorithm for automatic text retrieval from images of book covers

dc.contributor.author: Yadav, Niharika
dc.contributor.supervisor: Kumar, Vinay
dc.date.accessioned: 2015-10-28T09:15:28Z
dc.date.available: 2015-10-28T09:15:28Z
dc.date.issued: 2015-10-28T09:15:28Z
dc.description: ME, ECED [en]
dc.description.abstract: Text extraction is one of the major areas of research in the field of document image analysis. Text retrieval is needed for bibliographic databases, image structuring, and similar applications. Text embedded in multimedia data, as a well-defined model of concepts for human communication, carries much semantic information about the content. If it can be extracted and harnessed efficiently, this text information can provide a much truer form of content-based access to image and video documents. Moreover, automating this process will greatly reduce the human intervention needed to convert books (specifically their covers, where the task becomes extremely difficult) into a readable and editable electronic format, particularly for electronic book readers. However, this is a challenging task because the text in images varies in size, style, orientation, and alignment, and the images often suffer from low contrast and noise and have complex background structure. This dissertation proposes a method for extracting text from images of book covers and embedded text. A new text model is constructed to retrieve text regions from scene text images. The image is first clustered to reduce the number of color variations, a suitable plane is then identified, and the text region is segmented using a connected-component-based method. The extracted text is then enhanced to improve the results. A detailed study of the techniques proposed so far, along with an analysis of their performance, is also included. The algorithm is evaluated comprehensively on various datasets, including the ICDAR 2011 dataset. The experimental results demonstrate that the proposed text detection method captures the inherent properties of text and discriminates text from other objects efficiently. The proposed method gives a very high character recognition rate for monochrome images; however, where the text features vary drastically, rejection is noticeable. [en]
dc.format.extent: 5824581 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10266/3825
dc.language.iso: en [en]
dc.subject: Image Processing [en]
dc.subject: ECED [en]
dc.title: Algorithm for automatic text retrieval from images of book covers [en]
dc.type: Thesis [en]
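
The abstract outlines a pipeline of color clustering, plane selection, and connected-component segmentation. The following is a minimal sketch of that general idea, not the thesis's actual algorithm, which this record does not specify. It assumes a small k-means for the clustering step, assumes the text plane is the cluster covering the fewest pixels, and uses a hypothetical min_size threshold to discard noise; the enhancement step is omitted. All function names are illustrative.

```python
from collections import deque

def quantize_colors(image, k=2, iterations=10):
    """Cluster pixel colors with a small k-means (assumed clustering step)."""
    pixels = [p for row in image for p in row]
    distinct = sorted(set(pixels))
    k = min(k, len(distinct))
    # Deterministic start: spread the initial centres over the sorted colors.
    centres = [distinct[i * (len(distinct) - 1) // max(k - 1, 1)] for i in range(k)]

    def nearest(p):
        return min(range(k),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))

    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in pixels:
            groups[nearest(p)].append(p)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centres = [tuple(sum(ch) / len(g) for ch in zip(*g)) if g else centres[i]
                   for i, g in enumerate(groups)]
    return [[nearest(p) for p in row] for row in image]

def text_plane(labels):
    """Assumption: the cluster with the fewest pixels is the text plane."""
    counts = {}
    for row in labels:
        for label in row:
            counts[label] = counts.get(label, 0) + 1
    minority = min(counts, key=counts.get)
    return [[label == minority for label in row] for row in labels]

def connected_components(mask, min_size=2):
    """Label 4-connected regions; return bounding boxes (x0, y0, x1, y1)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            queue, component = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) >= min_size:  # hypothetical noise threshold
                ys = [p[0] for p in component]
                xs = [p[1] for p in component]
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

def extract_text_regions(image, k=2, min_size=2):
    return connected_components(text_plane(quantize_colors(image, k)), min_size)

B, T = (250, 250, 245), (20, 20, 30)  # toy background and text colors
image = [
    [B, B, B, B, B, B, B],
    [B, T, B, B, T, T, B],
    [B, T, B, B, T, T, B],
    [B, T, B, B, B, B, B],
    [B, B, B, B, B, B, T],  # the lone T pixel stands in for noise
]
print(extract_text_regions(image))
```

On this toy image the sketch reports one box for the vertical stroke and one for the 2x2 block, and drops the isolated pixel. A real book cover would need more clusters, a better plane-selection rule than "smallest cluster", and geometric filters on the components.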

Files

Original bundle

Name: 3825.pdf
Size: 5.55 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.78 KB
Format: Item-specific license agreed upon to submission
Description: