Algorithm for automatic text retrieval from images of book covers

Yadav, Niharika

dc.contributor.author	Yadav, Niharika
dc.contributor.supervisor	Kumar, Vinay
dc.date.accessioned	2015-10-28T09:15:28Z
dc.date.available	2015-10-28T09:15:28Z
dc.date.issued	2015-10-28T09:15:28Z
dc.description	ME, ECED	en
dc.description.abstract	Text extraction is one of the major areas of research in the field of document image analysis. Text retrieval is needed for applications such as bibliographic databases and image structuring. Text embedded in multimedia data, as a well-defined model of concepts for human communication, contains much semantic information related to the content. This text information can provide a truer form of content-based access to image and video documents if it can be extracted and harnessed efficiently. Moreover, automating this process will greatly reduce human intervention when converting books (specifically their covers, where this task becomes extremely difficult) to a readable and editable electronic format, particularly for electronic book readers. However, this is a challenging task because text in images varies in size, style, orientation, and alignment, and the images often have low contrast, noise, and complex background structure. This dissertation proposes a method for extracting text from images of book covers and embedded text. A new text model is constructed to retrieve text regions from scene text images. The image is first clustered to reduce the number of color variations, a suitable plane is identified, and the text region is then segmented using a connected-component-based method. The text thus obtained is then enhanced to improve the results. A detailed study of the various techniques proposed so far, along with an analysis of their performance, is also included in the work. The algorithm is evaluated comprehensively on various datasets, including the ICDAR 2011 dataset. The experimental results demonstrate that the proposed text detection method can capture the inherent properties of text and discriminate text from other objects efficiently. The proposed method gives a very high character recognition rate for monochrome images; however, in cases where there is a drastic variation in the text features, rejection is noticeable.	en
dc.format.extent	5824581 bytes
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/10266/3825
dc.language.iso	en	en
dc.subject	Image Processing	en
dc.subject	ECED	en
dc.title	Algorithm for automatic text retrieval from images of book covers	en
dc.type	Thesis	en
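
The pipeline outlined in the abstract (cluster the image, pick a plane, segment with connected components) can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes a grayscale image given as a list of rows, substitutes a simple 1-D k-means for the clustering step, and takes the plane as a cluster index chosen by the caller. The function names quantize, connected_components and text_candidates, and the min_size threshold, are hypothetical.

```python
from collections import deque


def quantize(pixels, k=2, iters=10):
    """Map each grayscale value to one of k cluster indices (1-D k-means)."""
    values = [v for row in pixels for v in row]
    lo, hi = min(values), max(values)
    # Assumption: initial centers are spread evenly over the intensity range,
    # so cluster 0 ends up as the darkest level.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def nearest(v):
        return min(range(k), key=lambda i: abs(v - centers[i]))

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v)].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return [[nearest(v) for v in row] for row in pixels]


def connected_components(mask):
    """Return each 4-connected region of True cells as a list of (row, col)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(region)
    return components


def text_candidates(pixels, plane, min_size=2):
    """Bounding boxes (top, left, bottom, right) of regions in the chosen plane."""
    labels = quantize(pixels)
    mask = [[label == plane for label in row] for row in labels]
    boxes = []
    for region in connected_components(mask):
        if len(region) >= min_size:  # drop isolated specks (illustrative threshold)
            rows = [p[0] for p in region]
            cols = [p[1] for p in region]
            boxes.append((min(rows), min(cols), max(rows), max(cols)))
    return boxes


# Toy image: two dark strokes on a light background.
image = [
    [200, 200, 200, 200, 200, 200],
    [200, 10, 10, 200, 20, 200],
    [200, 10, 200, 200, 20, 200],
    [200, 200, 200, 200, 200, 200],
]
boxes = text_candidates(image, plane=0)  # plane 0 = darkest cluster
print(boxes)
```

A real book cover would need color clustering over all channels, automatic plane selection, and geometric filtering of the components, none of which this sketch attempts.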

Files

Original bundle

Name: 3825.pdf
Size: 5.55 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.78 KB
Format: Item-specific license agreed to upon submission

Collections

Masters Theses@ECED