Please use this identifier to cite or link to this item: http://hdl.handle.net/10266/3424
Title: A hybrid technique to remove back-to-front interference in historical document images
Authors: Singhal, Arushi
Supervisor: Kumar, Rajiv
Keywords: Back-to-front interference;Historical documents;Bleed through effect;computer science;computer science applications
Issue Date: 28-Jul-2015
Abstract: The study of historical documents presents major challenges for researchers in fields such as history, political science, psychology, and computer science. Historical documents contain significant information of cultural and scientific value. Historical artifacts include documents, letters, newspapers, pictures, and maps, many of which are stored in libraries, museums, or government archives. However, because of preservation requirements, few people have access to this material, and such documents frequently degrade over time. Digitization is a possible way to ease access to this rich source of knowledge about a society's history. Digitized degraded documents require specialized processing to remove different kinds of noise and to improve readability. However, handling these documents is extremely delicate. Software that automatically performs work the user would otherwise do manually can bring great financial and historical benefits, along with better preservation. The problem is aggravated when a document is written on both sides, because over time the ink from the back of the paper tends to seep through and obscures the text on the other side when the paper is digitized. This effect is called “ink bleed-through” or “back-to-front interference”. Among the document image processing steps, segmentation is one of the most important, as it identifies what needs to be recognized. The first step of segmentation is thresholding (or binarization) of the image. Binarization identifies which pixels belong to the foreground and which belong to the background. Misclassified pixels can impair subsequent stages of processing. We present a new approach to this problem that first filters the background using ideas from visual perception theory.
When an observer stands back from a document, he or she loses the details of the image, because the acuity of human vision decreases with distance. Distant objects project smaller images onto the retina, so as the distance from the object increases, fine details disappear and only the main colors remain. This idea is used to binarize degraded historical documents and remove “back-to-front interference”.
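The abstract does not give the thesis's actual algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions. It treats "standing back" as a mean filter that blurs away fine strokes to estimate the background, then marks a pixel as foreground when it is clearly darker than that estimate. The function names `simulate_distance` and `binarize`, the mean filter itself, and the `radius` and `min_contrast` values are all hypothetical choices for illustration, not taken from the thesis.

```python
def simulate_distance(img, radius):
    """Mean-filter a grayscale image (list of rows, 0 = black, 255 = white).

    Assumption: a simple box average stands in for the loss of detail
    seen from a distance; the thesis may use a different filter.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Window is clipped at the image border.
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [img[j][i] for j in ys for i in xs]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def binarize(img, radius=1, min_contrast=40):
    """Return a mask with 1 for foreground ink and 0 for background.

    A pixel counts as foreground only if it is darker than the blurred
    background estimate by more than min_contrast (a hypothetical value),
    so faint bleed-through from the back side is left in the background.
    """
    background = simulate_distance(img, radius)
    return [
        [1 if background[y][x] - img[y][x] > min_contrast else 0
         for x in range(len(img[0]))]
        for y in range(len(img))
    ]


# Toy page: light paper, one dark front-side stroke, one faint bleed-through mark.
page = [[200] * 5 for _ in range(5)]
page[2][2] = 20    # strong front-side ink
page[0][4] = 170   # faint ink seeping through from the back
mask = binarize(page)
```

In this toy page the dark stroke is kept as foreground while the faint mark is absorbed into the background. On real scans, the window size and contrast threshold would need tuning to the stroke width and the strength of the interference.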
Description: M.Tech-Computer Science Applications
URI: http://hdl.handle.net/10266/3424
Appears in Collections:Masters Theses@CSED

Files in This Item:
File: 3424.pdf
Size: 2.3 MB
Format: Adobe PDF

