Please use this identifier to cite or link to this item: http://hdl.handle.net/10266/4205
Title: Analysis of Big Data through Deduplication Technique
Authors: Garg, Sanjeev
Supervisor: Bala, Anju
Keywords: Big data; Cloud Computing
Issue Date: 30-Aug-2016
Abstract: Data available on the web comes in heterogeneous formats such as text, video, and audio. Hence, there is a need to integrate data from different sources and analyze it so that it can be used for efficient query execution. If the data is not analyzed properly, the execution time for processing user queries increases and the results do not match the user's needs. The data therefore needs to be analyzed after the differently formatted data has been combined into a single format. After integration, the data becomes large, and different types of de-duplication techniques are needed to analyze it, because differently formatted data may contain the same record and so introduce redundancy. There are different data de-duplication techniques for removing redundant or similar data. This work introduces another de-duplication technique, in which a format comparison of the data is performed after the heterogeneous data has been integrated into the same format. Finally, the experimental results validate its efficiency in terms of execution time, storage space, and success.
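The general idea in the abstract, bringing records from different source formats into one common format and then removing duplicates, can be illustrated with a minimal sketch. This is not the thesis's method: the function names (`normalize`, `fingerprint`, `deduplicate`), the sample records, and the choice of exact matching on a SHA-256 hash of a canonical form are all assumptions made for illustration.

```python
import csv
import hashlib
import io
import json


def normalize(record):
    """Map a record to a canonical form: lowercase keys, trimmed lowercase values."""
    return {
        str(key).strip().lower(): str(value).strip().lower()
        for key, value in record.items()
    }


def fingerprint(record):
    """Hash the canonical JSON form so that equivalent records share a digest."""
    canonical = json.dumps(normalize(record), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Keep the first record seen for each fingerprint, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        digest = fingerprint(record)
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique


# Hypothetical sample data: one person appears in both a CSV and a JSON source.
csv_source = "Name,City\nAsha Rao,Patiala\nRavi Kumar,Delhi\n"
json_source = (
    '[{"name": "asha rao", "city": "Patiala "},'
    ' {"name": "Meena Shah", "city": "Pune"}]'
)

# Integrate both sources into one list of dictionaries, then de-duplicate.
records = list(csv.DictReader(io.StringIO(csv_source))) + json.loads(json_source)
unique_records = deduplicate(records)
```

In this sketch, the CSV row for "Asha Rao" and the JSON entry for "asha rao" collapse into one record once case and surrounding whitespace are normalized. Exact hashing only catches records that become identical after normalization; detecting merely similar records would need a different comparison step.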
URI: http://hdl.handle.net/10266/4205
Appears in Collections: Masters Theses@CSED

Files in This Item:
File: 4205.pdf; Size: 3.9 MB; Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.