Analysis of Big Data through Deduplication Technique
Abstract
Data available on the web comes in heterogeneous formats such as text, video, and audio.
Hence, there is a need to integrate data from different sources and analyze it so that it
can be used for efficient query execution. If the data is not analyzed properly, user
queries take longer to execute and the results may not match the user's needs. The data
therefore needs to be analyzed after the differently formatted data has been converted
into a common format. After integration the data becomes large, and because differently
formatted sources may contain the same record, there is a chance of redundancy. Various
de-duplication techniques exist for removing redundant or similar data. This work
introduces another de-duplication technique, in which a format comparison of the data is
performed after the heterogeneous data has been integrated into the same format.
Finally, the experimental results validate its efficiency in terms of execution time,
storage space, and success.
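
The general idea of integrating records into one format and then removing
duplicates can be illustrated with a minimal sketch. This is not the technique
proposed in the work, whose details the abstract does not give; it is a generic
hash-based exact-match de-duplication, and the names normalize, fingerprint, and
deduplicate, along with the sample records, are hypothetical.

```python
import hashlib
import json


def normalize(record):
    """Bring a record into one canonical form (assumed rule set):
    lowercase, trimmed keys; whitespace-collapsed, case-folded values."""
    return {
        str(key).strip().lower(): " ".join(str(value).split()).casefold()
        for key, value in record.items()
    }


def fingerprint(record):
    """Hash the canonical form so equal records share one fingerprint."""
    canonical = json.dumps(normalize(record), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Keep the first occurrence of each distinct record, in input order."""
    seen = set()
    unique = []
    for record in records:
        fp = fingerprint(record)
        if fp not in seen:
            seen.add(fp)
            unique.append(record)
    return unique


# Hypothetical records, as if drawn from two differently formatted sources.
records = [
    {"Name": "Ada Lovelace", "City": "London"},
    {"name": " ada  lovelace ", "city": "LONDON"},
    {"name": "Alan Turing", "city": "Wilmslow"},
]
unique_records = deduplicate(records)
```

Here the second record differs from the first only in key case, letter case,
and spacing, so it is dropped once both are brought into the same form. Storing
only fingerprints in the seen set keeps memory use proportional to the number
of distinct records rather than to their size.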
