Improved Performance of RDF Data Using HIVE, PIG in HADOOP

Abstract

Semantic Web data is an efficient way to represent data on the World Wide Web. Semantic data is concerned with the meaning of data rather than its structure. With the rapid growth of Semantic Web data, well-managed storage and evaluation have become a requirement, so cloud data services play an important role. These services are based on the MapReduce programming model, which is known for its scalability, parallel processing, cost-effectiveness and flexibility. Hadoop is an open-source implementation of MapReduce. It has two components: HDFS (Hadoop Distributed File System) for storage and MapReduce for processing. Hadoop-based extensions such as PIG and HIVE provide query languages for high-level data flow. SPARQL is the query language for RDF (Resource Description Framework) and is considered the backbone of Semantic Web applications. Here we introduce HIVE and PIG for querying RDF data. We use a dataset consisting of 5000 triples and execute queries based on HIVE, PIG and SPARQL on it. The goal of this thesis is to compare the results of SPARQL, HIVE and PIG and to analyze the retrieval time for a query over RDF data. Finally, we conclude which framework is faster and more scalable for our dataset.
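
As a rough illustration of the kind of query the abstract describes, the following minimal Python sketch runs a single triple-pattern lookup as a map step followed by a reduce step and records the elapsed time. It is not the thesis's HIVE, PIG or SPARQL code: the sample triples, the `ex:` prefix and the function names `map_phase`, `reduce_phase` and `run_query` are all hypothetical, and everything runs in memory rather than on Hadoop.

```python
from collections import defaultdict
from time import perf_counter

# Hypothetical toy triples (subject, predicate, object); not the thesis dataset.
TRIPLES = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:worksAt", "ex:acme"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:carol", "ex:worksAt", "ex:acme"),
]


def map_phase(triples, predicate):
    """Emit (subject, object) pairs for triples whose predicate matches."""
    for s, p, o in triples:
        if p == predicate:
            yield s, o


def reduce_phase(pairs):
    """Group the emitted objects by subject, as a reducer would per key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def run_query(triples, predicate):
    """Run both phases and return (result, elapsed seconds)."""
    start = perf_counter()
    result = reduce_phase(map_phase(triples, predicate))
    return result, perf_counter() - start


result, elapsed = run_query(TRIPLES, "ex:knows")
print(result)
print(f"retrieval time: {elapsed:.6f} s")
```

Timing the same pattern under each engine on the same data is the sort of comparison the abstract refers to; in-memory timings from a sketch like this say nothing about how HIVE, PIG or SPARQL would actually perform.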

Description

Master of Engineering-CSE
