Efficient Framework for Semantic Search on Web

Jindal, Vikas

Files

3975.pdf (2.06 MB)

Date

2016-08-01

Authors

Jindal, Vikas

Supervisors

Bawa, Seema

Batra, Shalini

Abstract

With the rapid and continuing growth of the Web and increasing dependence on it for retrieving relevant information, search engines have become the most popular and powerful tools for accessing information online. However, the Web pages returned by even well-known search engines are often not accurate or useful enough. The need to find the most relevant information has given rise to research in the field of semantic search. Traditional Web search methods, whose basic relevance criteria rely primarily on the presence of query keywords within the returned pages, need to be replaced with more effective semantic search techniques. Semantic search would give users a more intelligent way of finding what they are looking for within the global body of information available online. In this thesis, various approaches to semantic search on the Web are studied and analyzed, resulting in the identification of two broad perspectives on semantic search, as elaborated in the chapter on literature review. Fundamental limitations identified in the existing approaches are the major motivation for proposing an efficient semantic search approach. A framework for QUery-context based Information retrieval using Corpus Knowledge (QUICK) is then proposed and elaborated in the chapter on the proposed framework. Here, the Web pages returned by a baseline system in response to the original query are used to generate a corpus of words related to the query category. The word tokens lying in close proximity to the query keywords are assumed to be semantically related to the original query. The relative position and frequency of words with respect to the query word are given due importance through the probabilistic feature of the proposed approach, which in turn increases the probability of reaching the context of the query.
The approach shows that a set of context features can be generated efficiently in order to produce a more accurate model of the query topic. This context-oriented semantic search approach has been implemented using NLTK, an open source library of language processing features, integrated with the Python language interpreter. The details are presented in the chapter on design and implementation of QUICK. A category-specific user query is submitted to a standard search engine in order to retrieve the most relevant documents pertaining to that domain. The top-ranked returned documents are stored, and filtering techniques are applied to remove non-lexical tokens such as stop-words and non-alphabetic strings. The words lying in close proximity to the query keywords are extracted to be used as the context vector. The strength of association of the context vector features to the category is calculated and presented in the form of a list. The set of features with the best strength of association to the category is selected and treated as the context features of the category, to be used for the semantic expansion of queries pertaining to that category. Experiments comparing the result-set precision of the proposed QUICK-based semantic search with that of standard keyword-based search have been performed and are elaborated in the chapter on testing and validation. The proposed semantic search approach shows a significant improvement over the standard keyword-based approach. Finally, the findings of the thesis are summarized along with the potential scope for future work in this domain.
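
The pipeline described in the abstract (filter tokens, collect words near the query keywords, score their association, select the best features, expand the query) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the tiny stop-word list, the window size of 3, and the 1/distance weighting normalized into a probability-like score are all assumptions chosen for illustration. The thesis uses NLTK, whose tokenizer and stop-word corpus could replace the standard-library stand-ins used here.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); NLTK's stop-word corpus
# could be substituted here.
STOP_WORDS = {"a", "an", "the", "of", "and", "is", "in", "to", "for", "on", "with", "are", "at"}

def tokenize(text):
    """Lowercase the text and drop stop-words and non-alphabetic strings."""
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if w.isalpha() and w not in STOP_WORDS]

def context_scores(documents, query_terms, window=3):
    """Score words found near query terms in the filtered token stream.

    Each co-occurrence contributes 1/distance (assumed weighting), so both
    closeness and frequency raise a word's score. Scores are normalized to
    sum to 1.
    """
    query_terms = {q.lower() for q in query_terms}
    weights = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for i, tok in enumerate(tokens):
            if tok not in query_terms:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] not in query_terms:
                    weights[tokens[j]] += 1.0 / abs(j - i)
    total = sum(weights.values())
    return {w: s / total for w, s in weights.items()} if total else {}

def expand_query(query, documents, top_k=3, window=3):
    """Append the top_k highest-scoring context words to the query terms."""
    terms = query.split()
    scores = context_scores(documents, terms, window)
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return terms + ranked[:top_k]
```

In this sketch, `documents` stands in for the text of the top-ranked pages returned by the baseline search engine, and the list returned by `expand_query` would be resubmitted as the semantically expanded query. Distance is measured after filtering, so removed stop-words do not count toward the window.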

Description

PHD, CSED

Keywords

Semantic Web, Ontologies, Knowledge Corpus, Semantic Search

URI

http://hdl.handle.net/10266/3975

Collections

Doctoral Theses@CSED
