Improving Efficiency of Web Crawler Algorithm Using Parametric Variations
Abstract
Billions of Web pages are published on the Internet through the World Wide Web,
and all of us rely on the Internet as a source of information. This information is
available in many forms: websites, databases, images, sound, video, and more.
A search engine classifies its search results by keyword matches, link analysis, or other
mechanisms that may not be entirely clear to the end user. Search engines help us
gather information from their own indexed databases. The Web crawler of such a search
engine traverses Web pages to collect this large body of information.
This thesis explains the basic architecture of a crawler-based search engine and its
offline (query-independent) and online (query-dependent) operation. It also explains
the architecture of the Web crawler, how it works, and the various algorithms it uses
to index the Web. The PageRank value of a page is calculated by treating the Web as a graph.
An improved relevance formula was implemented by combining PageRank with a topic
similarity measure computed on hyperlink metadata; this gives better ranking to the
documents relevant to a given user query. Experimental results for the metrics, which were
implemented in C, indicate that the implemented formula gives better ranking
than the other metrics.
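
As an illustration of the approach summarized in the abstract, the following is a minimal sketch in C of PageRank computed over a small link graph, followed by one possible way to blend the result with a topic similarity value. It is not the formula implemented in the thesis, which the abstract does not state. The names pagerank and combined_score, the row-major adjacency array, the even redistribution of rank from pages without outgoing links, and the linear weight alpha are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PAGES 64  /* illustrative upper bound on graph size (assumption) */

/*
 * Power-iteration PageRank on a graph of n pages.
 * links is a row-major n*n array: links[i * n + j] != 0 means page i
 * has a hyperlink to page j. Rank held by pages with no outgoing links
 * is spread evenly over all pages (one common convention).
 * Returns the number of iterations performed.
 */
int pagerank(size_t n, const int *links, double damping,
             int max_iterations, double tolerance, double *rank)
{
    double next[MAX_PAGES];
    int outdeg[MAX_PAGES];
    int it = 0;

    assert(n > 0 && n <= MAX_PAGES);

    /* Count outgoing links and start from a uniform distribution. */
    for (size_t i = 0; i < n; i++) {
        outdeg[i] = 0;
        for (size_t j = 0; j < n; j++)
            if (links[i * n + j])
                outdeg[i]++;
        rank[i] = 1.0 / (double)n;
    }

    while (it < max_iterations) {
        double dangling = 0.0;
        double change = 0.0;

        for (size_t i = 0; i < n; i++)
            if (outdeg[i] == 0)
                dangling += rank[i];

        /* Random-jump term plus the share contributed by dangling pages. */
        for (size_t j = 0; j < n; j++)
            next[j] = (1.0 - damping) / (double)n
                    + damping * dangling / (double)n;

        /* Each page passes its damped rank equally along its out-links. */
        for (size_t i = 0; i < n; i++) {
            if (outdeg[i] == 0)
                continue;
            double share = damping * rank[i] / (double)outdeg[i];
            for (size_t j = 0; j < n; j++)
                if (links[i * n + j])
                    next[j] += share;
        }

        /* Measure the total change and copy the new ranks into place. */
        for (size_t j = 0; j < n; j++) {
            double d = next[j] - rank[j];
            change += (d < 0.0) ? -d : d;
            rank[j] = next[j];
        }
        it++;
        if (change < tolerance)
            break;
    }
    return it;
}

/*
 * Hypothetical linear blend of PageRank and topic similarity.
 * alpha in [0, 1] weights the link-based score; this weighting is an
 * assumption, not the relevance formula from the thesis.
 */
double combined_score(double page_rank, double topic_similarity, double alpha)
{
    return alpha * page_rank + (1.0 - alpha) * topic_similarity;
}
```

A caller would fill links from the crawled graph, call pagerank with a damping factor such as the conventional 0.85, and then pass each page's rank and its similarity to the query into combined_score. In practice the two inputs would likely need to be normalized to a comparable scale before blending.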
Description
M.E. (CSED)
