Title: Semantic Web Mining of Unstructured Data
Authors: Manuja, Manoj
Supervisor: Garg, Deepak
Keywords: Semantic Web Mining;SVM;Semantic kernel;computer science
Issue Date: 6-Oct-2014
Abstract: Over the last couple of decades, web classification has gradually shifted from a syntax-centered to a semantics-centered approach that classifies text based on domain ontologies. These ontologies are either built manually or populated automatically using machine learning techniques. A prerequisite for building such a system is the availability of an ontology, either a full-fledged domain ontology or a seed ontology that can be enriched automatically. Every semantics-based text classification system has this dependency. We have designed, developed and implemented a web classification system that is self-governed in terms of ontology population and requires no pre-built ontology, whether full-fledged or seed. It starts from a user query, builds a seed ontology from it and automatically enriches that ontology by extracting concepts solely from the downloaded documents. The evaluated parameters, namely precision (85%), accuracy (86%), AUC (convex) and MCC (high positive), compare favorably with those of similar automated text classification systems. We have used Support Vector Machines (SVMs) to compute similarity and dissimilarity measures among concepts and features so that similar concepts are linked together for optimal knowledge discovery. The learning system has two components: a kernel machine that encapsulates the learning task and a kernel function that embodies the learning hypothesis. We first used a linear kernel function, which primarily exploits the syntactic structure of the text. To broaden the scope of knowledge extraction, we then exploited semantic kernel functions, which use a priori semantic information. Building the classification system with semantic kernel functions instead of linear kernel functions therefore forms the next step of our research.
We have sought to validate the performance and accuracy parameters obtained above by using a semantic kernel function in place of the linear kernel function. This also gives us an opportunity to explore the usefulness of semantic kernel functions in the context of semantic web mining. The evaluated parameters, namely precision (89.2%), accuracy (88%), AUC (larger convex area) and MCC (higher positive value), validate our framework, with improved performance and accuracy when semantic kernel functions are used instead of linear kernel functions. A few open issues remain as future directions for this research: fine-tuning the query manager in our framework, using OWL instead of RDF, and improving the performance of the overall system.
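The contrast between a linear and a semantic kernel can be illustrated with a minimal sketch. It is not the thesis implementation: the vocabulary, the term-proximity matrix S and all function names are hypothetical, and the semantic kernel shown is one common textbook form, K(x, y) = (xS)(yS)^T, in which a priori term similarities smooth the bag-of-words vectors before the dot product is taken.

```python
from typing import List, Sequence

# Hypothetical toy vocabulary (an assumption, not taken from the thesis).
VOCAB = ["car", "automobile", "banana"]

# Hypothetical a priori term-to-term proximity matrix S.
# S[i][j] says how semantically close term i is to term j.
S = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]


def bag_of_words(tokens: Sequence[str], vocab: Sequence[str]) -> List[float]:
    """Count how often each vocabulary term occurs in the token list."""
    return [float(tokens.count(term)) for term in vocab]


def linear_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain dot product: only terms shared verbatim contribute."""
    return sum(a * b for a, b in zip(x, y))


def smooth(x: Sequence[float], s: Sequence[Sequence[float]]) -> List[float]:
    """Project a term vector through the proximity matrix (computes x S)."""
    return [sum(x[i] * s[i][j] for i in range(len(x))) for j in range(len(s[0]))]


def semantic_kernel(
    x: Sequence[float], y: Sequence[float], s: Sequence[Sequence[float]]
) -> float:
    """Dot product of the smoothed vectors, i.e. x S S^T y^T."""
    return linear_kernel(smooth(x, s), smooth(y, s))


vec_car = bag_of_words(["car"], VOCAB)
vec_automobile = bag_of_words(["automobile"], VOCAB)
vec_banana = bag_of_words(["banana"], VOCAB)

if __name__ == "__main__":
    print("linear(car, automobile):  ", linear_kernel(vec_car, vec_automobile))
    print("semantic(car, automobile):", semantic_kernel(vec_car, vec_automobile, S))
    print("semantic(car, banana):    ", semantic_kernel(vec_car, vec_banana, S))
```

In this toy setting the linear kernel treats "car" and "automobile" as unrelated because they share no surface term, while the semantic kernel assigns them a positive similarity. Because the kernel has the form x S S^T y^T it is positive semidefinite, so it could in principle be supplied to an SVM as a precomputed kernel.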
Description: PhD-CSED-Thesis
Appears in Collections: Doctoral Theses@CSED

Files in This Item:
File: 3243.pdf
Size: 1.83 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.