Lexical Disambiguation using English WordNet with Natural Language Toolkit

Abstract

The expansion of information technology has produced large amounts of unstructured data, such as Web pages, document warehouses, and blog corpora. Consequently, there is an increasing demand to process this information with automated methods of lexical disambiguation, i.e. Word Sense Disambiguation (WSD). The task is difficult: resolving it requires overcoming the complexities of language, and recognizing semantic structure in unstructured text is complicated, so research in this field continues with the aim of reaching the best possible accuracy. WSD is regarded as an artificial intelligence problem, namely the ability to recognize the meaning of words in the context of a given text. Here, lexical ambiguity in a sentence is resolved with the Lesk Algorithm, modified so that the Part of Speech (POS) of the ambiguous word is first predicted by a Decision Tree Classifier. This improves the accuracy of POS determination to a great extent and lets the Lesk Algorithm limit its effort to a single part of speech of the ambiguous word. The output is the ‘sense’ that best matches the context of the sentence in which the word occurs. Experimental results showed that the accuracy of determining the sense of a word improved. The resulting sense was further evaluated by computing its similarity scores (the Wu-Palmer and Jiang-Conrath similarity scores) with the words in the context bag. The modified Lesk Algorithm also helped in obtaining the correct translation of ambiguous words into Punjabi and Hindi.
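
The idea of restricting Lesk to one part of speech can be illustrated with a minimal sketch. It is not the thesis implementation: the sense inventory, sense names, glosses, and stopword list below are hypothetical toy data standing in for WordNet, the function name simplified_lesk is an illustrative choice, and the POS tag is passed in by hand where the thesis predicts it with a Decision Tree Classifier.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Sense(NamedTuple):
    name: str   # hypothetical sense identifier, styled after WordNet names
    pos: str    # part of speech: "n" (noun) or "v" (verb)
    gloss: str  # dictionary-style definition


# Toy inventory (an assumption for illustration, not real WordNet data).
TOY_SENSES: Dict[str, List[Sense]] = {
    "bank": [
        Sense("bank.n.01", "n", "sloping land beside a body of water such as a river"),
        Sense("bank.n.02", "n", "a financial institution that accepts deposits and lends money"),
        Sense("bank.v.01", "v", "tip an aircraft laterally while turning"),
    ]
}

STOPWORDS: Set[str] = {"a", "an", "the", "of", "to", "and", "that", "such",
                       "as", "in", "on", "at", "while", "i", "my"}


def tokenize(text: str) -> Set[str]:
    """Lowercase, strip simple punctuation, and drop stopwords."""
    words = {w.strip(".,;:!?").lower() for w in text.split()}
    return words - STOPWORDS - {""}


def simplified_lesk(word: str, sentence: str, pos: Optional[str] = None,
                    inventory: Dict[str, List[Sense]] = TOY_SENSES) -> Optional[Sense]:
    """Return the sense whose gloss shares the most words with the context.

    If pos is given, only senses with that part of speech are considered,
    which mirrors the POS restriction described in the abstract.
    """
    context = tokenize(sentence) - {word}
    candidates = [s for s in inventory.get(word, []) if pos is None or s.pos == pos]
    best, best_overlap = None, -1
    for sense in candidates:
        overlap = len(context & tokenize(sense.gloss))
        if overlap > best_overlap:  # ties keep the earlier-listed sense
            best, best_overlap = sense, overlap
    return best


if __name__ == "__main__":
    print(simplified_lesk("bank", "I went to the bank to deposit money", pos="n"))
    print(simplified_lesk("bank", "The pilot had to bank the aircraft while turning", pos="v"))
```

Passing pos shrinks the candidate set before any gloss comparison, so a wrong-POS sense can never win on accidental word overlap. With a real lexicon, the same filtering is typically available through a POS argument on the sense lookup.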

Description

ME, CSED
