Lexical Disambiguation using English WordNet with Natural Language Toolkit

dc.contributor.author: Kukreja, Swati
dc.contributor.supervisor: Batra, Shalini
dc.date.accessioned: 2014-08-12T07:55:12Z
dc.date.available: 2014-08-12T07:55:12Z
dc.date.issued: 2014-08-12T07:55:12Z
dc.description: ME, CSED [en]
dc.description.abstract: The growth of information technology has produced large amounts of unstructured data, such as Web pages, document warehouses, and blog corpora. Consequently, there is increasing demand for automated methods of lexical disambiguation, i.e. Word Sense Disambiguation (WSD), to process this information. The task is difficult because it requires overcoming the complexities of natural language and recognizing semantic structure in unstructured text, and research continues toward resolving it with the best possible accuracy. WSD is regarded as an artificial intelligence problem: identifying the meaning of a word from the context in which it appears. Here, lexical ambiguity in a sentence is resolved with the Lesk Algorithm, modified so that the Part of Speech (POS) of the ambiguous word is first predicted by a Decision Tree Classifier. This largely resolves the problem of determining the correct POS and lets the Lesk Algorithm restrict its search to a single POS of the ambiguous word. The output is the ‘sense’ that best matches the context of the sentence in which the word occurs. Experimental results showed that the accuracy of sense determination improved. The resulting sense was further evaluated by computing its similarity scores (Wu-Palmer and Jiang-Conrath) with the words in the context bag. The modified Lesk Algorithm also helped in obtaining the correct translation of ambiguous words into Punjabi and Hindi. [en]
dc.format.extent: 1327283 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10266/2870
dc.language.iso: en [en]
dc.subject: Lexical Disambiguation [en]
dc.subject: Wordnet [en]
dc.title: Lexical Disambiguation using English WordNet with Natural Language Toolkit [en]
dc.type: Thesis [en]
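
The abstract describes a Lesk-style procedure restricted to one Part of Speech. The following is a minimal sketch of that idea, not the thesis implementation: the sense inventory `SENSES`, the stopword list, and the function names are hypothetical, the glosses are invented for illustration rather than taken from WordNet, and the POS is passed in as an argument where the thesis predicts it with a Decision Tree Classifier.

```python
from typing import Optional, Set

# Hypothetical toy sense inventory: word -> list of (sense_id, pos, gloss).
# A real system would read senses and glosses from WordNet instead.
SENSES = {
    "bank": [
        ("bank.n.01", "n", "sloping land beside a body of water such as a river"),
        ("bank.n.02", "n", "a financial institution that accepts deposits and lends money"),
        ("bank.v.01", "v", "tip an aircraft laterally while turning in flight"),
    ],
}

# Small illustrative stopword list (an assumption, not from the thesis).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "that",
             "such", "as", "while", "i", "my", "at"}


def tokens(text: str) -> Set[str]:
    """Lowercase, strip surrounding punctuation, and drop stopwords."""
    words = (w.strip(".,;:!?'\"").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def simplified_lesk(word: str, sentence: str, pos: Optional[str] = None) -> Optional[str]:
    """Return the sense whose gloss shares the most words with the context.

    If `pos` is given, only senses with that Part of Speech are considered,
    which mirrors the restriction described in the abstract.
    """
    context = tokens(sentence) - {word}
    best_sense, best_overlap = None, -1
    for sense_id, sense_pos, gloss in SENSES.get(word, []):
        if pos is not None and sense_pos != pos:
            continue  # skip senses outside the predicted POS
        overlap = len(context & tokens(gloss))
        if overlap > best_overlap:  # ties keep the earlier-listed sense
            best_sense, best_overlap = sense_id, overlap
    return best_sense


noun_sense = simplified_lesk("bank", "I went to the bank to deposit money", pos="n")
verb_sense = simplified_lesk("bank", "The pilot had to bank the aircraft in flight", pos="v")
```

Restricting candidates by POS shrinks the set of glosses that must be compared, so a correct POS prediction can rule out wrong senses before any overlap is counted.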

Files

Original bundle

Name: 2870.pdf
Size: 1.27 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.79 KB
Format: Item-specific license agreed upon to submission