Word Sense Disambiguation for Hindi Language

Abstract

Hindi is a national language of India, spoken by 500 million people, and ranks 4th among the most widely spoken languages in the world. Language nevertheless remains a barrier to realizing the benefits of the Information Technology revolution in India. Adequate natural language processing (NLP) support is therefore needed so that users who know a regional language can interact with and operate computer-based systems in a natural language such as Hindi. A language translator is an important tool for addressing this problem. Word Sense Disambiguation (WSD) is an important component of machine translation: a tool is needed that lets a computer interpret a word in its proper sense according to its context. WSD is the process of identifying which sense of a word is used in a given sentence. A word can have several senses, which gives rise to ambiguity; something is ambiguous when it can be understood in two or more ways or has more than one meaning. WSD is an ‘intermediate task’: it is not an end in itself, but is necessary at one level or another to accomplish most natural language processing tasks. WSD can thus be framed as the problem of selecting a sense for a word from a set of predefined possibilities, where the sense inventory usually comes from a dictionary or thesaurus. This thesis discusses the different approaches to WSD, namely knowledge-based, machine learning based, and hybrid approaches, and then attempts to solve the disambiguation problem using the Hindi WordNet developed at IIT Bombay, which contains words and their sets of synonyms, called synsets. Using the words in these synsets, we attempt to resolve ambiguity by comparing the different senses of a word in the sentence with the words present in the WordNet synsets and with the part-of-speech information associated with those words. WordNet is considered to be the most important resource available to researchers in computational linguistics, text analysis, and many related areas.
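
The synset comparison described above can be illustrated with a minimal sketch. It is a simplified overlap-based (Lesk-style) heuristic and not necessarily the thesis's exact procedure. The sense inventory `TOY_INVENTORY`, the sense identifiers, and the function `disambiguate` are hypothetical and hand-written for illustration; they are not taken from Hindi WordNet, and no Hindi WordNet API is called.

```python
from typing import Dict, List, Optional

# Hypothetical toy sense inventory (NOT real Hindi WordNet data).
# Each sense carries a part-of-speech tag and a set of related words
# standing in for the members of a synset.
TOY_INVENTORY: Dict[str, List[dict]] = {
    "सोना": [  # "sona": either "gold" (noun) or "to sleep" (verb)
        {"id": "sona_gold", "pos": "noun",
         "words": {"स्वर्ण", "कंचन", "धातु", "गहना", "आभूषण"}},
        {"id": "sona_sleep", "pos": "verb",
         "words": {"नींद", "शयन", "रात", "बिस्तर", "आराम"}},
    ]
}


def disambiguate(target: str,
                 context: List[str],
                 pos: Optional[str] = None,
                 inventory: Dict[str, List[dict]] = TOY_INVENTORY) -> Optional[str]:
    """Return the id of the sense whose word set overlaps most with the context.

    If a part-of-speech tag is given and at least one sense matches it,
    only the matching senses are considered. Ties, including the case of
    zero overlap, go to the sense listed first.
    """
    senses = inventory.get(target, [])
    if pos is not None:
        matching = [s for s in senses if s["pos"] == pos]
        senses = matching or senses  # fall back if the tag matches nothing

    context_words = set(context) - {target}
    best_id, best_overlap = None, -1
    for sense in senses:
        overlap = len(context_words & sense["words"])
        if overlap > best_overlap:
            best_id, best_overlap = sense["id"], overlap
    return best_id


# Usage: pre-tokenized sentences containing the ambiguous word.
sleep_sentence = ["रात", "को", "जल्दी", "सोना", "चाहिए"]
gold_sentence = ["सोना", "एक", "कीमती", "धातु", "है"]
print(disambiguate("सोना", sleep_sentence))
print(disambiguate("सोना", gold_sentence))
```

In this sketch the part-of-speech tag acts only as a filter applied before the overlap count; a real system would obtain both the synsets and the tags from Hindi WordNet and a tagger rather than from a hand-built dictionary.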

Description

M.E (CSED)
