Word Sense Disambiguation for Hindi Language
Abstract
Hindi is a national language of India, spoken by about 500 million people and ranked
fourth among the most widely spoken languages in the world. Yet language remains a
barrier to sharing the benefits of the Information Technology revolution in India.
Adequate natural language processing (NLP) measures are therefore needed so that users
can interact with computer-based systems in a natural language such as Hindi, and so
that people who know only a regional language can operate them. A language translator
is an important tool for addressing this problem. Word Sense Disambiguation (WSD) is an
important component of machine translation, and a tool is needed to perform it so that
computers can interpret a word in its proper sense according to its context.
Word Sense Disambiguation (WSD) is the process of identifying which sense of a word
is used in a given sentence. A word can have several senses, and this is termed
ambiguity: something is ambiguous when it can be understood in two or more ways, or
when it has more than one meaning. WSD is an ‘intermediate task’: it is not an end in
itself, but it is necessary at one level or another to accomplish most natural language
processing tasks. Put another way, WSD is the problem of selecting a sense for a word
from a set of predefined possibilities, where the sense inventory usually comes from a
dictionary or thesaurus.
This thesis discusses the different approaches to Word Sense Disambiguation (WSD),
namely knowledge-based, machine-learning-based and hybrid approaches, and then attempts
to solve the disambiguation problem using the Hindi WordNet developed at IIT Bombay,
which contains words and their sets of synonyms, called synsets. Using the words in
these synsets, we attempt to resolve ambiguity by comparing the different senses of a
word in the sentence with the words present in the WordNet synsets and with the
part-of-speech information associated with those words. WordNet is considered to be
among the most important resources available to researchers in computational
linguistics, text analysis and many related areas.
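The synset comparison described above can be illustrated with a minimal sketch. It is a simplified overlap heuristic written for illustration, not the thesis's actual implementation. The names `Sense`, `INVENTORY` and `disambiguate` are hypothetical, and the tiny sense inventory for the word सोना ("gold" or "to sleep") is hand-made toy data, not an extract from Hindi WordNet. Tokenization is plain whitespace splitting, with no morphological analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    gloss: str          # English label for the sense (illustrative only)
    pos: str            # part of speech attached to the sense
    related: frozenset  # synset members and related words (toy data)


# Hypothetical miniature sense inventory; NOT taken from Hindi WordNet.
INVENTORY = {
    "सोना": [
        Sense("gold", "noun",
              frozenset({"स्वर्ण", "कनक", "धातु", "गहना"})),
        Sense("to sleep", "verb",
              frozenset({"नींद", "शयन", "रात", "बिस्तर"})),
    ]
}


def disambiguate(word, context, pos=None):
    """Pick the sense whose related words overlap most with the context.

    word    -- the ambiguous target word
    context -- list of tokens from the sentence containing the word
    pos     -- optional part-of-speech tag used to narrow the candidates
    Returns a Sense, or None if the word is not in the inventory.
    """
    candidates = INVENTORY.get(word, [])
    if pos is not None:
        # Keep only senses with a matching tag; fall back to all senses
        # if the tag matches none of them.
        candidates = [s for s in candidates if s.pos == pos] or candidates
    if not candidates:
        return None
    context_words = set(context) - {word}
    # On a tie, max() returns the first listed sense.
    return max(candidates, key=lambda s: len(s.related & context_words))


# "Gold is a precious metal."
gold_sentence = "सोना एक कीमती धातु है".split()
# "One should go to sleep early at night."
sleep_sentence = "रात को जल्दी सोना चाहिए".split()

print(disambiguate("सोना", gold_sentence).gloss)
print(disambiguate("सोना", sleep_sentence).gloss)
```

In the first sentence the context word धातु (metal) matches the "gold" sense, and in the second रात (night) matches the "to sleep" sense. A real system would draw the candidate senses and their related words from the WordNet itself and would need to handle inflected forms, which this sketch ignores.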
Description
M.E (CSED)
