English to Hindi Statistical Machine Translation System
Abstract
Machine Translation is an important part of Natural Language Processing. It refers to
using a machine to convert text from one natural language to another. Statistical Machine
Translation is a branch of Machine Translation that applies the machine learning
paradigm to translating text. A Statistical Machine Translation system consists of a
Language Model (LM), a Translation Model (TM) and a decoder.
In this thesis, an English to Hindi Statistical Machine Translation system has been
developed. The Language Model, Translation Model and decoder were built using
software available in the Linux environment: the SRI Language Modeling Toolkit
(SRILM) for the Language Model, GIZA++ and mkcls for the Translation Model, and
Moses for decoding.
The LM computes the probability of target language sentences. The TM calculates the
probability of a target sentence given the source sentence, and the decoder finds the
target language translation with the highest probability.
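
The interplay of the three components can be illustrated with a minimal sketch. It is not the thesis implementation: the candidate translations (transliterated Hindi for "he goes home"), the probability values, and the function name `decode` are all hypothetical, and a real decoder such as Moses searches a far larger hypothesis space rather than a fixed list.

```python
import math

# Hypothetical candidates for the source sentence "he goes home".
# All probabilities below are made up for illustration only.
LM_PROB = {  # language model: how fluent is the target sentence?
    "vah ghar jata hai": 0.0200,
    "vah jata ghar hai": 0.0010,
    "ghar vah hai jata": 0.0001,
}
TM_PROB = {  # translation model: how well does it match the source?
    "vah ghar jata hai": 0.30,
    "vah jata ghar hai": 0.35,
    "ghar vah hai jata": 0.30,
}


def decode(candidates, lm_prob, tm_prob):
    """Return the candidate with the highest combined LM and TM score."""
    def score(sentence):
        # Sum log-probabilities instead of multiplying raw probabilities,
        # which avoids numeric underflow on longer sentences.
        return math.log(lm_prob[sentence]) + math.log(tm_prob[sentence])

    return max(candidates, key=score)


best = decode(list(LM_PROB), LM_PROB, TM_PROB)
print(best)
```

In this toy setting the second candidate has the highest TM score, but its low LM score rules it out, which is the role the LM plays in favouring fluent output.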
A parallel corpus of 5000 sentences in English and Hindi was used to train
the system. The system was evaluated manually, and geometric average scores of
2.693 for fluency and 2.93 for adequacy were obtained.
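
As a rough illustration of how a geometric average of manual ratings can be computed, consider the sketch below. The ratings and the function name `geometric_mean` are hypothetical; the abstract does not state the rating scale or the individual scores, so none of these values come from the thesis.

```python
import math


def geometric_mean(scores):
    """Geometric mean of positive scores, computed in log space."""
    if not scores:
        raise ValueError("scores must not be empty")
    if any(s <= 0 for s in scores):
        raise ValueError("geometric mean requires positive scores")
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


# Hypothetical fluency ratings given by a human judge to four sentences.
fluency_ratings = [2, 3, 4, 3]
gm = geometric_mean(fluency_ratings)
print(round(gm, 3))
```

Compared with the arithmetic mean, the geometric mean is pulled down more strongly by low ratings, so a few poor translations weigh more heavily on the final score.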
Description
M.E. (Software Engineering - CSED)
