Please use this identifier to cite or link to this item: http://hdl.handle.net/10266/5276
Title: Sanskrit to Hindi Statistical Machine Translation System
Authors: Kaur, Manmeet
Supervisor: Chana, Inderveer;Kumar, Ravinder
Keywords: Machine Translation;Statistical approach;Language model;Translation Model;Search
Issue Date: 20-Aug-2018
Abstract: Machine translation is the name given to computerized methods for translating all or part of a text from one natural language into another, with or without human aid. Machine translation of natural languages has benefited from years of semi-automatic and manual analysis by computer programs and linguists, which has proven fruitful in the form of publicly available dictionaries and translation systems. Computer software is used to translate text or speech from one language to another so as to bridge the communication barrier. There are many approaches to building a machine translation system. The main problems in the area are corpus availability, training the system, and searching for an accurate translation. A machine translation system also faces challenges of semantics, structural and lexical ambiguity, and word sense, all of which need to be addressed when building any translation system. In this research work, a Sanskrit to Hindi statistical machine translation system is built using Thot, an open-source toolkit for statistical machine translation. Thot supports training phrase-based models. The toolkit provides commands for each stage of building the system: corpus preprocessing, language and translation model training, generation of configuration files, parameter tuning, search, and post-processing of the output. The proposed system uses an n-gram model as the language model, a phrase alignment model as the translation model, and a branch and bound search algorithm. The system is deployed on the cloud using a Virtual Private Server. A Sanskrit to Hindi parallel corpus of 1000 sentences has been built for training the system. The system is evaluated using manual evaluation and a geometric average score. BLEU (Bilingual Evaluation Understudy) is a method for automatic evaluation of machine translation systems; Microsoft Translator Hub is used to calculate the BLEU score in this research work.
URI: http://hdl.handle.net/10266/5276
Appears in Collections: Masters Theses@CSED

Files in This Item:
File: manmeet thesis(801631011).pdf
Size: 3.37 MB
Format: Adobe PDF

