UNL Punjabi Deconverter

Abstract

The World Wide Web is a formidable tool for communication and information access. With simple equipment, it is possible to access innumerable documents on a huge variety of topics from anywhere in the world. Despite this abundance of information, however, language is often a barrier. Because most web pages today are written in a few widely used languages such as English, French, and Chinese, a person with insufficient knowledge of these languages finds it difficult to use this tool for communication and information. This has prompted the need for means of automatically converting information from one natural language to another, known as Machine Translation. The process requires syntactic and semantic analysis of both the source and target languages. Interlingua-based machine translation has received considerable attention because of the economy of translation effort it offers and the additional attraction that the Interlingua provides a knowledge representation scheme.

This thesis presents a language-independent deconverter for the Punjabi language that takes a UNL (Universal Networking Language) expression as input. For conversion we use an Interlingua that follows the UNL specifications proposed by UNU/IAS Tokyo. UNL is a language used to represent a semantic graph equivalent of a concept contained in a text document. The system takes a set of UNL expressions as input and, with the help of a language-independent algorithm and language-dependent data, generates the corresponding Punjabi sentence. Deconversion involves three phases: syntax planning, case marker generation, and morphology. The syntax planning phase aims to generate the proper sequence of words for the target sentence. The input UNL file is first read and converted into a semantic-net-like structure known as a nodenet, a directed acyclic graph (DAG) that represents the sentence. Lexicon files map the UWs (Universal Words) to target-language words. Once the nodenet has been generated, syntax plan generation reduces to a DAG traversal problem: a proper traversal of the nodenet yields the syntax plan of the target sentence. The syntax plan is then processed using the case-marking file, which applies the proper case marker for each relation. The output of the case-marking phase is in turn processed by the morphology phase, which produces the final form of the target sentence.
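
The syntax planning step described in the abstract can be illustrated with a small sketch. This is not the thesis implementation: the relation triples, the romanized lexicon entries, the `RELATION_PRIORITY` table, and all function names are hypothetical, chosen only to show how a nodenet might be built from UNL relations and traversed so that dependents precede their head, as a verb-final word order requires. Case marking and morphology are not modeled.

```python
from collections import defaultdict

# Hypothetical UNL relations as (relation, head UW, dependent UW) triples.
RELATIONS = [
    ("agt", "eat(icl>do)", "boy(icl>person)"),
    ("obj", "eat(icl>do)", "apple(icl>fruit)"),
]

# Illustrative lexicon: UW -> romanized placeholder for a target-language root.
LEXICON = {
    "eat(icl>do)": "kha",
    "boy(icl>person)": "munda",
    "apple(icl>fruit)": "seb",
}

# Assumed ordering of dependents under one head (lower value comes first).
RELATION_PRIORITY = {"agt": 0, "obj": 1}


def build_nodenet(relations):
    """Build the nodenet as an adjacency list: head -> [(relation, dependent)]."""
    nodenet = defaultdict(list)
    for rel, head, dep in relations:
        nodenet[head].append((rel, dep))
    return nodenet


def find_root(relations):
    """Return the single node that is a head but never a dependent."""
    heads = {head for _, head, _ in relations}
    deps = {dep for _, _, dep in relations}
    roots = heads - deps
    if len(roots) != 1:
        raise ValueError("expected exactly one root node")
    return next(iter(roots))


def syntax_plan(nodenet, node, visited=None):
    """Post-order traversal: emit each dependent subtree, then the head."""
    if visited is None:
        visited = set()
    if node in visited:  # a DAG node may be shared; emit it only once
        return []
    visited.add(node)
    plan = []
    children = sorted(
        nodenet.get(node, []),
        key=lambda rc: RELATION_PRIORITY.get(rc[0], 99),
    )
    for _rel, child in children:
        plan.extend(syntax_plan(nodenet, child, visited))
    plan.append(node)
    return plan


nodenet = build_nodenet(RELATIONS)
plan = syntax_plan(nodenet, find_root(RELATIONS))
words = [LEXICON[uw] for uw in plan]
print(words)
```

In this sketch the traversal order alone fixes the word sequence; in the system the abstract describes, the resulting plan would still pass through the case-marking and morphology phases before becoming a finished sentence.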