Segmentation of Punjabi Speech Signals into Phonemes Using Hidden Markov Models

Bansal, Divya

dc.contributor.author	Bansal, Divya
dc.contributor.supervisor	Jindal, Khushneet
dc.date.accessioned	2012-07-23T07:47:50Z
dc.date.available	2012-07-23T07:47:50Z
dc.date.issued	2012-07-23T07:47:50Z
dc.description	M.Tech. (Computer Science Applications)	en
dc.description.abstract	Speech synthesis, the artificial production of human speech, has recently become an important area of research among speech scientists, computer scientists, and linguists. Speech synthesis systems, often called text-to-speech (TTS) systems, "read" text from a document, Web page, or similar source and generate speech in the form of audio wave or MP3 files. TTS systems are useful mainly for visually impaired people, especially those with poor vision or visual dyslexia, for illiterate people who can understand their spoken native language, and for educational and research purposes. All TTS systems aim to produce high-quality synthesized speech that is both natural and intelligible, so that the user can understand and interpret it correctly. This thesis attempts to implement speech synthesis support for the Punjabi language on mobile devices. This is achieved by segmenting a speech database into smaller units using the HMM Toolkit (HTK), based on hidden Markov models (the HTS approach); these units are then concatenated to generate speech signals. The proposed system converts English text, in the form of a caller’s name stored in the contact list, into Punjabi speech on mobile phones. The input text is first processed in a pre-processing stage that handles titles (as in Mr. Tapas), numbers (as in Bharat1234), and initials (as in K.K. Sharma); the processed data is then used in the training and testing phases of HTK. With HTK, HMM acoustic models are first trained using spectral features (Mel-cepstral coefficients) extracted from the recorded Punjabi speech corpus, and context-independent monophone models and context-dependent triphone models are generated. For example, for the word “bharat”, the generated monophones include a, bh, and t, and the generated triphones include bh-a+r.
Later, in the testing phase, the correct phoneme sequence for the test sample word is selected from a network of all possible combinations using the HMM models and feature vectors. For example, for the word “Tapas”, the output phoneme sequence is ਤ, ਪ, ਸ rather than ਟ, ਪ, ਸ. These phoneme sequences are given as input to the application, which generates speech signals by concatenating the phonemes.	en
dc.description.sponsorship	School of Mathematics and Computer Applications, Thapar University, Patiala	en
dc.format.extent	2223987 bytes
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/10266/1775
dc.language.iso	en	en
dc.subject	Hidden Markov Models	en
dc.subject	HMM Tool Kit	en
dc.subject	Punjabi Speech Corpus	en
dc.subject	Speech Synthesis	en
dc.subject	Phoneme Generation	en
dc.title	Segmentation of Punjabi Speech Signals into Phonemes Using Hidden Markov Models	en
dc.type	Thesis	en

Files

Original bundle

Name: 1775.pdf
Size: 2.13 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.79 KB
Format: Item-specific license agreed upon to submission

Collections

Masters Theses@CSED