Segmentation of Punjabi Speech Signals into Phonemes Using Hidden Markov Models

dc.contributor.author: Bansal, Divya
dc.contributor.supervisor: Jindal, Khushneet
dc.date.accessioned: 2012-07-23T07:47:50Z
dc.date.available: 2012-07-23T07:47:50Z
dc.date.issued: 2012-07-23T07:47:50Z
dc.description: M.Tech. (Computer Science Applications) [en]
dc.description.abstract: Recently, the use of computers in speech synthesis has become an important area of research among speech scientists, computer scientists and linguists. Speech synthesis refers to the artificial production of human speech. For this purpose, speech synthesis systems, often called text-to-speech (TTS) systems, are developed that "read" text from a document, Web page, etc. and generate speech in the form of audio wave or mp3 files. TTS systems are especially useful for visually impaired people, particularly those with poor vision or visual dyslexia, for illiterate people who can understand their spoken native language, and for educational and research purposes. All TTS systems aim to produce high-quality synthesized speech that is natural and intelligible, so that it can be correctly understood and interpreted by the user. This thesis attempts to implement speech synthesis support for the Punjabi language on a mobile device. This is achieved by segmenting a speech database into smaller units using the HMM Toolkit (HTK), based on hidden Markov models (HTS approach); these units are then concatenated to generate speech signals. The proposed system converts English text, in the form of the caller's name stored in the contact list, into Punjabi speech on mobile phones. The input text is first processed in a pre-processing stage that handles titles (as in Mr. Tapas), numbers (as in Bharat1234) and initials (as in K.K. Sharma); the processed data is then used in the training and testing phases of HTK. With the help of HTK, HMM acoustic models are first trained using spectral features (Mel-cepstral coefficients) extracted from the recorded Punjabi speech corpus, and context-independent monophone models and context-dependent triphone models are generated. For example, for the word “bharat” the generated monophones are a, bh, t, etc., and the triphones include bh-a+r.
Later, in the testing phase, the correct phoneme sequence for the test sample word is selected from a network of all possible combinations using the HMM models and feature vectors. For example, for the word “Tapas” the output phoneme sequence is ਤ, ਪ, ਸ instead of the phoneme sequence ਟ, ਪ, ਸ. These phoneme sequences are given as input to the application, which generates speech signals by concatenating the phonemes. [en]
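The monophone-to-triphone step mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `to_triphones` is hypothetical, the phone sequence bh, a, r, a, t for “bharat” is an assumption (the abstract lists only a, bh, t and the triphone bh-a+r), and treating word-edge phones as biphones is one common labelling convention rather than a confirmed detail of the author's setup. Labels follow the left-centre+right notation used in the abstract.

```python
def to_triphones(phones):
    """Expand a monophone sequence into left-centre+right context labels.

    Phones at the word edges have only one neighbour, so they become
    biphones (an assumed convention, not confirmed by the source).
    """
    labels = []
    last = len(phones) - 1
    for i, centre in enumerate(phones):
        left = phones[i - 1] + "-" if i > 0 else ""
        right = "+" + phones[i + 1] if i < last else ""
        labels.append(left + centre + right)
    return labels


# Assumed monophone sequence for the word "bharat".
bharat = ["bh", "a", "r", "a", "t"]
print(to_triphones(bharat))
# ['bh+a', 'bh-a+r', 'a-r+a', 'r-a+t', 'a-t']
```

The second label, bh-a+r, matches the triphone quoted in the abstract; the remaining labels depend on the assumed phone sequence.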
dc.description.sponsorship: School of Mathematics and Computer Applications, Thapar University, Patiala [en]
dc.format.extent: 2223987 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10266/1775
dc.language.iso: en [en]
dc.subject: Hidden Markov Models [en]
dc.subject: HMM Tool Kit [en]
dc.subject: Punjabi Speech Corpus [en]
dc.subject: Speech Synthesis [en]
dc.subject: Phoneme Generation [en]
dc.title: Segmentation of Punjabi Speech Signals into Phonemes Using Hidden Markov Models [en]
dc.type: Thesis [en]

Files

Original bundle

Name: 1775.pdf
Size: 2.13 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.79 KB
Format: Item-specific license agreed upon to submission