Automatic Identification of Modal, Breathy and Creaky Voices

Sharma, Ajay

Files

1884.pdf (2.01 MB)

Date

2012-08-21T04:32:20Z

Authors

Sharma, Ajay

Supervisors

Sharma, R. K.

Abstract

Computers have changed a great deal over the past few decades, shrinking from the size of a room to the size of a palm. Today, even a mobile phone is equivalent to a small computer. Interaction with computers has moved from punched cards to fingertips, yet speech is still not widely used as an interaction medium. This is mainly due to the problems faced in recognizing speech. Variation in voice quality is one of the reasons for the slow growth of the speech recognition domain. This thesis deals with the identification of modal, creaky and breathy voices, and presents an algorithm that identifies these three voice qualities. The thesis is divided into five chapters, each outlined briefly below.
Chapter 1 first discusses the basic model of speech recognition. It then discusses the issues in Automatic Speech Recognition: noise, voice quality, and the detection of voiced, unvoiced and silence regions. Finally, it surveys the literature on the algorithms and methods used to identify different voice qualities.
Chapter 2 is divided into three parts: data collection, preprocessing and computation of features. The data collection part describes how the data was collected and from how many users. The preprocessing part discusses the technique applied (windowing) before features are extracted. Finally, the feature extraction part explains the features used, such as zero crossing rate, fundamental frequency and short-time energy.
Chapter 3 discusses the observations and results obtained from these features, which are then used to identify the different voice qualities. Finally, an algorithm is designed using these features and applied to the collected data.
Chapter 4 is divided into two parts. The first part shows the output of the algorithm for words spoken in different voice qualities. The second part shows the accuracy obtained for each voice quality along with the overall accuracy of the algorithm. The proposed algorithm achieves 90.1% accuracy in identifying modal voices, 89.8% for breathy voices and 80.7% for creaky voices.
Chapter 5 concludes the work. The overall accuracy achieved using the proposed algorithm is 87.2%. The future scope of work in this domain is also discussed in this chapter.
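Two of the features named in the abstract, zero crossing rate (ZCR) and short-time energy (STE), can be illustrated with a minimal sketch. This is not the thesis's implementation: the 16 kHz sample rate, 25 ms frame, 10 ms hop, Hamming window, synthetic test signals and all function names are assumptions chosen for illustration, and fundamental frequency (F0) estimation is left out.

```python
import math
import random
import statistics

SAMPLE_RATE = 16000   # assumed sample rate in Hz
FRAME_LEN = 400       # assumed 25 ms frame
HOP = 160             # assumed 10 ms hop


def frame_signal(samples, frame_len, hop):
    """Split a signal into overlapping frames, dropping any incomplete tail."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def hamming(n):
    """Hamming window of length n (assumed window type)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame, window):
    """Sum of squared windowed samples in one frame."""
    return sum((s * w) ** 2 for s, w in zip(frame, window))


# Synthetic stand-ins for recorded speech: a 100 Hz tone (periodic, like
# voiced speech) and low-level Gaussian noise (aperiodic, like frication).
tone = [0.5 * math.sin(2 * math.pi * 100 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 0.05) for _ in range(SAMPLE_RATE)]

window = hamming(FRAME_LEN)
tone_frames = frame_signal(tone, FRAME_LEN, HOP)
noise_frames = frame_signal(noise, FRAME_LEN, HOP)

tone_zcr = statistics.mean(zero_crossing_rate(f) for f in tone_frames)
noise_zcr = statistics.mean(zero_crossing_rate(f) for f in noise_frames)
tone_ste = statistics.mean(short_time_energy(f, window) for f in tone_frames)
noise_ste = statistics.mean(short_time_energy(f, window) for f in noise_frames)

print(f"tone:  ZCR={tone_zcr:.4f}  STE={tone_ste:.3f}")
print(f"noise: ZCR={noise_zcr:.4f}  STE={noise_ste:.3f}")
```

In this toy setup the periodic tone shows a low ZCR and high energy, while the noise shows a high ZCR and low energy. How these features separate modal, breathy and creaky voices is the subject of the thesis itself and is not reproduced here.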

Description

Master of Technology (Computer Science and Applications)

Keywords

Speech recognition, modal voice, breathy voice, creaky voice, voice quality, ZCR, F0, STE

URI

http://hdl.handle.net/10266/1884

Collections

Masters Theses@CSED
