Please use this identifier to cite or link to this item:
http://hdl.handle.net/10266/1885
Title: | Automatic Identification of Silence, Voiced and Unvoiced Chunks in Speech |
Authors: | Sharma, Poonam |
Supervisor: | Sharma, R. K. |
Keywords: | Speech recognition;Silence Region;Voiced Region;Unvoiced Region;ZCR;F0;STE |
Issue Date: | 21-Aug-2012 |
Abstract: | Computers greatly influence human life, and their use is increasing at a tremendous rate. The ease with which information can be exchanged between a user and a computer is therefore of immense importance. Input devices such as the mouse and keyboard, however, have limitations as an interface for exchanging information. Speech, a natural and quick way for humans to exchange information, can overcome these limitations when used to communicate with computers. Speech recognition has been researched for many years and has attracted researchers across the world. Word boundary detection, silence detection, voiced/unvoiced detection, noise removal, and the effects of voice quality are the prominent problems in achieving a high degree of accuracy in speech recognition. The main goal of this thesis is to design an algorithm that automates the detection of silence, voiced and unvoiced chunks in a speech signal, which is very important for increasing the accuracy of any recognition system. This thesis is divided into five chapters, outlined below. Chapter 1 has two sections: speech recognition and its issues, and a literature survey. The issues in speech recognition include silence, unvoiced and voiced detection, noise, and voice quality. The literature survey reviews, in chronological order, the algorithms and methods used so far for word boundary detection and for silence, unvoiced and voiced classification. Chapter 2 covers the work carried out in three important phases of automating the classification: data collection, preprocessing and feature extraction. In the data collection phase, speech was recorded from three male speakers and one female speaker. Each speaker spoke 15 Hindi words three times.
In the preprocessing phase, the speech signal was windowed using rectangular and Hamming windows, and then three features, namely zero crossing rate, short-time energy and fundamental frequency, were calculated and used for the automation. Chapter 3 focuses on the main work done for automation. Its first section discusses the results derived from the calculation of the feature vectors, which help identify the silence, voiced and unvoiced chunks in the input speech signal. The next section describes how these results are used to develop the algorithm, and presents the steps followed for automation along with the corresponding flow chart. Chapter 4 discusses the outputs of the algorithm, with graphs for different words showing both the cases in which the algorithm identified the signal almost completely correctly and the cases in which it did not. The overall accuracy of the algorithm was found to be 96.61%. Chapter 5 presents the conclusions of the work and the improvements that could be made to increase the accuracy. |
Description: | Master of Technology (Computer Science and Applications) |
URI: | http://hdl.handle.net/10266/1885 |
Appears in Collections: | Masters Theses@CSED |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
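
For readers unfamiliar with the features named in the abstract, the following is a minimal sketch of how Hamming windowing, short-time energy (STE) and zero crossing rate (ZCR) might be computed per frame and combined into a three-way label. It is not the thesis's algorithm: the function names, the 240-sample frame length, the 8 kHz sampling rate, the threshold values `energy_floor` and `zcr_split`, and the synthetic test signal are all assumptions chosen for illustration, and fundamental frequency (F0) is left out for brevity.

```python
import math

def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def frames(signal, size, hop):
    """Split a signal into frames of `size` samples, dropping any incomplete tail."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, hop)]

def short_time_energy(frame, window):
    """Mean of the squared, windowed samples in one frame."""
    return sum((x * w) ** 2 for x, w in zip(frame, window)) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def classify(frame, window, energy_floor=1e-4, zcr_split=0.25):
    """Label one frame; both thresholds are hypothetical, not taken from the thesis."""
    if short_time_energy(frame, window) < energy_floor:
        return "silence"
    return "unvoiced" if zero_crossing_rate(frame) > zcr_split else "voiced"

# Synthetic stand-ins at an assumed 8 kHz sampling rate, 240 samples (30 ms) each:
# zeros for silence, a 100 Hz tone for voiced speech, and a weak 3 kHz tone as a
# crude proxy for the high zero crossing rate of unvoiced speech.
RATE, SIZE = 8000, 240
silence = [0.0] * SIZE
voiced = [0.5 * math.sin(2 * math.pi * 100 * n / RATE) for n in range(SIZE)]
unvoiced = [0.1 * math.sin(2 * math.pi * 3000 * n / RATE) for n in range(SIZE)]

window = hamming(SIZE)
labels = [classify(f, window) for f in frames(silence + voiced + unvoiced, SIZE, SIZE)]
print(labels)
```

The three synthetic frames are built to fall into the silence, voiced and unvoiced branches respectively. On real recordings the thresholds would need tuning against labelled data, and a practical classifier would likely add F0 as a third feature, as the abstract describes.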