Speaker Independent Isolated word speech to text conversion using HTK
| dc.contributor.author | Mittal, Shweta | |
| dc.contributor.supervisor | Verma, Karun | |
| dc.date.accessioned | 2014-08-05T11:27:18Z | |
| dc.date.available | 2014-08-05T11:27:18Z | |
| dc.date.issued | 2014-08-05T11:27:18Z | |
| dc.description | Master of Engineering-Thesis | en |
| dc.description.abstract | Speech to Text Conversion, or Speech Recognition, allows a computer to identify the words that a person speaks into a microphone or similar hardware and convert them into written words. This thesis describes the implementation of an HMM (Hidden Markov Model) based Speaker Independent Isolated Word Speech to Text Conversion System. The system is developed using HTK (Hidden Markov Model ToolKit) for the Punjabi language, an Indo-Aryan language spoken by about 130 million people, mainly in West Punjab in Pakistan and East Punjab in India. To implement the system, data are first gathered from 10 different speakers: 1010 recorded words, comprising 10 recordings of each of 101 distinct words, namely the Punjabi numerals 0 to 100. Two sets of data are then prepared: the first set consists of the gathered recordings as they are, and the second set consists of the same recordings after a noise reduction technique (Auto Spectral Subtraction) has been applied. For both sets, 760 of the 1010 words are used to train the system and 250 words are used to test it. The system uses Mel Frequency Cepstral Coefficients (MFCCs) to extract features from the speech files. For both sets of data, the system is trained and tested at three levels: word level, mono-phone level and tri-phone level. The accuracy obtained is 84.8% at word level, 88% at mono-phone level and 97.2% at tri-phone level for the first set of data, and 89.2% at word level, 92.4% at mono-phone level and 98.4% at tri-phone level for the second set of data. | en |
| dc.description.sponsorship | Computer Science and Engineering, Thapar University, Patiala | en |
| dc.format.extent | 2178834 bytes | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.uri | http://hdl.handle.net/10266/2828 | |
| dc.language.iso | en_US | en |
| dc.subject | Autospectral Subtraction feature | en |
| dc.subject | Noise Reduction | en |
| dc.subject | Speech to Text Conversion | en |
| dc.subject | Triphone Model | en |
| dc.subject | Monophone | en |
| dc.subject | computer science | en |
| dc.title | Speaker Independent Isolated word speech to text conversion using HTK | en |
| dc.type | Thesis | en |
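
The figures in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the thesis: it assumes each reported accuracy is simply the percentage of the 250 test words recognized correctly, and the names `correct_count` and `implied_counts` are hypothetical helpers introduced here.

```python
# Figures quoted in the abstract.
TOTAL_WORDS = 101 * 10   # 101 distinct words, 10 recordings of each
TRAIN_WORDS = 760
TEST_WORDS = 250

# Reported accuracy (percent) per data set and modelling level.
ACCURACY_PERCENT = {
    "raw": {"word": 84.8, "monophone": 88.0, "triphone": 97.2},
    "noise_reduced": {"word": 89.2, "monophone": 92.4, "triphone": 98.4},
}


def correct_count(accuracy_percent, test_words=TEST_WORDS):
    """Number of test words implied by an accuracy percentage.

    Assumption (not stated in the abstract): accuracy is the share of
    test words recognized correctly, with no other error types counted.
    """
    return round(accuracy_percent * test_words / 100)


def implied_counts():
    """Map each data set and level to its implied count of correct words."""
    return {
        dataset: {level: correct_count(acc) for level, acc in levels.items()}
        for dataset, levels in ACCURACY_PERCENT.items()
    }


if __name__ == "__main__":
    # The train/test split should account for every recording.
    print("split consistent:", TRAIN_WORDS + TEST_WORDS == TOTAL_WORDS)
    for dataset, levels in implied_counts().items():
        for level, count in levels.items():
            print(f"{dataset:14s} {level:10s} {count} / {TEST_WORDS}")
```

Under that assumption, every reported percentage corresponds to a whole number of test words, which is consistent with a 250-word test set.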
