Speech Recognition of Punjabi Numerals Using Convolutional Neural Networks (CNNs)

Thakur, Aditi

dc.contributor.author	Thakur, Aditi
dc.contributor.supervisor	Verma, Karun
dc.date.accessioned	2017-08-11T04:44:30Z
dc.date.available	2017-08-11T04:44:30Z
dc.date.issued	2017-08-11
dc.description	Master of Engineering - CSE	en_US
dc.description.abstract	Speech is one of the most natural ways for humans to interact and express themselves, and it is the most convenient form of giving input to a system. With advances in technology, almost every object that surrounds us is gradually becoming automated, which means that in the near future almost everything will be controlled by voice or gestures. The number of speech-enabled devices we encounter in daily life is steadily increasing, such as ATMs for visually impaired people, and speech recognition systems can support various applications that provide employment opportunities for differently abled people. However, achieving good accuracy and making the system robust to noise have always been among the main concerns of this research area, and accuracy in speech recognition has been a major obstacle in the domain of Natural Language Processing. The model that has dominated the speech recognition field is the GMM-HMM. With advances in big data, parallel processing, and GPU computing power, deep learning models have leveraged these gains to outperform the GMM-HMM model and have taken primacy over it, yet the race to minimize the error rate continues. In this research work we implemented a deep learning algorithm, the Convolutional Neural Network (CNN), with the aim of achieving good accuracy on the data set. The data consists of audio recordings (.wav files) of the numerals 0 to 100 recited in the Punjabi language, collected with the aim of a good balance of male and female speakers.
The CNN model architecture comprises four stacks, each consisting of a convolutional layer, a ReLU unit, and a max pooling unit; the output of these stacks is passed to two fully connected layers. The first fully connected layer has a dropout of 25%. The results obtained in this work show better performance than existing work.	en_US
dc.identifier.uri	http://hdl.handle.net/10266/4629
dc.language.iso	en	en_US
dc.subject	Convolutional Neural Network	en_US
dc.subject	Speech Recognition	en_US
dc.subject	Dropout	en_US
dc.subject	Pooling	en_US
dc.subject	Back Propagation	en_US
dc.subject	Gradient Descent	en_US
dc.title	Speech Recognition of Punjabi Numerals Using Convolutional Neural Networks (CNNs)	en_US
dc.type	Thesis	en_US
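
The architecture summarized in the abstract (four convolution + ReLU + max pooling stacks, followed by two fully connected layers with 25% dropout on the first) can be traced layer by layer. The sketch below only propagates tensor shapes; it is not the thesis implementation. The input size, kernel size, padding, pooling size, channel counts, and hidden width are all hypothetical values chosen for illustration, and the 101 output classes assume one class per numeral from 0 to 100.

```python
# Shape trace for a CNN with four conv+ReLU+maxpool stacks and two FC layers.
# All hyperparameters marked "assumed" are hypothetical, not from the thesis.
INPUT_SHAPE = (1, 64, 64)          # assumed (channels, height, width) of a spectrogram
CONV_CHANNELS = [16, 32, 64, 128]  # assumed output channels of the four stacks
KERNEL, PADDING, POOL = 3, 1, 2    # assumed 3x3 conv, padding 1, 2x2 max pooling
FC_HIDDEN = 256                    # assumed width of the first fully connected layer
NUM_CLASSES = 101                  # assumed: one class per numeral 0 to 100
DROPOUT = 0.25                     # stated in the abstract


def conv_out(size, kernel=KERNEL, padding=PADDING, stride=1):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, pool=POOL):
    """Output length of non-overlapping max pooling along one axis."""
    return size // pool


def describe_model(input_shape=INPUT_SHAPE):
    """Return a list of (layer_name, output_shape) for the whole network."""
    channels, height, width = input_shape
    layers = []
    for i, out_channels in enumerate(CONV_CHANNELS, start=1):
        # Convolution followed by ReLU (ReLU does not change the shape).
        height, width = conv_out(height), conv_out(width)
        channels = out_channels
        layers.append((f"conv{i}+relu", (channels, height, width)))
        # Max pooling halves each spatial dimension.
        height, width = pool_out(height), pool_out(width)
        layers.append((f"maxpool{i}", (channels, height, width)))
    layers.append(("flatten", (channels * height * width,)))
    layers.append((f"fc1+dropout({DROPOUT})", (FC_HIDDEN,)))
    layers.append(("fc2", (NUM_CLASSES,)))
    return layers


if __name__ == "__main__":
    for name, shape in describe_model():
        print(f"{name:22s} {shape}")
```

Under these assumed values, each pooling step halves the spatial size, so the first fully connected layer receives a flattened vector whose length depends directly on the chosen input size and channel counts.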

Files

Original bundle

Name: 4629.pdf
Size: 1.73 MB
Format: Adobe Portable Document Format

Collections

Masters Theses@CSED