Handwritten Punjabi Character Recognition using Convolutional Neural Networks
Abstract
Today, computers influence human life to a great extent. For communication between
computers and users, natural language processing techniques have proven to be a very
efficient way to exchange information with less personnel requirement. In this thesis
work, natural handwriting is used to recognize online handwritten Punjabi characters,
as natural handwritten input is less error prone than input taken via mouse or
keyboard. This thesis describes the implementation of handwritten Punjabi character
recognition using a deep learning technique, Convolutional Neural Networks (CNNs). The
main problem in recognizing handwritten characters is variation in handwriting style:
each person has their own style of writing, and a single person's writing also varies
with mood and writing speed at different instants of time.
Punjabi script is chosen for this research work because Punjabi ranks 14th among spoken
languages and less work has been done on Punjabi script than on other scripts such as
English, Devanagari, Gujarati and Chinese. A CNN is chosen for the implementation
because it has proven to be a very efficient technique for recognizing handwritten
characters and classifying them into their respective classes, as it concentrates on
the dynamic features of the input handwritten character obtained from the randomly
generated character matrices.
Here, we use a 5-layer CNN with a stride value of one to classify handwritten character
images into one of a large number of classes (430). Punjabi script has a total of 430
classes, consisting of 35 consonants, 10 vowel identifiers and their corresponding
combination characters. In our dataset, each class contains 100 images, giving a total
of 43,000 character images. We divide the dataset into training:testing:validation
samples in three ratios: 65:25:10, 55:35:10 and 45:45:10. Training, testing and
validation accuracy at different numbers of epochs (each consisting of a forward pass
and a backward pass) is calculated for each of these ratios and compared.
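
To make the three split ratios concrete, the following is a minimal sketch of the sample counts they imply for 430 classes of 100 images each. It assumes the split is applied per class (stratified), which the abstract does not state; the function names `split_counts` and `split_class_indices` and the fixed random seed are hypothetical and are not taken from the thesis.

```python
import random

NUM_CLASSES = 430        # classes stated in the abstract
IMAGES_PER_CLASS = 100   # images per class stated in the abstract
RATIOS = [(65, 25, 10), (55, 35, 10), (45, 45, 10)]  # training:testing:validation


def split_counts(ratio, per_class=IMAGES_PER_CLASS, num_classes=NUM_CLASSES):
    """Return total (training, testing, validation) image counts for a ratio.

    Assumption: the ratio is applied to each class separately.
    """
    if sum(ratio) != 100:
        raise ValueError("ratio parts must sum to 100")
    return tuple(per_class * part // 100 * num_classes for part in ratio)


def split_class_indices(ratio, per_class=IMAGES_PER_CLASS, seed=0):
    """Shuffle the image indices of one class and cut them into three parts."""
    train_part, test_part, _ = ratio
    indices = list(range(per_class))
    random.Random(seed).shuffle(indices)  # seed is an illustrative choice
    n_train = per_class * train_part // 100
    n_test = per_class * test_part // 100
    return (indices[:n_train],
            indices[n_train:n_train + n_test],
            indices[n_train + n_test:])


if __name__ == "__main__":
    for ratio in RATIOS:
        print(ratio, split_counts(ratio))
```

Under this assumption, the validation set stays at 4,300 images in all three configurations, and only the balance between training and testing images changes.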
Description
Master of Engineering - CSE
