Please use this identifier to cite or link to this item: http://hdl.handle.net/10266/2877
Title: Application of Data Pre-Processing Techniques for Supervised Classification
Authors: Singh, Tej
Supervisor: Kumar, Ravi
Keywords: NEURAL NETWORK CLUSTERING;electronics and communication;communication
Issue Date: 12-Aug-2014
Abstract: This dissertation assesses the learning and generalization performance of multilayer perceptrons preceded by a preprocessing stage. In other words, it summarizes the effect of data preprocessing on several benchmark data sets. The goal of a supervised learning algorithm is to find a function that best maps a set of inputs to their correct outputs. However, a single-layer perceptron cannot learn some relatively simple patterns, such as those that are not linearly separable. A multilayer network overcomes this shortcoming; in particular, Back-Propagation Neural Networks (BPNN) can create internal representations and learn different features in each layer. Principal Component Analysis (PCA) is a powerful tool for analyzing data. Its main advantage is that once the patterns in the data have been found, the data can be compressed by reducing the number of dimensions without much loss of information. Transfer Component Analysis (TCA) tries to learn transfer components across domains in a reproducing kernel Hilbert space using maximum mean discrepancy. The wavelet transform is fast emerging as one of the most potent tools for signal analysis, and wavelet analysis has many advantages over traditional Fourier-transform-based approaches. Back-propagation is the best-known algorithm in supervised learning. It uses gradient descent, computing the gradient of the loss function with respect to all the weights in the network. The training function traingdm and the learning adaptation function learngdm are used, and performance is measured in terms of mean squared error (MSE). The network is trained on two real data sets, the Iris data and the User Student Modeling data. The BP algorithm is trained for various combinations of learning rate lr (0.1 to 1.0) and momentum constant mc (0.1 to 1.0).
The number of neurons is also varied from two to eight, and the results are shown graphically using box plots. The BP algorithm is first applied to the raw data sets, taking fifty percent of the data as training input and the rest as test data. The well-known data compression technique Principal Component Analysis (PCA) and Transfer Component Analysis (TCA) are then applied to the same data sets. The modified data sets are again trained with the BP algorithm, and the results are again shown as box plots. Next, a continuous wavelet transform is applied after applying PCA and TCA to the raw data sets. The resulting data set is trained with the BP algorithm for various combinations of learning rates and momentum constants. The results show that data preprocessing has a profound effect on the classification performance of the final classifier.
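The pipeline described in the abstract (a 50/50 train/test split, PCA preprocessing, and back-propagation with a learning rate lr and momentum constant mc, scored by MSE) can be illustrated with a minimal Python sketch. It is not the thesis code: the thesis uses MATLAB's traingdm and learngdm, whereas this sketch uses the generic momentum update v = mc*v - lr*grad, synthetic two-class data in place of the Iris and User Student Modeling sets, and hypothetical helper names (pca_fit, forward, train_bp_momentum).

```python
import numpy as np

def pca_fit(X, k):
    """Return the mean and top-k principal directions of X (via SVD)."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def forward(X, params):
    """One tanh hidden layer followed by a linear output layer."""
    W1, b1, W2, b2 = params
    H = np.tanh(X @ W1 + b1)
    return H, H @ W2 + b2

def train_bp_momentum(X, T, n_hidden=4, lr=0.1, mc=0.5, epochs=300, seed=0):
    """Batch back-propagation on MSE with a classical momentum term."""
    rng = np.random.default_rng(seed)
    params = [rng.normal(0, 0.5, (X.shape[1], n_hidden)), np.zeros(n_hidden),
              rng.normal(0, 0.5, (n_hidden, T.shape[1])), np.zeros(T.shape[1])]
    vel = [np.zeros_like(p) for p in params]
    mse_history = []
    for _ in range(epochs):
        H, Y = forward(X, params)
        E = Y - T
        mse_history.append(float(np.mean(E ** 2)))
        # Gradients of the MSE with respect to each parameter.
        dY = 2.0 * E / E.size
        dH = (dY @ params[2].T) * (1.0 - H ** 2)
        grads = [X.T @ dH, dH.sum(axis=0), H.T @ dY, dY.sum(axis=0)]
        for i in range(4):
            vel[i] = mc * vel[i] - lr * grads[i]  # momentum update (assumed form)
            params[i] = params[i] + vel[i]
    return params, mse_history

# Synthetic stand-in data (an assumption, not the thesis data sets).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.5, (50, 4)), rng.normal(1.0, 0.5, (50, 4))])
labels = np.repeat([0, 1], 50)

# Fifty percent of the samples for training, the rest for testing.
order = rng.permutation(len(X))
train, test = order[:50], order[50:]

# Fit PCA on the training half only, then project both halves.
mu, comps = pca_fit(X[train], k=2)
Z_train = (X[train] - mu) @ comps.T
Z_test = (X[test] - mu) @ comps.T

T_train = np.eye(2)[labels[train]]  # one-hot targets
params, mse_history = train_bp_momentum(Z_train, T_train)
_, Y_test = forward(Z_test, params)
test_accuracy = float(np.mean(Y_test.argmax(axis=1) == labels[test]))
print(f"train MSE: {mse_history[0]:.4f} -> {mse_history[-1]:.4f}, "
      f"test accuracy: {test_accuracy:.2f}")
```

Sweeping lr, mc, and n_hidden over a grid and collecting the final MSE of each run would give the kind of data that the abstract summarizes with box plots.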
Description: Master of Engineering-Thesis
URI: http://hdl.handle.net/10266/2877
Appears in Collections:Masters Theses@ECED

Files in This Item:
File: 2877.pdf
Size: 1.21 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.