Efficient Implementation of Adaptive Filters and Classifiers Using Multilayer Perceptron Feedforward Neural Network
Abstract
Artificial Neural Networks (ANNs) are a mainstay of machine learning: it is hard to discuss pattern analysis without mentioning ANNs or their modern variants such as deep networks. Although used prolifically for decades, the ANN as a computational block was until recently treated as a black box, regarded with some suspicion about its actual efficacy. With the advent of deep learning, that black box has only grown larger and harder to fathom, propelling machine learning research along a path where nearly every practitioner is armed with some variant of a 'deep learning tool'. The time is therefore ripe to open up the black box of the ANN, examine its components to see how they work, and perhaps make them work better. This thesis compiles efforts to improve the performance of a typical ANN with a basic textbook architecture, with the aim of extracting efficient and versatile performance from a minimalist architecture so as to ensure a hardware-implementable, IC-compatible design.
In general, the following methodology was adopted for identifying an optimal architecture and measuring its performance. The data were first partitioned into training and test sets, and several candidate ANN architectures were initialized. Each architecture was trained repeatedly and the Mean Squared Error (MSE) was recorded for every set of experiments. To avoid over-fitting, a K-fold cross-validation scheme was adopted and the data in the training and test sets were exchanged after a fixed number of experiments. The architecture giving the minimum average MSE was selected as optimal and was then evaluated on the test data to assess its generalization performance. This methodology was followed in three of the four chapters describing backpropagation (BP)-trained ANNs. The architecture was restricted to a single hidden layer with the number of neurons varying from two to eight, the aim being to obtain the best possible performance from the simplest and shallowest network.
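The selection loop described above can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the tiny numpy MLP (tanh hidden layer, linear output), the hyper-parameters, and the function names are all assumptions made for the sketch.

```python
import numpy as np

def train_mlp(X, y, n_hidden, epochs=200, lr=0.1, seed=0):
    """Train a single-hidden-layer MLP (tanh hidden, linear output) with plain BP.
    X: (N, n_in), y: (N, 1)."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = rng.normal(0, 0.5, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.5, (n_hidden, 1));    b2 = np.zeros(1)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden activations
        err = (H @ W2 + b2) - y             # output error, (N, 1)
        gW2 = H.T @ err / len(X); gb2 = err.mean(0)   # output-layer gradients
        dH = (err @ W2.T) * (1 - H**2)                # backpropagated signal
        gW1 = X.T @ dH / len(X); gb1 = dH.mean(0)     # hidden-layer gradients
        W2 -= lr * gW2; b2 -= lr * gb2
        W1 -= lr * gW1; b1 -= lr * gb1
    return (W1, b1, W2, b2)

def mse(params, X, y):
    W1, b1, W2, b2 = params
    pred = np.tanh(X @ W1 + b1) @ W2 + b2
    return float(np.mean((pred - y) ** 2))

def select_architecture(X, y, hidden_sizes=range(2, 9), k=5):
    """Pick the hidden-layer size with minimum average K-fold validation MSE."""
    folds = np.array_split(np.random.default_rng(1).permutation(len(X)), k)
    best = None
    for n_hidden in hidden_sizes:
        scores = []
        for i in range(k):
            val = folds[i]
            trn = np.concatenate([folds[j] for j in range(k) if j != i])
            params = train_mlp(X[trn], y[trn], n_hidden)
            scores.append(mse(params, X[val], y[val]))
        avg = float(np.mean(scores))
        if best is None or avg < best[1]:
            best = (n_hidden, avg)
    return best
```

The winning size would then be retrained on the full training set and scored once on the held-out test set, mirroring the procedure described above.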
This thesis is organized into the following sections:
Chapter 1 presents an overview of the current technological scenario against the backdrop of a resurgence of ANNs, underlining the importance of and motivations behind the presented work.
Chapter 2 proposes a principal component analysis (PCA) based learning-rate variation methodology that makes the learning rate dependent upon the output feature statistics rather than on the input space, as is done traditionally. The technique also exploits PCA to assess the conditioning of the input data, evaluating the sensitivity of the error to input variation and thereby skipping the whole adaptation process in the case of ill-conditioned data. Simulation experiments performed on several benchmark data sets with varying numbers of attributes and classes reveal the efficacy of the proposed algorithm. The proposed formula does not hurt the computational efficiency of the network, favoring its implementation in any physical environment.
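The two PCA-based ingredients can be illustrated roughly as below. The chapter's actual formula is not reproduced here; the scaling rule, the threshold, and both function names are hypothetical, chosen only to show one plausible reading of the idea.

```python
import numpy as np

def pca_condition_number(X):
    """Condition of the input data via the eigen-spectrum of its covariance;
    a very large ratio flags ill-conditioned data, for which adaptation
    could be skipped entirely."""
    Xc = X - X.mean(axis=0)
    eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))
    eigvals = np.clip(eigvals, 1e-12, None)   # guard tiny negatives from round-off
    return float(eigvals.max() / eigvals.min())

def output_statistics_lr(Y, base_lr=0.1):
    """Scale the learning rate by the leading principal-component variance of
    the OUTPUT features (an illustrative rule, not the chapter's formula):
    larger output variance -> smaller, more cautious step."""
    Yc = Y - Y.mean(axis=0)
    leading = max(np.linalg.eigvalsh(np.cov(Yc, rowvar=False)).max(), 1e-12)
    return base_lr / (1.0 + leading)
```

In use, a training loop would first check `pca_condition_number` against a threshold and abort on ill-conditioned data, then call `output_statistics_lr` each epoch to set the step size.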
Chapter 3 describes a novel weight-pruning technique that not only removes irrelevant weights but also improves generalization and reduces computational complexity. A statistically derived parameter, the "coefficient of dominance", is introduced into the complexity penalty term to scrutinize the relevance of each weight, which in turn helps quantify the information content of the weight set. Weights with higher information content capture the major variations of the input data and are therefore retained. The methodology is analyzed on datasets with a large diversity of attributes and compared with well-known pruning strategies in terms of execution time, convergence rate, generalization and computational complexity.
Chapter 4 implements a hybrid learning approach that blends a recently proposed unsupervised, biologically plausible algorithm for training the hidden layers with the conventional least mean square (LMS) algorithm for the output layer. The amalgamation aims to improve convergence by training all neurons concurrently and bypassing the recursive computation of the local gradient. Training the hidden-layer neurons with the unsupervised approach provides direct control over the hidden-layer dynamics, which in turn allows faster training of the output neurons. The performance of the ANN is investigated on various benchmark datasets while varying the number of hidden-layer neurons, the number of epochs, the activation functions and their parameters.
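The hybrid scheme can be sketched with Oja's rule standing in for the unnamed biologically plausible algorithm; this substitution, the linear hidden responses, and all hyper-parameters are assumptions of the sketch, not the chapter's method.

```python
import numpy as np

def oja_hidden_layer(X, n_hidden, lr=0.01, epochs=10, seed=0):
    """Unsupervised hidden-layer training with per-unit Oja's rule
    (a classical biologically plausible stand-in; the actual algorithm
    used in the chapter is not specified in this abstract)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0, 0.1, (X.shape[1], n_hidden))
    for _ in range(epochs):
        for x in X:
            y = x @ W                                   # hidden responses
            W += lr * (np.outer(x, y) - W * y**2)       # Hebbian term - decay term
    return W

def lms_output_layer(H, d, lr=0.05, epochs=20, seed=0):
    """Supervised LMS training of the output weights only, on fixed
    hidden responses H and desired outputs d."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0, 0.1, H.shape[1])
    for _ in range(epochs):
        for h, target in zip(H, d):
            e = target - h @ w          # instantaneous error
            w += lr * e * h             # LMS update
    return w
```

Because the hidden weights are learned without labels and the output layer is a single LMS filter, no error signal is ever backpropagated through the network, which is the convergence benefit the chapter targets.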
Chapter 5 implements an adaptive filter using an ANN, in contrast to the traditional design approach. The work commences by implementing a variable step-size adaptive filter with the
conventional design approach using variants of LMS, realized in direct, transposed and a novel area-efficient structure. The filters are analyzed in terms of estimated hardware resource utilization, actual area, delay and power, targeting specific FPGAs. In the later part of the chapter, ANN-based adaptive filters are implemented using variants of LMS and trained with two different variants of BP, namely gradient descent with momentum and the Levenberg-Marquardt algorithm. The approach is evaluated by comparing the original desired signals with the signals recovered after training the network. The chapter concludes by comparing the two approaches in terms of both hardware overhead and computational resources.
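As one concrete example of a variable step-size LMS variant, normalized LMS (NLMS) adapts its effective step size to the instantaneous input power; the chapter's specific variants and structures are not detailed in this abstract, so this sketch and its parameters are illustrative only.

```python
import numpy as np

def nlms_filter(x, d, n_taps=8, mu=0.5, eps=1e-6):
    """Normalized LMS adaptive FIR filter: the step size mu is divided by the
    current tap-vector power, so the effective step adapts every sample.
    x: input signal, d: desired signal."""
    w = np.zeros(n_taps)            # adaptive tap weights
    y = np.zeros(len(x))            # filter output
    e = np.zeros(len(x))            # error signal
    for n in range(n_taps - 1, len(x)):
        u = x[n - n_taps + 1 : n + 1][::-1]     # tap-delay line [x[n], ..., x[n-M+1]]
        y[n] = w @ u
        e[n] = d[n] - y[n]
        w += (mu / (eps + u @ u)) * e[n] * u    # power-normalized update
    return w, y, e
```

In a system-identification setting, driving the filter with the input of an unknown FIR system and using the system's output as `d` makes `w` converge toward the system's impulse response, which is the standard way such filters are verified before hardware mapping.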
Chapter 6 summarizes the key findings and significant contributions of this thesis and outlines possible future research directions.
