Multi-objective Metaheuristic Approaches for Data Clustering in Engineering Application(s)
Abstract
Clustering is the process of grouping similar data objects into a number of groups
(called clusters). Clusters are formed in such a way that data objects of a similar
nature are kept in one cluster and are dissimilar to data objects in other clusters.
This thesis focuses on automatic clustering, i.e., automatically determining
the number of features and the number of clusters. Because the number of
clusters is unknown, clustering is treated as an NP-hard problem. To solve this problem,
metaheuristic algorithms are utilized. In this thesis, two novel bio-inspired
metaheuristic optimization algorithms, namely the Spotted Hyena
Optimizer (SHO) and the Emperor Penguin Optimizer (EPO), are proposed for solving real-life
engineering design problems. The proposed approaches are assessed on standard
benchmark test functions. Their convergence and computational complexity are
also analyzed to establish the applicability of the proposed algorithms. The
performance of the proposed algorithms is analyzed and compared with that of other
algorithms such as GWO, PSO, MFO, MVO, SCA, GSA, GA, and HS. Experimental
results reveal that the proposed algorithms are able to solve constrained and
unconstrained engineering design problems.
In addition to the algorithms proposed above, a multi-objective version of the spotted
hyena optimizer, named the Multi-objective Spotted Hyena
Optimizer (MOSHO), is proposed. The concept of
Pareto dominance is used to determine the better solution. An external repository
(archive) stores the Pareto optimal solutions. An adaptive grid mechanism is used
to produce well-distributed Pareto fronts and improve diversity. Moreover, a
group selection mechanism is employed for better convergence. A roulette
wheel mechanism selects effective solutions from the archive to simulate
the social and hunting behaviors of spotted hyenas. The proposed algorithm
has been tested on multi-objective benchmark test functions and then applied
to constrained engineering design problems to demonstrate its applicability to
real-life problems. The experimental results reveal that the proposed algorithm
performs better than the compared algorithms and produces Pareto optimal solutions with
high convergence.
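The Pareto dominance test and the external repository described above can be illustrated with a minimal sketch. It assumes all objectives are minimized and that solutions are represented only by their objective vectors; the names `dominates` and `update_archive` are hypothetical and are not taken from the thesis. The adaptive grid, group selection, and roulette wheel mechanisms are not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (assumption: minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def update_archive(archive: List[Point], candidate: Point) -> List[Point]:
    """Return a new archive that contains only non-dominated points."""
    # Reject the candidate if an archive member dominates or duplicates it.
    if any(dominates(m, candidate) or m == candidate for m in archive):
        return list(archive)
    # Otherwise drop every member the candidate dominates, then add it.
    kept = [m for m in archive if not dominates(candidate, m)]
    kept.append(candidate)
    return kept


if __name__ == "__main__":
    archive: List[Point] = []
    for p in [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]:
        archive = update_archive(archive, p)
    print(archive)
```

In this example the point (3.0, 3.0) is rejected because (2.0, 2.0) is at least as good in both objectives and strictly better in one, while the other three points are mutually non-dominated and remain in the archive.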
An automatic data clustering technique is proposed which utilizes the Multi-objective
Spotted Hyena Optimizer (MOSHO) algorithm. It employs a real-coded variable-length
representation to encode cluster centers automatically. In this technique,
two concepts, namely threshold setting and cutoff computation, are used to refine
the search process. The proposed technique is compared with
other existing clustering techniques. It is then applied to image
segmentation, and its performance is compared with that of other approaches.
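One common way to realize a variable-length encoding with a threshold is to attach an activation value to each candidate center and keep only the centers whose activation exceeds the threshold. The sketch below illustrates that general idea only; the abstract does not specify the thesis's encoding, threshold rule, or cutoff computation, so the function `decode_agent`, the default threshold of 0.5, and the fallback to the two highest activations are all assumptions.

```python
from typing import List, Sequence, Tuple

Center = Tuple[float, ...]


def decode_agent(
    activations: Sequence[float],
    centers: Sequence[Center],
    threshold: float = 0.5,   # assumed value, not from the thesis
    min_clusters: int = 2,    # assumed lower bound on cluster count
) -> List[Center]:
    """Return the centers whose activation value exceeds the threshold."""
    if len(activations) != len(centers):
        raise ValueError("one activation value is required per center")
    active = [i for i, a in enumerate(activations) if a > threshold]
    if len(active) < min_clusters:
        # Too few active centers: fall back to the highest activations.
        ranked = sorted(
            range(len(activations)),
            key=lambda i: activations[i],
            reverse=True,
        )
        active = sorted(ranked[:min_clusters])
    return [centers[i] for i in active]


if __name__ == "__main__":
    candidate_centers = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 9.0)]
    print(decode_agent([0.9, 0.2, 0.7, 0.4], candidate_centers))
```

With this layout every agent has a fixed maximum length, but the number of clusters it represents varies with how many activation values pass the threshold.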
For high-dimensional data sets, existing feature selection techniques may not
produce an optimal feature subset. To resolve this problem, a novel automatic feature
selection technique is proposed, which is based on the multi-objective spotted hyena
optimizer. This approach finds the optimal number of features dynamically (i.e.,
automatically during the simulation run). A novel threshold-setting scheme is proposed
for determining the most appropriate feature subset. A new
objective function is also designed to make the search process efficient. Experimental
results suggest that the proposed approach outperforms existing feature
selection approaches.
This thesis also presents the problems associated with combining
feature selection and automatic clustering techniques. A novel approach
for automatic clustering and feature selection using the multi-objective spotted hyena
optimizer is also proposed. This approach determines both the relevant features and
the number of clusters simultaneously from the given data set. A variable-length
agent is proposed to encode both the cluster centers and the features for different
numbers of clusters. Two novel thresholds have been developed for finding the
optimal number of clusters and features. A novel clustering criterion has also been
designed to search the space efficiently. The performance of the proposed technique
is tested on two real-life applications and compared with existing algorithms.
Experimental results show that the proposed approach is superior to the
competing algorithms.
