Multi-objective Metaheuristic Approaches for Data Clustering in Engineering Application(s)

Dhiman, Gaurav

Files

My_Thesis_Final.pdf (16.66 MB)

Date

2019-09-20

Authors

Dhiman, Gaurav

Supervisors

Kumar, Vijay

Abstract

Clustering is the process of grouping similar data objects into a number of groups (called clusters). Clusters are formed in such a way that data objects of a similar nature are kept in one cluster and are dissimilar to the data objects of other clusters. This thesis focuses on automatic clustering, i.e., automatically determining the number of features and the number of clusters. Because the number of clusters is unknown, clustering is treated as an NP-hard problem. To solve this problem, metaheuristic algorithms are utilized. In this thesis, two novel bio-inspired metaheuristic optimization algorithms, namely Spotted Hyena Optimizer (SHO) and Emperor Penguin Optimizer (EPO), are proposed for solving real-life engineering design problems. The proposed approaches are assessed on standard benchmark test functions. Their convergence and computational complexity are also analyzed to establish the applicability of the proposed algorithms. The performance of the proposed algorithms is analyzed and compared with that of other algorithms such as GWO, PSO, MFO, MVO, SCA, GSA, GA, and HS. Experimental results reveal that the proposed algorithms are able to solve constrained and unconstrained engineering design problems. In addition to the algorithms proposed above, a multi-objective version of the spotted hyena optimizer is proposed and named Multi-objective Spotted Hyena Optimizer (MOSHO). To determine the better solution, the proposed approach uses the concept of Pareto dominance. An external repository is used to store the Pareto optimal solutions. An adaptive grid mechanism is used to produce well-distributed Pareto fronts and improve diversity. Moreover, a group selection mechanism is employed for better convergence. A roulette wheel mechanism is used to select effective solutions from the archive to simulate the social and hunting behaviors of spotted hyenas.
The proposed algorithm has been tested on multi-objective benchmark test functions and then applied to constrained engineering design problems to demonstrate its applicability to real-life problems. The experimental results reveal that the proposed algorithm performs better than the others and produces Pareto optimal solutions with high convergence. An automatic data clustering technique is proposed which utilizes the Multi-objective Spotted Hyena Optimizer (MOSHO) algorithm. It employs a real-coded variable-length representation to encode cluster centers automatically. In this technique, two concepts, namely threshold setting and cutoff computation, are used to refine the search process. The proposed technique is also compared with other existing clustering techniques. This technique is then applied to image segmentation and its performance is compared with other approaches. For high-dimensional data sets, the existing feature selection techniques may not produce an optimal feature subset. To resolve this problem, a novel automatic feature selection technique is proposed, which is based on the multi-objective spotted hyena optimizer. This approach finds the optimal number of features dynamically (i.e., automatically during the simulation run). A novel idea of threshold setting is proposed for determining the most appropriate feature subset. A new objective function is also designed to make the search process efficient. Experimental results suggest that the proposed approach outperforms existing feature selection approaches. This thesis also presents the problems associated with combining feature selection and automatic clustering techniques. A novel approach for automatic clustering and feature selection using the multi-objective spotted hyena optimizer is also proposed. This approach determines both the features and the number of clusters simultaneously from the given data set.
A variable-length agent is proposed to encode both the cluster centers and the features for differing numbers of clusters. Two novel thresholds have been developed for finding the optimal number of clusters and features. A novel clustering criterion has also been designed to explore the search space efficiently. The performance of the proposed technique is tested on two real-life applications and compared with existing algorithms. Experimental results show that the proposed approach is superior to the competing algorithms.
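
The Pareto dominance test and the external repository mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes all objectives are minimized, and the names `dominates` and `update_archive` are hypothetical. The adaptive grid, group selection, and roulette wheel steps are not shown.

```python
from typing import List, Tuple

Point = Tuple[float, ...]  # one solution's objective values (assumed: minimized)


def dominates(a: Point, b: Point) -> bool:
    """Return True if a Pareto-dominates b: no worse in every objective
    and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def update_archive(archive: List[Point], candidate: Point) -> List[Point]:
    """Return a new archive after offering one candidate to it.

    The candidate is rejected if an archived point dominates or equals it;
    otherwise it is added and every archived point it dominates is dropped.
    """
    if any(dominates(m, candidate) or m == candidate for m in archive):
        return archive
    kept = [m for m in archive if not dominates(candidate, m)]
    kept.append(candidate)
    return kept


# Illustrative two-objective values, made up for this sketch.
archive: List[Point] = []
for point in [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0), (1.5, 1.5)]:
    archive = update_archive(archive, point)
print(archive)
```

In this run (3.0, 3.0) is rejected because (2.0, 2.0) dominates it, and (2.0, 2.0) is later dropped when (1.5, 1.5) arrives, so only mutually non-dominated points remain in the archive.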

Keywords

Clustering, Emperor penguin optimizer, Spotted hyena optimizer, Feature selection, Optimization, Diversity, Metaheuristic, Constrained optimization, Engineering design problem, Benchmark test functions

URI

http://hdl.handle.net/10266/5824

Collections

Doctoral Theses@CSED
