Modified K-Means to improve clustering using Genetic algorithm

Setia, Vandana

dc.contributor.author	Setia, Vandana
dc.contributor.supervisor	Arora, Vinay
dc.date.accessioned	2018-08-27T05:48:08Z
dc.date.available	2018-08-27T05:48:08Z
dc.date.issued	2018-08-23
dc.description.abstract	Data generated by scientific applications and corporate environments has grown rapidly, not only in size but also in variety. The volume of collected data is huge, which makes collecting and analyzing such big data difficult. Data mining is the technique of extracting useful information and hidden relationships from data, but traditional data mining approaches cannot be applied directly to big data because of their inherent complexity. Data clustering is one of the most important problems in data mining and machine learning. Clustering is the task of discovering homogeneous groups among the studied objects. Recently, many researchers have shown significant interest in developing clustering algorithms. The main problem in clustering is that no prior knowledge about the given dataset is available. Moreover, the choice of input parameters, such as the number of clusters and the number of nearest neighbors, makes clustering more challenging, because an incorrect choice of these parameters yields poor clustering results. Furthermore, these algorithms suffer from unsatisfactory accuracy when the dataset contains clusters of different complex shapes, densities, and sizes, or contains noise and outliers. In this thesis, we propose a new approach to the unsupervised clustering task. Our approach consists of three phases. In the first phase, a genetic algorithm finds the initial cluster centroids, applying crossover and mutation to the dataset. The second phase takes the initial cluster centroids produced by the genetic algorithm and finds clusters using K-means clustering, yielding a set of clusters for the given dataset. The third phase evaluates these clusters using the Davies-Bouldin index. The new algorithm is named the Genetic K-means Algorithm (GKA).
We present experiments that demonstrate the strength of the proposed algorithm in discovering clusters with different non-convex shapes, sizes, and densities, in the presence of noise and outliers, with higher accuracy. These experiments show that the proposed algorithm outperforms the K-means algorithm.	en_US
dc.identifier.uri	http://hdl.handle.net/10266/5319
dc.language.iso	en	en_US
dc.subject	Clustering	en_US
dc.subject	k-Means	en_US
dc.title	Modified K-Means to improve clustering using Genetic algorithm	en_US
dc.type	Thesis	en_US
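
The three phases named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the thesis's chromosome encoding, fitness function, or operator details, so everything below is an assumption for illustration only: chromosomes are sets of k data points, fitness is the within-cluster sum of squared errors, crossover is single-point, and mutation swaps a centroid for a random data point. The function names (`ga_initial_centroids`, `kmeans`, `davies_bouldin`), the parameters (`pop_size`, `generations`, `mutation_rate`), and the synthetic data are hypothetical and are not taken from the thesis.

```python
import numpy as np

def assign(X, centroids):
    # Label each point with the index of its nearest centroid (Euclidean).
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def sse(X, centroids):
    # Within-cluster sum of squared distances; lower is better.
    labels = assign(X, centroids)
    return float(((X - centroids[labels]) ** 2).sum())

def ga_initial_centroids(X, k, rng, pop_size=20, generations=30, mutation_rate=0.1):
    # Phase 1 (assumed design): each chromosome is a set of k data points.
    pop = [X[rng.choice(len(X), k, replace=False)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: sse(X, c))
        parents = pop[: pop_size // 2]  # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(len(parents), 2, replace=False)
            cut = rng.integers(1, k) if k > 1 else 0  # single-point crossover
            child = np.vstack([parents[a][:cut], parents[b][cut:]])
            for j in range(k):  # mutation: swap in a random data point
                if rng.random() < mutation_rate:
                    child[j] = X[rng.integers(len(X))]
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda c: sse(X, c))

def kmeans(X, centroids, max_iter=100):
    # Phase 2: standard Lloyd iterations starting from the GA seed.
    centroids = centroids.copy()
    for _ in range(max_iter):
        labels = assign(X, centroids)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return assign(X, centroids), centroids

def davies_bouldin(X, labels, centroids):
    # Phase 3: mean over clusters of the worst (s_i + s_j) / d_ij ratio.
    # Lower is better. Assumes k >= 2, no empty clusters, distinct centroids.
    k = len(centroids)
    s = np.array([
        np.linalg.norm(X[labels == i] - centroids[i], axis=1).mean()
        for i in range(k)
    ])
    ratios = [
        max((s[i] + s[j]) / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return float(np.mean(ratios))

# Synthetic demo data (hypothetical): three Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.3, size=(50, 2))
               for loc in ([0, 0], [5, 5], [0, 5])])
seed = ga_initial_centroids(X, k=3, rng=rng)
labels, centroids = kmeans(X, seed)
db = davies_bouldin(X, labels, centroids)
print(db)
```

A lower Davies-Bouldin value indicates more compact, better-separated clusters, so the same function could be used to compare this GA-seeded run against K-means started from a random seed.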

Files

Original bundle

Name: Modified K-Means to improve clustering using Genetic algorithm.pdf
Size: 2.09 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.03 KB
Format: Item-specific license agreed upon to submission

Collections

Masters Theses@CSED