Processing, Analysis and Visualization of Social Data

Abstract

Socializing is central to human nature and is widespread, but scientists have long pondered how it can be analyzed and explored. One answer involves examining the networked interactions and population structure available on social media such as Facebook, Twitter, and LinkedIn. A social network is a social structure of individuals related directly or indirectly through some common factor, such as similar interests. The study of the behavior and structure of such a network is called social network analysis. Research in the data mining community and in social network analysis has grown rapidly. Many social networking sites are available on the internet, including LinkedIn, Facebook, Instagram, Twitter, Google, and many more. Interactions on such sites produce huge amounts of data because billions of active users maintain accounts, so analyzing this complex data is a tedious task. People are often in control of whom they interact with, and those choices make up their social circle. Social media gives people a platform to express their thoughts. Because users are free to enter text in any form, the data may be inconsistent. To remove these inconsistencies, the data is first normalized using a transformation. Analyzing such online social communities and predicting their behavior is of great importance to both academia and business. In this research, we first extract LinkedIn data and apply a normalization technique to remove redundancies. We then apply a hierarchical clustering algorithm to the normalized data set to cluster the data by job title, and visualize the clustered data as a tree and a dendrogram. Second, we extract tweets about “MCDResults” (the MCD results of Delhi) using the Twitter API and the R tool for social data analysis. We then apply a preprocessing technique to clean the data for further analysis and cluster the tweets by geolocation using the K-means clustering algorithm. We also describe different approaches to community detection and compare them on the modularity function using real-world network datasets of varying sizes.
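
The modularity function used to compare community detection approaches can be illustrated with a minimal Python sketch. It assumes the standard Newman-Girvan definition for an undirected, unweighted graph without self-loops, Q = sum over communities c of [L_c / m - (d_c / 2m)^2], where m is the number of edges, L_c the number of edges inside c, and d_c the total degree of the nodes in c. The abstract does not state which formulation the work uses, so this definition, the function name `modularity`, and the toy graph are assumptions for illustration only, not the author's implementation or data.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition (illustrative sketch).

    edges        -- list of (u, v) pairs; undirected, unweighted, no self-loops
    community_of -- dict mapping every node to its community label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    internal_edges = defaultdict(int)  # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)      # d_c: total degree of nodes in community c

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal_edges[cu] += 1

    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy graph: two triangles joined by a single bridge edge.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_two = modularity(EDGES, two_groups)  # 5/14, about 0.357
q_one = modularity(EDGES, one_group)   # 0.0: a single community scores zero
```

A higher score means the partition places more edges inside communities than a random graph with the same degrees would, so the two-triangle split ranks above the single-community partition. Partitions produced by different detection algorithms on the same network can be ranked in the same way.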
