Social Media Analysis using Optimized K-Means Clustering

Cited by: 0
Authors
Alsayat, Ahmed [1]
El-Sayed, Hoda [1]
Affiliations
[1] Bowie State Univ, Dept Comp Sci, Bowie, MD 20715 USA
Keywords
K-Means; Genetic Algorithm; Clustering; Social Media Analysis; Data Mining;
DOI
Not available
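Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;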
Abstract
The increasing influence of social media and the enormous participation of users create new opportunities to study human social behavior, along with the capability to analyze large volumes of data streams. One interesting problem is to distinguish between different kinds of users, for example users who act as leaders and introduce new issues and discussions on social media. Furthermore, positive or negative attitudes can also be inferred from those discussions. Such problems require a formal interpretation of social media logs and of the unit of information that can spread from person to person through the social network. Once social media data such as user messages are parsed and network relationships are identified, data mining techniques can be applied to group different types of communities. However, the appropriate granularity of user communities and their behavior is hardly captured by existing methods. In this paper, we present a framework for the novel task of detecting communities by clustering messages from large streams of social data. Our framework uses the K-Means clustering algorithm along with a genetic algorithm and the Optimized Cluster Distance (OCD) method to cluster data. The goal of the proposed framework is twofold: to overcome the difficulty general K-Means has in choosing the best initial centroids, by using a genetic algorithm, and to maximize the distance between clusters by pairwise clustering using OCD, in order to obtain accurate clusters. We used various cluster validation metrics to evaluate the performance of our algorithm. The analysis shows that the proposed method gives better clustering results and provides a novel use case of grouping user communities based on their activities. Our approach is optimized and scalable for real-time clustering of social media data.
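The first goal named in the abstract, using a genetic algorithm to pick initial centroids for K-Means, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the chromosome encoding (k distinct point indices), the fitness function (sum of squared distances to the nearest centroid), the selection, crossover and mutation rules, and the names `ga_initial_centroids` and `kmeans` are all assumptions made for illustration. The OCD step is omitted because the abstract does not define it, and the points are assumed to be numeric feature vectors already extracted from messages.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def ga_initial_centroids(points, k, pop_size=20, generations=30, rng=None):
    """Hypothetical GA: a chromosome is k distinct indices into `points`."""
    rng = rng or random.Random(0)
    n = len(points)

    def fitness(chrom):
        # Lower is better: spread-out seeds give a smaller SSE.
        return sse(points, [points[i] for i in chrom])

    population = [rng.sample(range(n), k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]      # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            pool = list(dict.fromkeys(a + b))        # crossover: union of parent genes
            child = rng.sample(pool, k)
            if rng.random() < 0.2:                   # mutation: swap in an unused index
                unused = [i for i in range(n) if i not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(child)
        population = survivors + children
    best = min(population, key=fitness)
    return [points[i] for i in best]

def kmeans(points, centroids, max_iter=100):
    """Standard Lloyd iterations starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: mean of each cluster; an empty cluster keeps its centroid.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

# Toy data: two well-separated groups of 2-D feature vectors.
demo_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
seeds = ga_initial_centroids(demo_points, k=2)
centroids, labels = kmeans(demo_points, seeds)
print(centroids, labels)
```

Seeding from existing data points keeps every candidate centroid inside the data range, and retaining the better half of each generation means the best seed set found so far is never lost. Both are design choices of this sketch and are not taken from the paper.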
Pages: 61-66
Number of pages: 6
Related papers
50 in total
  • [42] A K-means Clustering with Optimized Initial Center Based on Hadoop Platform
    Lin, Kunhui
    Li, Xiang
    Zhang, Zhongnan
    Chen, Jiahong
    2014 PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE 2014), 2014, : 263 - 266
  • [43] An Optimized Initialization Center K-means Clustering Algorithm based on Density
    Yuan, Qilong
    Shi, Haibo
    Zhou, Xiaofeng
    2015 IEEE INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (CYBER), 2015, : 790 - 794
  • [44] Analysis of Electricity Consumption at Home Using K-means Clustering Algorithm
    Choi, Hyun Wong
    Qureshi, Nawab Muhammad Faseeh
    Shin, Dong Ryeol
    2019 21ST INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION TECHNOLOGY (ICACT): ICT FOR 4TH INDUSTRIAL REVOLUTION, 2019, : 639 - 643
  • [45] Optimized Cartesian K-Means
    Wang, Jianfeng
    Wang, Jingdong
    Song, Jingkuan
    Xu, Xin-Shun
    Shen, Heng Tao
    Li, Shipeng
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (01) : 180 - 192
  • [46] K-Means Clustering Algorithm Optimized by Particle Swarm Optimization Algorithm
    Chai, Yi
    Ma, Hao
    Zhang, Ke
    Qian, Kun
    INTERNATIONAL CONFERENCE ON CONTROL ENGINEERING AND AUTOMATION (ICCEA 2014), 2014, : 852 - 857
  • [47] An Optimized Interpolation Model Based on K-means Clustering for Rainfall Calculation
    Zhang, Lelin
    Xiu, Jiapeng
    Yang, Zhengqiu
    Liu, Chen
    2020 5TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2020), 2020, : 1194 - 1198
  • [48] An Analysis of Students' Academic Performance Using K-Means Clustering Algorithm
    Ahmad, Maryam
    Arshad, Noreen Izza Bt
    Sarlan, Aliza Bt
    ADVANCES ON INTELLIGENT INFORMATICS AND COMPUTING: HEALTH INFORMATICS, INTELLIGENT SYSTEMS, DATA SCIENCE AND SMART COMPUTING, 2022, 127 : 309 - 318
  • [49] Optimized K-Means Algorithm
    Belhaouari, Samir Brahim
    Ahmed, Shahnawaz
    Mansour, Samer
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2014, 2014
  • [50] Analyzing the Evolution of Rare Events via Social Media Data and k-means Clustering Algorithm
    Lu, Xiaoyu Sean
    Z, Mengchu
    2016 IEEE 13TH INTERNATIONAL CONFERENCE ON NETWORKING, SENSING, AND CONTROL (ICNSC), 2016,