Discovering Communities with Self-adaptive k Clustering in Microblog Data

Cited by: 3
Authors
Huang, Ting [1]
Peng, Dunlu [1]
Cao, Lidong [1]
Affiliation
[1] Shanghai Univ Sci & Technol, Sch Opt Elect & Comp Engn, Shanghai 201800, Peoples R China
Keywords
microblogging; clustering; adaptive k; community recognition; social network
DOI
10.1109/CGC.2012.92
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Microblogging has become a popular social network service whose user population has grown rapidly in the past few years. Many companies regard microblogging services as an indispensable medium for obtaining timely opinions directly from customers and potential customers. A community in a social network is a group of people who share similar interests or pay attention to the same things. Recognizing user communities in a microblogging service is important for identifying hot topics and users' interests, which help companies improve their marketing strategies. However, the massive volume of unstructured tweet data makes it very challenging to efficiently mine the valuable communities hidden in it. Tweet data is voluminous, spans many fields, and consists of short, unstructured messages, which makes tweets quite different from conventional text documents. To analyze the data more effectively, we propose a set of techniques for preprocessing tweets, including word identification, category matching, and data standardization. We present an unsupervised learning method that automatically clusters microblog users into communities. Within this method, an optimized CLARANS algorithm is developed according to the characteristics of microblog data. During clustering, the interactive relationships between tweets are also exploited to improve clustering quality. In addition, a self-adaptive k strategy is employed to make the proposed approach more widely applicable. To investigate the performance of our approach from different aspects, we conducted a series of experiments on microblog data collected from SINA Weibo.
Pages: 383-390
Number of pages: 8