CMPK: a high accuracy microblog user classification method for professional analysis

Cited by: 0
Authors
Peng, Ying [1]
Wang, Haiquan [1]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China
Keywords
text mining; user classification; vector space model; K-Nearest Neighbor algorithm
DOI
10.1109/CSC.2013.28
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Analyzing and mining the massive data recorded in microblogs to discover the characteristics and rules of individual, group, and interactive behaviors is now a research hotspot in massive data mining and behavioral analysis. However, existing research largely neglects the influence of social attributes, such as a user's occupation, on that user's behavior and social relations. To address this issue, the paper proposes CMPK (Classification Method based on Professional lexicon and K-nearest neighbor algorithm), a high-accuracy microblog user classification method for professional analysis. The method combines a vector space model with a professional lexicon and the KNN (K-Nearest Neighbor) classification algorithm to infer the industry a microblog user belongs to from the information the user posts online. Experiments show that CMPK achieves an accuracy of nearly 90%.
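
The combination named in the abstract (a vector space model restricted to a professional lexicon, followed by KNN) can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon, the training posts, the labels, the choice of cosine similarity, and k = 3 are all assumptions made for illustration, and `vectorize`, `cosine`, and `knn_classify` are hypothetical helper names.

```python
import math
from collections import Counter

# Hypothetical professional lexicon: terms assumed to signal an industry.
LEXICON = ["code", "server", "bug", "patient", "clinic", "surgery"]

# Hypothetical labeled posts; a real system would use many posts per user.
TRAINING = [
    ("fixed a bug in the server code", "IT"),
    ("deployed new code to the server", "IT"),
    ("code review found another bug", "IT"),
    ("saw a patient at the clinic", "medical"),
    ("long surgery at the clinic today", "medical"),
    ("patient recovering after surgery", "medical"),
]

def vectorize(text, lexicon=LEXICON):
    """Represent a text as term counts over the lexicon (vector space model)."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in lexicon]

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_classify(text, labeled=TRAINING, k=3):
    """Return the majority label among the k most similar labeled texts."""
    query = vectorize(text)
    ranked = sorted(
        labeled,
        key=lambda item: cosine(query, vectorize(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

print(knn_classify("server bug again"))
print(knn_classify("the patient had surgery"))
```

Restricting the vector to lexicon terms keeps everyday vocabulary from dominating the similarity score, which is one plausible reason to pair a professional lexicon with KNN; the paper itself should be consulted for the actual weighting and distance measure.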
Pages: 134-139
Page count: 6