Maximal Sequence Mining Approach for Topic Detection from Microblog Streams

Cited by: 0
Authors
Jafariakinabad, Fereshteh [1]
Hua, Kien A. [1]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The unprecedented expansion of user-generated content in recent years demands more effective information filtering to extract high-quality information from the huge amount of available data. In particular, topic detection from microblog streams is the first step toward monitoring and summarizing social data. This task is challenging because microblog content is short and noisy. Moreover, the underlying models need to handle heterogeneous streams that contain multiple stories evolving simultaneously. In this work, we introduce a frequent pattern mining approach for topic detection from a microblog stream. This approach first uses a Maximal Sequence Mining (MSM) algorithm to extract pattern sequences, each an ordered set of terms. This scheme can capture more semantic information than using unordered sets of the same terms. A pattern graph, which is a directed-graph representation of the mined sequences, can then be constructed. Subsequently, a community detection algorithm is applied to the pattern graph to group the mined patterns into different topic clusters. Experiments on Twitter datasets demonstrate that the MSM approach achieves high performance in comparison with state-of-the-art methods.
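The three stages named in the abstract (mining maximal ordered term sequences, building a directed pattern graph, and grouping patterns into topics) can be illustrated with a minimal Python sketch. This is not the authors' MSM algorithm: the brute-force subsequence enumeration, the `min_support` and `max_len` parameters, all function names, and the use of connected components as a stand-in for a community detection algorithm are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_sequences(posts, min_support=2, max_len=3):
    """Count ordered term subsequences (gaps allowed), once per post.

    Brute-force enumeration; an assumption for illustration only.
    """
    counts = Counter()
    for post in posts:
        terms = post.lower().split()
        seen = set()
        for n in range(2, max_len + 1):
            for idx in combinations(range(len(terms)), n):
                seen.add(tuple(terms[i] for i in idx))
        counts.update(seen)
    return {seq for seq, c in counts.items() if c >= min_support}


def is_subsequence(short, long):
    """True if `short` appears in `long` in the same order (gaps allowed)."""
    it = iter(long)
    return all(term in it for term in short)


def maximal_sequences(frequent):
    """Keep only frequent sequences not contained in a longer frequent one."""
    return {s for s in frequent
            if not any(s != t and is_subsequence(s, t) for t in frequent)}


def pattern_graph(patterns):
    """Directed graph: an edge from each term to the next term in a pattern."""
    graph = defaultdict(set)
    for seq in patterns:
        for a, b in zip(seq, seq[1:]):
            graph[a].add(b)
    return graph


def topic_clusters(patterns):
    """Group patterns that share terms via union-find.

    Connected components are a simple stand-in for community detection.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for seq in patterns:
        for term in seq[1:]:
            parent[find(term)] = find(seq[0])
    clusters = defaultdict(set)
    for seq in patterns:
        clusters[find(seq[0])].add(seq)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical toy stream with two stories evolving at once.
posts = [
    "earthquake hits japan coast",
    "earthquake hits japan today",
    "apple launches new phone",
    "apple launches new tablet",
]
patterns = maximal_sequences(frequent_sequences(posts))
graph = pattern_graph(patterns)
topics = topic_clusters(patterns)
```

Because the mined patterns are tuples rather than sets, a post such as "japan hits earthquake" would not support the pattern ("earthquake", "hits", "japan"), which is the ordering information the abstract says unordered term sets lose.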
Pages: 8