TopicSketch: Real-Time Bursty Topic Detection from Twitter

Cited by: 130
Authors
Xie, Wei [1]
Zhu, Feida [1]
Jiang, Jing [1]
Lim, Ee-Peng [1]
Wang, Ke [2]
Affiliations
[1] Singapore Management Univ, Living Analyt Res Ctr, Singapore 178902, Singapore
[2] Simon Fraser Univ, Burnaby, BC V5A 1S6, Canada
Funding
National Research Foundation, Singapore
Keywords
TopicSketch; tweet stream; bursty topic; realtime;
DOI
10.1109/TKDE.2016.2556661
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Twitter has become one of the largest microblogging platforms for users around the world to share anything happening around them with friends and beyond. A bursty topic in Twitter is one that triggers a surge of relevant tweets within a short period of time, which often reflects important events of mass interest. How to leverage Twitter for early detection of bursty topics has therefore become an important research problem with immense practical value. Despite the wealth of research on topic modelling and analysis in Twitter, detecting bursty topics in real time remains a challenge. Because existing methods can hardly scale to handle the tweet stream in real time, we propose TopicSketch, a sketch-based topic model together with a set of techniques to achieve real-time detection. We evaluate our solution on a tweet stream with over 30 million tweets. Our experimental results show both the efficiency and the effectiveness of our approach. In particular, we demonstrate that TopicSketch on a single machine can potentially handle hundreds of millions of tweets per day, which is on the same scale as the total number of daily tweets in Twitter, and can present bursty events at a finer granularity.
Pages: 2216-2229
Page count: 14
Related papers
  • [1] TopicSketch: Real-time Bursty Topic Detection from Twitter
    Xie, Wei
    Zhu, Feida
    Jiang, Jing
    Lim, Ee-Peng
    Wang, Ke
    [J]. 2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 837 - 846
  • [2] Topic Sketch: Real Time Bursty Topic Detection From Social Media
    Keshav, B.
    Rajeshwari, J.
    [J]. 2017 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2017, : 904 - 908
  • [3] Real-Time Top-R Topic Detection on Twitter with Topic Hijack Filtering
    Hayashi, Kohei
    Maehara, Takanori
    Toyoda, Masashi
    Kawarabayashi, Ken-ichi
    [J]. KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, : 417 - 426
  • [4] Real-Time Detection of COVID-19 Events From Twitter: A Spatial-Temporally Bursty-Aware Method
    Fei, Gaolei
    Cheng, Yong
    Ma, Wanlun
    Chen, Chao
    Wen, Sheng
    Hu, Guangmin
    [J]. IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2023, 10 (02) : 656 - 672
  • [5] A Refined Method for Detecting Interpretable and Real-Time Bursty Topic in Microblog Stream
    Zhang, Tao
    Zhou, Bin
    Huang, Jiuming
    Jia, Yan
    Zhang, Bing
    Li, Zhi
    [J]. WEB INFORMATION SYSTEMS ENGINEERING, WISE 2017, PT I, 2017, 10569 : 3 - 17
  • [6] Real-Time Detection of Traffic From Twitter Stream Analysis
    D'Andrea, Eleonora
    Ducange, Pietro
    Lazzerini, Beatrice
    Marcelloni, Francesco
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2015, 16 (04) : 2269 - 2283
  • [7] A Framework for Real-Time Spam Detection in Twitter
    Gupta, Himank
    Jamal, Mohd. Saalim
    Madisetty, Sreekanth
    Desarkar, Maunendra Sankar
    [J]. 2018 10TH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS & NETWORKS (COMSNETS), 2018, : 380 - 387
  • [8] Real-Time Topic Detection with Dynamic Windows
    Su, Na
    Ji, Shujuan
    Liu, Jimin
    [J]. COMPUTER JOURNAL, 2020, 63 (03) : 469 - 478
  • [9] A survey on real-time event detection from the Twitter data stream
    Hasan, Mahmud
    Orgun, Mehmet A.
    Schwitter, Rolf
    [J]. JOURNAL OF INFORMATION SCIENCE, 2018, 44 (04) : 443 - 463
  • [10] Real-time trending topics detection and description from Twitter content
    Madani, Amina
    Boussaid, Omar
    Zegour, Djamel Eddine
    [J]. SOCIAL NETWORK ANALYSIS AND MINING, 2015, 5 (01) : 1 - 13