Growing Story Forest Online from Massive Breaking News

Cited by: 19
Authors
Liu, Bang [1]
Niu, Di [1]
Lai, Kunfeng [2]
Kong, Linglong [1]
Xu, Yu [2]
Affiliations
[1] Univ Alberta, Edmonton, AB, Canada
[2] Tencent Inc, Mobile Internet Grp, Shenzhen, Peoples R China
Keywords
Text Clustering; Online Story Tree; Information Retrieval
DOI
10.1145/3132847.3132852
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We describe our experience of implementing a news content organization system at Tencent that discovers events from vast streams of breaking news and evolves news story structures in an online fashion. Our real-world system has distinct requirements in contrast to previous studies on topic detection and tracking (TDT) and event timeline or graph generation, in that we 1) need to accurately and quickly extract distinguishable events from massive streams of long text documents that cover diverse topics and contain highly redundant information, and 2) must develop the structures of event stories in an online manner, without repeatedly restructuring previously formed stories, in order to guarantee a consistent user viewing experience. In solving these challenges, we propose Story Forest, a set of online schemes that automatically clusters streaming documents into events, while connecting related events in growing trees to tell evolving stories. We conducted extensive evaluation based on 60 GB of real-world Chinese news data and through detailed pilot user experience studies, although our ideas are not language-dependent and can easily be extended to other languages. The results demonstrate the superior capability of Story Forest to accurately identify events and organize news text into a logical structure that is appealing to human readers, compared to multiple existing algorithm frameworks.
Pages: 777-785
Page count: 9
Related papers
  • [1] Story Forest: Extracting Events and Telling Stories from Breaking News
    Liu, Bang
    Han, Fred X.
    Niu, Di
    Kong, Linglong
    Lai, Kunfeng
    Xu, Yu
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2020, 14 (03)
  • [2] Classification Of Breaking News Taken from the Online News Sites
    Kilic, Erdal
    Tavus, Mustafa Resit
    Karhan, Zehra
    2015 23RD SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2015, : 363 - 366
  • [4] Detection of breaking news from Online web search queries
    Murata, Tsuyoshi
    NEW GENERATION COMPUTING, 2008, 26 (01) : 63 - 73
  • [5] Towards the detection of breaking news from online Web search keywords
    Murata, Tsuyoshi
    2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Workshops Proceedings, 2006, : 401 - 404
  • [6] Stylistic Analysis on Online News Story Comments
    LI Pei
    海外英语 (Overseas English), 2013, (21) : 301 - 304
  • [7] On detecting business event from the headlines and leads of massive online news articles
    Qian, Yu
    Deng, Xiongwen
    Ye, Qiongwei
    Ma, Baojun
    Yuan, Hua
    INFORMATION PROCESSING & MANAGEMENT, 2019, 56 (06)
  • [8] Breaking the News: Extracting the Sparse Citation Network Backbone of Online News Articles
    Spitz, Andreas
    Gertz, Michael
    PROCEEDINGS OF THE 2015 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM 2015), 2015, : 274 - 279
  • [9] "Breaking news" from spermatids
    Gouraud A.
    Brazeau M.-A.
    Grégoire M.-C.
    Simard O.
    Massonneau J.
    Arguin M.
    Boissonneault G.
    Basic and Clinical Andrology, 23 (1)
  • [10] The effects of micropayments on online news story selection and engagement
    Geidner, Nick
    D'Arcy, Denae
    NEW MEDIA & SOCIETY, 2015, 17 (04) : 611 - 628