Summarizing Weibo with Topics Compression

Cited by: 0
Authors
Litvak, Marina [1]
Vanetik, Natalia [1]
Li, Lei [2]
Affiliations
[1] Shamoon Engn Coll, Dept Software Engn, Beer Sheva, Israel
[2] Beijing Univ Posts & Telecommun, Dept Comp Sci, Beijing, Peoples R China
DOI
10.1007/978-3-319-77116-8_39
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Extractive text summarization aims to select a small subset of sentences that best preserves the content and meaning of the original document. In this paper we describe an unsupervised approach to extractive summarization that combines hierarchical topic modeling (TM) with the Minimum Description Length (MDL) principle and applies them to the Chinese language. Our summarizer strives to extract the information that best describes the text's topics in terms of MDL. The model is applied to the NLPCC 2015 Shared Task of Weibo-Oriented Chinese News Summarization [1], in which Chinese news articles were summarized with the goal of creating short, meaningful messages for Weibo [2] (Sina Weibo is a Chinese microblogging website and one of the most popular sites in China). The experimental results show that our approach outperforms other summarizers from the NLPCC 2015 competition.
Pages: 522-534
Page count: 13