Multi-Document Summarization by Information Distance

Cited by: 4
Authors
Long, Chong [1]
Huang, Minlie [1]
Zhu, Xiaoyan [1]
Li, Ming [2]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing, Peoples R China
[2] Univ Waterloo, Sch Comp Sci, Waterloo, ON N2L 3G1, Canada
Keywords
Data Mining; Text Mining; Kolmogorov Complexity; Information Distance;
DOI
10.1109/ICDM.2009.107
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Fast-changing knowledge on the Internet can be acquired more efficiently with the help of automatic document summarization and updating techniques. This paper describes a novel approach to multi-document update summarization. The best summary is defined as the one that has the minimum information distance to the entire document set. The best update summary has the minimum conditional information distance to a document cluster, given that a prior document cluster has already been read. Experiments on the DUC 2007 and TAC 2008 datasets show that the method's summaries correlate closely with human summaries and that it outperforms other programs, such as LexRank, in many categories under the ROUGE evaluation criterion.
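The selection criterion in the abstract can be illustrated with a minimal sketch. Kolmogorov complexity is not computable, so this sketch assumes compressed length under zlib as a crude stand-in and assumes a greedy, sentence-level search; neither choice is taken from the paper, and the names `compressed_len`, `distance`, and `greedy_summary` are hypothetical. The `prior` argument gives a rough analogue of the conditional (update) case by measuring only the bytes needed beyond an already-read cluster.

```python
import zlib


def compressed_len(text: str) -> int:
    """Compressed size in bytes: an assumed, crude proxy for K(text)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def distance(x: str, y: str, prior: str = "") -> int:
    """Approximate max{K(x|y, prior), K(y|x, prior)}.

    Uses the common compression heuristic K(a|b) ~ C(b + a) - C(b).
    With prior == "" this is the unconditional case.
    """
    joint = compressed_len(" ".join([prior, x, y]))
    given_x = compressed_len(" ".join([prior, x]))
    given_y = compressed_len(" ".join([prior, y]))
    return max(joint - given_x, joint - given_y)


def greedy_summary(sentences, max_sentences, prior=""):
    """Greedily add the sentence that most reduces distance to the full set.

    The greedy search is an assumption made for illustration only.
    """
    docs = " ".join(sentences)
    chosen = []
    remaining = list(sentences)
    while remaining and len(chosen) < max_sentences:
        best = min(
            remaining,
            key=lambda s: distance(" ".join(chosen + [s]), docs, prior),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    cluster = [
        "The storm closed the airport on Monday.",
        "Flights resumed on Tuesday after the storm passed.",
        "Local schools stayed closed for two days.",
    ]
    print(greedy_summary(cluster, 2))
```

Compression on short texts is noisy, so the sketch only demonstrates the shape of the optimization (pick the candidate summary with the smallest distance to the document set), not the quality reported in the experiments.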
Pages: 866 / +
Page count: 2
Related papers
50 in total
  • [1] Geodesic Distance based Multi-document Summarization
    Ma, Huifang
    He, Qing
    Shi, Zhongzhi
    [J]. IEEE NLP-KE 2008: PROCEEDINGS OF INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, 2008, : 54 - 59
  • [2] Multi-document summarization for terrorism information extraction
    Wang, Fu Lee
    Yang, Christopher C.
    Shi, Xiaodong
    [J]. INTELLIGENCE AND SECURITY INFORMATICS, PROCEEDINGS, 2006, 3975 : 602 - 608
  • [3] Personalized Multi-Document Summarization in information retrieval
    Yang, Xiao-Peng
    Liu, Xiao-Rong
    [J]. PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 4108 - +
  • [4] Multi-document summarization as applied in information retrieval
    Zhou, Dan
    Li, Lei
    [J]. PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (NLP-KE'07), 2007, : 203 - +
  • [5] A Multi-Document Coverage Reward for RELAXed Multi-Document Summarization
    Parnell, Jacob
    Unanue, Inigo Jauregi
    Piccardi, Massimo
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 5112 - 5128
  • [6] Event graphs for information retrieval and multi-document summarization
    Glavas, Goran
    Snajder, Jan
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (15) : 6904 - 6916
  • [7] MULTI-DOCUMENT VIDEO SUMMARIZATION
    Wang, Feng
    Merialdo, Bernard
    [J]. ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009, : 1326 - 1329
  • [8] On redundancy in multi-document summarization
    Calvo, Hiram
    Carrillo-Mendoza, Pabel
    Gelbukh, Alexander
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2018, 34 (05) : 3245 - 3255
  • [9] Abstractive Multi-Document Summarization
    Ranjitha, N. S.
    Kallimani, Jagadish S.
    [J]. 2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 1690 - 1693
  • [10] Weighted consensus multi-document summarization
    Wang, Dingding
    Li, Tao
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2012, 48 (03) : 513 - 523