Single document summarization using the information from documents with the same topic

Cited by: 5
Authors
Mao, Xiangke [1 ,2 ,3 ]
Huang, Shaobin [1 ]
Shen, Linshan [1 ]
Li, Rongsheng [1 ]
Yang, Hui [2 ,3 ]
Affiliations
[1] Harbin Engn Univ, Coll Comp Sci & Technol, Harbin 150001, Peoples R China
[2] CETC Big Data Res Inst Co Ltd, Guiyang 550022, Peoples R China
[3] Big Data Applicat Improving Govt Governance Capab, Guiyang 550022, Peoples R China
Keywords
Extractive summarization; Neighborhood documents; Graph model; Biased LexRank; Sentence scoring techniques
DOI
10.1016/j.knosys.2021.107265
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The essence of extractive summarization is to measure the importance of the sentences in a document. When a summary is extracted from a single document, it is difficult to evaluate sentence importance comprehensively and effectively because of the lack of information. In this paper, we propose a single-document summarization method that uses information from documents on the same topic. The method integrates topic information from neighborhood documents with statistical information from the target document to calculate a score for each sentence. These scores then serve as prior scores for the sentences in the target document. After the target document is represented as a sentence graph, the final sentence scores are obtained with a biased random walk algorithm. Finally, the Maximal Marginal Relevance (MMR) algorithm selects the sentences that form the summary. Experimental results on the DUC2001 and DUC2002 datasets show that incorporating information from documents on the same topic improves summarization performance. (C) 2021 Elsevier B.V. All rights reserved.
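The abstract gives no formulas, so the following is only a minimal sketch of the general pipeline it names (prior scores, sentence graph, biased random walk, MMR selection), not the authors' implementation. Everything specific here is an assumption made for illustration: the bag-of-words cosine similarity, the damping factor 0.85, the MMR trade-off `lam`, the function names, and the prior scores, which in the paper come from neighborhood-document topic information and target-document statistics but are simply passed in as numbers below.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed edge weight)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def biased_random_walk(sim, prior, damping=0.85, iters=100, tol=1e-8):
    """Random walk on the sentence graph whose restart distribution is the
    normalized prior (a Biased-LexRank-style iteration)."""
    n = len(sim)
    total = sum(prior)
    bias = [p / total for p in prior] if total > 0 else [1.0 / n] * n
    # Row-normalize similarities into transition probabilities, dropping self-loops.
    trans = []
    for i in range(n):
        row = [sim[i][j] if i != j else 0.0 for j in range(n)]
        s = sum(row)
        # A sentence with no neighbors jumps according to the prior.
        trans.append([w / s for w in row] if s > 0 else bias[:])
    score = bias[:]
    for _ in range(iters):
        new = [
            (1 - damping) * bias[j]
            + damping * sum(score[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(x - y) for x, y in zip(new, score))
        score = new
        if delta < tol:
            break
    return score


def mmr_select(score, sim, k, lam=0.7):
    """Greedy MMR: trade off relevance against similarity to sentences already chosen."""
    top = max(score) if score else 0.0
    selected = []
    candidates = set(range(len(score)))
    while candidates and len(selected) < k:
        def mmr(i):
            relevance = score[i] / top if top > 0 else 0.0
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(sorted(candidates), key=mmr)  # ties go to the lowest index
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy usage with hypothetical sentences and hypothetical prior scores.
sentences = [
    "the cat sat on the mat",
    "the cat sat on the mat today",
    "stock prices fell sharply",
    "investors sold stock as prices fell",
]
bows = [Counter(s.split()) for s in sentences]
sim = [[cosine(a, b) for b in bows] for a in bows]
prior = [1.0, 1.0, 2.0, 2.0]
scores = biased_random_walk(sim, prior)
summary = [sentences[i] for i in mmr_select(scores, sim, k=2)]
```

Because the restart distribution and every transition row sum to one, the scores remain a probability distribution over sentences. `mmr_select` rescales them by their maximum so that relevance and redundancy are on comparable scales; that rescaling is a design choice of this sketch rather than something stated in the abstract.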
Pages: 11
Related papers
50 in total
  • [21] Using cross-document random walks for topic-focused multi-document summarization
    Wan, Xiaojun
    Yang, Jianwu
    Xiao, Jianguo
    2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, (WI 2006 MAIN CONFERENCE PROCEEDINGS), 2006, : 1012 - +
  • [22] Topic Attentional Neural Network for Abstractive Document Summarization
    Liu, Hao
    Zheng, Hai-Tao
    Wang, Wei
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2019, PT II, 2019, 11440 : 70 - 81
  • [23] Fuzzy clustering for topic analysis and summarization of document collections
    Witte, Rene
    Bergler, Sabine
    ADVANCES IN ARTIFICIAL INTELLIGENCE, 2007, 4509 : 476 - +
  • [24] Spoken document summarization using acoustic, prosodic and semantic information
    Huang, CL
    Hsieh, CH
    Wu, CH
    2005 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1 AND 2, 2005, : 434 - 437
  • [25] Automatic Multi-Document Summarization for Indonesian Documents Using Hybrid Abstractive-Extractive Summarization Technique
    Yapinus, Glorian
    Erwin, Alva
    Galinium, Maulahikmah
    Muliady, Wahyu
    2014 6TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND ELECTRICAL ENGINEERING (ICITEE), 2014, : 39 - 43
  • [26] A method for the automatic summarization of topic-based clusters of documents
    Pons-Porrata, A
    Ruiz-Shulcloper, J
    Berlanga-Llavori, R
    PROGRESS IN PATTERN RECOGNITION, SPEECH AND IMAGE ANALYSIS, 2003, 2905 : 596 - 603
  • [27] Spoken document summarization using topic-related corpus and semantic dependency grammar
    Hsieh, CH
    Huang, CL
    Wu, CH
    2004 INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2004, : 333 - 336
  • [28] Multi-document summarization using probabilistic topic-based network models
    1613, Institute of Information Science (32):
  • [29] Korean document summarization using topic phrases extraction and locality-based similarity
    Ryu, J
    Han, KR
    Rim, KW
    FOUNDATIONS OF INTELLIGENT SYSTEMS, 2003, 2871 : 320 - 325
  • [30] Topic Oriented Multi-document Summarization Using LSA, Syntactic and Semantic Features
    Anjaneyulu, M.
    Sarma, S. S. V. N.
    Reddy, P. Vijaya Pal
    Chander, K. Prem
    Nagaprasad, S.
    INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING AND COMMUNICATIONS, VOL 2, 2019, 56 : 487 - 502