A Scoring Model Assisted by Frequency for Multi-Document Summarization

Cited by: 0
Authors
Yu, Yue [1 ,3 ]
Wu, Mutong [1 ]
Su, Weifeng [1 ,2 ]
Cheung, Yiu-ming [3 ]
Affiliations
[1] Div Sci & Technol, Comp Sci & Technol Programme, Hefei, Peoples R China
[2] BNU HKBU United Int Coll, Guangdong Key Lab AI & Multimodal Data Proc, Zhuhai, Guangdong, Peoples R China
[3] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Peoples R China
Keywords
Multiple document summarization; Position information; Frequency; Graph
DOI
10.1007/978-3-030-86383-8_25
Chinese Library Classification (CLC): TP18 [Theory of Artificial Intelligence]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
While position information plays a significant role in sentence scoring for single-document summarization, the repetition of content across different documents greatly affects the salience scores of sentences in multi-document summarization. Introducing frequency information can help identify important sentences that are generally overlooked when only position information is considered. Therefore, in this paper, we propose a scoring model, SAFA (Self-Attention with Frequency Graph), which combines position information with frequency to identify the salience of sentences. The SAFA model constructs a frequency graph at the multi-document level based on the repetition of sentence content and assigns an initial score to each sentence based on the graph. The model then uses position-aware gold scores to train a self-attention mechanism, obtaining the significance of each sentence within its single document. The score of each sentence is updated by combining the position and frequency information. We train and test the SAFA model on the large-scale multi-document dataset Multi-News. Extensive experimental results show that by incorporating frequency information in sentence scoring, the model outperforms other state-of-the-art extractive models.
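The abstract only outlines the idea of cross-document frequency scoring, so the following is a minimal sketch of that idea rather than the paper's actual model: sentences from different documents are linked in a graph when their content overlaps, the normalized degree serves as an initial frequency score, and a simple linear interpolation stands in for the learned combination with position-aware scores. All function names, the overlap threshold, and the weight `alpha` are illustrative assumptions.

```python
# Sketch of frequency-graph scoring in the spirit of SAFA, based only on the
# abstract above. Names, threshold, and `alpha` are assumptions, not the paper's
# exact formulation.
from collections import Counter
from itertools import combinations


def sentence_tokens(sentence):
    """Lowercase bag-of-words representation of a sentence."""
    return Counter(sentence.lower().split())


def overlap(s1, s2):
    """Jaccard-style content overlap between two token bags (assumed similarity measure)."""
    t1, t2 = sentence_tokens(s1), sentence_tokens(s2)
    inter = sum((t1 & t2).values())
    union = sum((t1 | t2).values())
    return inter / union if union else 0.0


def frequency_scores(documents, threshold=0.3):
    """Build a cross-document frequency graph: sentences from *different* documents
    are connected when their content overlap exceeds `threshold`; a sentence's
    initial score is its normalized degree, i.e. how often its content repeats."""
    sentences = [(d, s) for d, doc in enumerate(documents) for s in doc]
    degree = {i: 0 for i in range(len(sentences))}
    for (i, (d1, s1)), (j, (d2, s2)) in combinations(enumerate(sentences), 2):
        if d1 != d2 and overlap(s1, s2) >= threshold:
            degree[i] += 1
            degree[j] += 1
    max_deg = max(degree.values()) or 1
    return {i: degree[i] / max_deg for i in degree}, sentences


def combined_scores(documents, position_scores, alpha=0.5):
    """Combine single-document position-aware scores with cross-document frequency
    scores; linear interpolation stands in for the model's learned combination."""
    freq, sentences = frequency_scores(documents)
    return [
        (sent, alpha * position_scores[i] + (1 - alpha) * freq[i])
        for i, (_, sent) in enumerate(sentences)
    ]


if __name__ == "__main__":
    docs = [
        ["The storm hit the coast on Monday.", "Officials urged residents to evacuate."],
        ["A powerful storm hit the coast on Monday.", "Schools were closed for two days."],
    ]
    # Toy position-aware scores (earlier sentences scored higher), one per sentence.
    pos = [1.0, 0.5, 1.0, 0.5]
    for sent, score in sorted(combined_scores(docs, pos), key=lambda x: -x[1]):
        print(f"{score:.2f}  {sent}")
```

In this toy run, the sentence repeated across both documents ranks above same-position sentences that appear only once, which is the intuition the abstract describes.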
Pages: 309-320
Page count: 12