A Scoring Model Assisted by Frequency for Multi-Document Summarization

Authors
Yu, Yue [1 ,3 ]
Wu, Mutong [1 ]
Su, Weifeng [1 ,2 ]
Cheung, Yiu-ming [3 ]
Affiliations
[1] Div Sci & Technol, Comp Sci & Technol Programme, Hefei, Peoples R China
[2] BNU HKBU United Int Coll, Guangdong Key Lab AI & Multimodal Data Proc, Zhuhai, Guangdong, Peoples R China
[3] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Peoples R China
Keywords
Multiple document summarization; Position information; Frequency; Graph;
DOI
10.1007/978-3-030-86383-8_25
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While position information plays a significant role in sentence scoring for single-document summarization, the repetition of content across documents greatly affects the salience scores of sentences in multi-document summarization. Introducing frequency information can help identify important sentences that are generally overlooked when only position information is considered. Therefore, in this paper, we propose a scoring model, SAFA (Self-Attention with Frequency Graph), which combines position information with frequency to identify the salience of sentences. The SAFA model constructs a frequency graph at the multi-document level based on the repetition of content among sentences, and assigns an initial score to each sentence based on the graph. The model then uses position-aware gold scores to train a self-attention mechanism, obtaining each sentence's significance at the single-document level. The score of each sentence is updated by combining position and frequency information. We train and test the SAFA model on the large-scale multi-document dataset Multi-News. Extensive experimental results show that the model incorporating frequency information in sentence scoring outperforms the other state-of-the-art extractive models.
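The abstract does not give the graph construction or the combination rule, so the following is only a rough sketch of the general idea, not the SAFA implementation. It assumes that "repetition of content" can be approximated by Jaccard overlap of word sets, that a sentence's initial frequency score is its normalized number of cross-document neighbours, that a simple `1 / (1 + index)` decay stands in for the trained self-attention score, and that the two scores are mixed linearly. The function names and the `threshold` and `alpha` parameters are hypothetical.

```python
from itertools import combinations


def tokens(sentence):
    """Lowercased word set; a crude stand-in for real content matching."""
    return {w.strip(".,;:!?\"'()").lower() for w in sentence.split()} - {""}


def jaccard(a, b):
    """Overlap between two token sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def frequency_scores(docs, threshold=0.3):
    """Score each sentence by how many sentences in OTHER documents repeat it.

    docs: list of documents, each a list of sentence strings.
    Returns {(doc_idx, sent_idx): score in [0, 1]}.
    """
    nodes = [(d, s) for d, doc in enumerate(docs) for s in range(len(doc))]
    bags = {n: tokens(docs[n[0]][n[1]]) for n in nodes}
    degree = {n: 0 for n in nodes}
    for u, v in combinations(nodes, 2):
        # Assumption: only cross-document repetition adds an edge.
        if u[0] != v[0] and jaccard(bags[u], bags[v]) >= threshold:
            degree[u] += 1
            degree[v] += 1
    top = max(degree.values(), default=0)
    return {n: (degree[n] / top if top else 0.0) for n in nodes}


def position_scores(docs):
    """Placeholder for a learned single-document score: earlier ranks higher."""
    return {
        (d, s): 1.0 / (1 + s)
        for d, doc in enumerate(docs)
        for s in range(len(doc))
    }


def combined_scores(docs, alpha=0.5, threshold=0.3):
    """Linear mix of position and frequency scores (alpha is hypothetical)."""
    freq = frequency_scores(docs, threshold)
    pos = position_scores(docs)
    return {n: alpha * pos[n] + (1 - alpha) * freq[n] for n in freq}
```

Under these assumptions, a sentence that appears late in one document but is repeated in another document can outrank a lead sentence that no other document repeats, which is the effect the abstract attributes to frequency information.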
Pages: 309-320
Number of pages: 12