A Scoring Model Assisted by Frequency for Multi-Document Summarization

Cited by: 0

Authors:

Yu, Yue ^{[1,3]}

Wu, Mutong ^{[1]}

Su, Weifeng ^{[1,2]}

Cheung, Yiu-ming ^{[3]}

Affiliations:

[1] Div Sci & Technol, Comp Sci & Technol Programme, Hefei, Peoples R China

[2] BNU HKBU United Int Coll, Guangdong Key Lab AI & Multimodal Data Proc, Zhuhai, Guangdong, Peoples R China

[3] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Peoples R China

Source:

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2021, PT V | 2021 / Vol. 12895

Keywords:

Multiple document summarization; Position information; Frequency; Graph;

DOI:

10.1007/978-3-030-86383-8_25

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

While position information plays a significant role in sentence scoring for single-document summarization, the repetition of content across documents strongly affects the salience scores of sentences in multi-document summarization. Introducing frequency information can help identify important sentences that are generally overlooked when only position information is considered. Therefore, in this paper, we propose a scoring model, SAFA (Self-Attention with Frequency Graph), which combines position information with frequency to identify the salience of sentences. The SAFA model constructs a frequency graph at the multi-document level based on the repetition of content among sentences, and assigns an initial score to each sentence based on the graph. The model then uses position-aware gold scores to train a self-attention mechanism, obtaining each sentence's significance at the single-document level. The score of each sentence is then updated by combining position and frequency information. We train and test the SAFA model on the large-scale multi-document dataset Multi-News. Extensive experimental results show that the model incorporating frequency information in sentence scoring outperforms other state-of-the-art extractive models.
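The abstract gives no formulas, so the following is only a rough sketch of the general idea (a cross-document repetition graph whose scores are blended with a position-based score), not the SAFA model itself. Every concrete choice here is an assumption made for illustration: Jaccard word overlap as the measure of repeated content, the 0.5 edge threshold, node degree as the initial frequency score, a simple 1/(position+1) prior standing in for the trained self-attention scorer, and the linear mixing weight `alpha`.

```python
from itertools import combinations


def tokenize(sentence):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return {w.strip(".,;:!?\"'()").lower() for w in sentence.split()} - {""}


def jaccard(a, b):
    """Word-overlap similarity between two token sets (assumed measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def frequency_scores(docs, threshold=0.5):
    """Initial score per sentence from a cross-document repetition graph.

    Nodes are (doc_index, sent_index) pairs. An edge links two sentences
    from different documents whose overlap reaches `threshold`. The score
    is the node degree divided by the largest degree in the graph.
    """
    nodes = [(d, s) for d, doc in enumerate(docs) for s in range(len(doc))]
    tokens = {(d, s): tokenize(docs[d][s]) for d, s in nodes}
    degree = {n: 0 for n in nodes}
    for u, v in combinations(nodes, 2):
        if u[0] != v[0] and jaccard(tokens[u], tokens[v]) >= threshold:
            degree[u] += 1
            degree[v] += 1
    top = max(degree.values(), default=0)
    return {n: (degree[n] / top if top else 0.0) for n in nodes}


def position_scores(docs):
    """Hypothetical stand-in for a learned single-document score."""
    return {(d, s): 1.0 / (s + 1)
            for d, doc in enumerate(docs) for s in range(len(doc))}


def combined_scores(docs, alpha=0.5, threshold=0.5):
    """Blend position and frequency scores with an assumed linear weight."""
    freq = frequency_scores(docs, threshold)
    pos = position_scores(docs)
    return {n: alpha * pos[n] + (1 - alpha) * freq[n] for n in freq}


docs = [
    ["The storm hit the coast on Monday.",
     "Officials opened shelters.",
     "Thousands lost power after the storm."],
    ["Local schools were closed.",
     "Thousands lost power after the storm hit."],
]
scores = combined_scores(docs)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy input, the third sentence of the first document ranks below the second under the position prior alone, but its content is repeated in the other document, so the blended score lifts it above the second sentence. That is the kind of sentence the abstract says position-only scoring tends to miss.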


Pages: 309-320

Page count: 12


[21] Multi-Document Summarization for Turkish News
Demirci, Ferhat
Karabudak, Engin
Ilgen, Bahar
2017 INTERNATIONAL ARTIFICIAL INTELLIGENCE AND DATA PROCESSING SYMPOSIUM (IDAP), 2017
[22] Multi-document summarization via submodularity
Jingxuan Li
Lei Li
Tao Li
Applied Intelligence, 2012, 37 : 420 - 430
[23] Multi-document text summarization - A survey
Tandel, Amol
Modi, Brijesh
Gupta, Priyasha
Wagle, Shreya
Khedkar, Sujata
PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON DATA MINING AND ADVANCED COMPUTING (SAPIENCE), 2016, : 336 - 339
[24] Multi-document summarization via submodularity
Li, Jingxuan
Li, Lei
Li, Tao
APPLIED INTELLIGENCE, 2012, 37 (03) : 420 - 430
[25] An Overview of Research on Multi-Document Summarization
Bao R.
Sun H.
Data Analysis and Knowledge Discovery, 2024, 8 (02) : 17 - 32
[26] MULTI-DOCUMENT SUMMARIZATION OF EVALUATIVE TEXT
Carenini, Giuseppe
Cheung, Jackie Chi Kit
Pauls, Adam
COMPUTATIONAL INTELLIGENCE, 2013, 29 (04) : 545 - 576
[27] Aspect Based Multi-Document Summarization
Sahoo, Deepak
Balabantaray, Rakesh
Phukon, Mridumoni
Saikia, Saibali
2016 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND AUTOMATION (ICCCA), 2016, : 873 - 877
[28] Hierarchical Summarization: Scaling Up Multi-Document Summarization
Christensen, Janara
Soderland, Stephen
Bansal, Gagan
Mausam
PROCEEDINGS OF THE 52ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2014, : 902 - 912
[29] Multi-Document Summarization by Information Distance
Long, Chong
Huang, Minlie
Zhu, Xiaoyan
Li, Ming
2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, : 866 - +
[30] Causal Maps for Multi-Document Summarization
Strelnikoff, Sasha
Jammalamadaka, Aruna
Warmsley, Dana
2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 4437 - 4445
