Abstractive Summarization Model with Adaptive Sparsemax

Cited: 0
Authors
Guo, Shiqi [1 ]
Si, Yumeng [1 ]
Zhao, Jing [2 ,3 ]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200241, Peoples R China
[2] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200062, Peoples R China
[3] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai 200241, Peoples R China
Keywords
Abstractive summarization; Seq2Seq; Adaptive sparsemax;
DOI
10.1007/978-3-031-17120-8_62
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Abstractive summarization models mostly rely on sequence-to-sequence architectures, in which the softmax function is widely used to map the model output onto the probability simplex. However, softmax's output distribution often exhibits a long-tail effect, especially when the vocabulary is large: many irrelevant tokens receive non-negligible probability mass, which reduces training efficiency and effectiveness. More recently, some work has begun to design mapping functions that produce sparse output probabilities so that these irrelevant tokens can be ignored. In this paper, we propose Adaptive Sparsemax, which adaptively controls the sparsity of the model's output. Our method combines sparsemax with a temperature mechanism, where the temperature value is learned by the neural network. One advantage of our method is that it requires no hyperparameters. Experimental results on the CNN-Daily Mail and LCSTS datasets show that our method outperforms baseline models on the abstractive summarization task.
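To make the idea concrete, below is a minimal NumPy sketch of plain sparsemax (the Euclidean projection of the logits onto the probability simplex) with a fixed scalar temperature. This is an illustrative assumption, not the paper's method: in Adaptive Sparsemax the temperature is predicted by the network itself, whereas here it is passed in by hand. Lower temperatures sharpen the logits and drive more entries exactly to zero; higher temperatures yield denser outputs.

```python
import numpy as np

def sparsemax(z, temperature=1.0):
    """Project temperature-scaled logits onto the simplex (sparsemax).

    Unlike softmax, the result can assign exactly zero probability
    to low-scoring tokens. `temperature` here is a fixed scalar for
    illustration; the paper learns it adaptively.
    """
    z = np.asarray(z, dtype=float) / temperature
    # Sort logits in descending order and find the support size k:
    # the largest k with 1 + k * z_(k) > sum of the top-k logits.
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum
    k_z = k[support][-1]
    # Threshold tau so the kept entries sum to one.
    tau = (cumsum[support][-1] - 1.0) / k_z
    return np.maximum(z - tau, 0.0)

p = sparsemax([2.0, 0.0])                   # low temperature keeps one token
q = sparsemax([2.0, 0.0], temperature=4.0)  # higher temperature keeps both
```

With a logit gap larger than 1, sparsemax puts all mass on the top token (`p` is `[1.0, 0.0]`), while raising the temperature spreads mass across both entries; in both cases the output sums to one, illustrating how a temperature knob trades off sparsity against coverage.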
Pages: 810-821
Number of pages: 12
Related Papers
50 records in total
  • [1] A Combined Extractive With Abstractive Model for Summarization
    Liu, Wenfeng
    Gao, Yaling
    Li, Jinming
    Yang, Yuzhen
    [J]. IEEE ACCESS, 2021, 9 : 43970 - 43980
  • [2] Frustratingly Easy Model Ensemble for Abstractive Summarization
    Kobayashi, Hayato
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 4165 - 4176
  • [3] A Relation Enhanced Model For Abstractive Dialogue Summarization
    Yi, Pengyao
    Liu, Ruifang
    [J]. 2022 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY, CYBERC, 2022, : 240 - 246
  • [4] Abstractive Summarization Model for Summarizing Scientific Article
    Ulker, Mehtap
    Ozer, A. Bedri
    [J]. IEEE ACCESS, 2024, 12 : 91252 - 91262
  • [5] Abstractive Summarization with the Aid of Extractive Summarization
    Chen, Yangbin
    Ma, Yun
    Mao, Xudong
    Li, Qing
    [J]. WEB AND BIG DATA (APWEB-WAIM 2018), PT I, 2018, 10987 : 3 - 15
  • [6] Controllable Abstractive Summarization
    Fan, Angela
    Grangier, David
    Auli, Michael
    [J]. NEURAL MACHINE TRANSLATION AND GENERATION, 2018, : 45 - 54
  • [7] Abstractive Text Summarization Using Enhanced Attention Model
    Roul, Rajendra Kumar
    Joshi, Pratik Madhav
    Sahoo, Jajati Keshari
    [J]. INTELLIGENT HUMAN COMPUTER INTERACTION (IHCI 2019), 2020, 11886 : 63 - 76
  • [8] Abstractive Meeting Summarization based on an Attentional Neural Model
    Dammak, Nouha
    BenAyed, Yassine
    [J]. THIRTEENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2020), 2021, 11605
  • [9] A Context based Coverage Model for Abstractive Document Summarization
    Kim, Heechan
    Lee, Soowon
    [J]. 2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC): ICT CONVERGENCE LEADING THE AUTONOMOUS FUTURE, 2019, : 1129 - 1132
  • [10] Abstractive Meeting Summarization via Hierarchical Adaptive Segmental Network Learning
    Zhao, Zhou
    Pan, Haojie
    Fan, Changjie
    Liu, Yan
    Li, Linlin
    Yang, Min
    Cai, Deng
    [J]. WEB CONFERENCE 2019: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2019), 2019, : 3455 - 3461