Abstractive Summarization Model with Adaptive Sparsemax

Cited: 0
Authors
Guo, Shiqi [1 ]
Si, Yumeng [1 ]
Zhao, Jing [2 ,3 ]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200241, Peoples R China
[2] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200062, Peoples R China
[3] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai 200241, Peoples R China
Keywords
Abstractive summarization; Seq2Seq; Adaptive sparsemax;
DOI
10.1007/978-3-031-17120-8_62
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Abstractive summarization models mostly rely on sequence-to-sequence architectures, in which the softmax function is widely used to map the model output onto the probability simplex. However, softmax's output distribution often exhibits a long-tail effect, especially when the vocabulary is large: many irrelevant tokens receive non-negligible probability mass, which reduces training efficiency and effectiveness. More recently, some work has begun to design mapping functions that produce sparse output probabilities so that these irrelevant tokens can be ignored. In this paper, we propose Adaptive Sparsemax, which adaptively controls the sparsity of the model's output. Our method combines sparsemax with a temperature mechanism, where the temperature value is learned by the neural network. One advantage of our method is that it requires no hyperparameters. Experimental results on the CNN-Daily Mail and LCSTS datasets show that our method outperforms baseline models on the abstractive summarization task.
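To make the idea concrete, below is a minimal NumPy sketch of plain sparsemax (the Euclidean projection of the logits onto the probability simplex) with a fixed scalar temperature. This is an illustrative assumption, not the paper's method: in Adaptive Sparsemax the temperature is predicted by the network itself, whereas here it is passed in by hand. Lower temperatures sharpen the logits and drive more entries exactly to zero; higher temperatures yield denser outputs.

```python
import numpy as np

def sparsemax(z, temperature=1.0):
    """Project temperature-scaled logits onto the simplex (sparsemax).

    Unlike softmax, the result can assign exactly zero probability
    to low-scoring tokens. `temperature` here is a fixed scalar for
    illustration; the paper learns it adaptively.
    """
    z = np.asarray(z, dtype=float) / temperature
    # Sort logits in descending order and find the support size k:
    # the largest k with 1 + k * z_(k) > sum of the top-k logits.
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum
    k_z = k[support][-1]
    # Threshold tau so the kept entries sum to one.
    tau = (cumsum[support][-1] - 1.0) / k_z
    return np.maximum(z - tau, 0.0)

p = sparsemax([2.0, 0.0])                   # low temperature keeps one token
q = sparsemax([2.0, 0.0], temperature=4.0)  # higher temperature keeps both
```

With a logit gap larger than 1, sparsemax puts all mass on the top token (`p` is `[1.0, 0.0]`), while raising the temperature spreads mass across both entries; in both cases the output sums to one, illustrating how a temperature knob trades off sparsity against coverage.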
Pages: 810-821
Number of pages: 12
Related Papers
50 records in total
  • [1] A Combined Extractive With Abstractive Model for Summarization
    Liu, Wenfeng
    Gao, Yaling
    Li, Jinming
    Yang, Yuzhen
    [J]. IEEE ACCESS, 2021, 9 : 43970 - 43980
  • [2] Frustratingly Easy Model Ensemble for Abstractive Summarization
    Kobayashi, Hayato
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 4165 - 4176
  • [3] A Relation Enhanced Model For Abstractive Dialogue Summarization
    Yi, Pengyao
    Liu, Ruifang
    [J]. 2022 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY, CYBERC, 2022, : 240 - 246
  • [4] Abstractive Summarization Model for Summarizing Scientific Article
    Ulker, Mehtap
    Ozer, A. Bedri
    [J]. IEEE ACCESS, 2024, 12 : 91252 - 91262
  • [5] Abstractive Summarization with the Aid of Extractive Summarization
    Chen, Yangbin
    Ma, Yun
    Mao, Xudong
    Li, Qing
    [J]. WEB AND BIG DATA (APWEB-WAIM 2018), PT I, 2018, 10987 : 3 - 15
  • [6] Controllable Abstractive Summarization
    Fan, Angela
    Grangier, David
    Auli, Michael
    [J]. NEURAL MACHINE TRANSLATION AND GENERATION, 2018, : 45 - 54
  • [7] Abstractive Text Summarization Using Enhanced Attention Model
    Roul, Rajendra Kumar
    Joshi, Pratik Madhav
    Sahoo, Jajati Keshari
    [J]. INTELLIGENT HUMAN COMPUTER INTERACTION (IHCI 2019), 2020, 11886 : 63 - 76
  • [8] Abstractive Meeting Summarization based on an Attentional Neural Model
    Dammak, Nouha
    BenAyed, Yassine
    [J]. THIRTEENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2020), 2021, 11605
  • [9] A Context based Coverage Model for Abstractive Document Summarization
    Kim, Heechan
    Lee, Soowon
    [J]. 2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC): ICT CONVERGENCE LEADING THE AUTONOMOUS FUTURE, 2019, : 1129 - 1132
  • [10] Abstractive Meeting Summarization via Hierarchical Adaptive Segmental Network Learning
    Zhao, Zhou
    Pan, Haojie
    Fan, Changjie
    Liu, Yan
    Li, Linlin
    Yang, Min
    Cai, Deng
    [J]. WEB CONFERENCE 2019: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2019), 2019, : 3455 - 3461