Attention Head Masking for Inference Time Content Selection in Abstractive Summarization

Cited by: 0
Authors
Cao, Shuyang [1 ]
Wang, Lu [1 ]
Affiliations
[1] Univ Michigan, Comp Sci & Engn, Ann Arbor, MI 48109 USA
Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
How can we effectively inform content selection in Transformer-based abstractive summarization models? In this work, we present a simple-yet-effective attention head masking technique, which is applied to encoder-decoder attentions to pinpoint salient content at inference time. Using attention head masking, we are able to reveal the relation between encoder-decoder attentions and the content selection behaviors of summarization models. We then demonstrate its effectiveness on three document summarization datasets in both in-domain and cross-domain settings. Importantly, our models outperform prior state-of-the-art models on the CNN/DailyMail and New York Times datasets. Moreover, our inference-time masking technique is also data-efficient, requiring less than 20% of the training samples to outperform BART fine-tuned on the full CNN/DailyMail dataset.
Pages: 5008-5016
Number of pages: 9
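
The abstract above describes masking selected encoder-decoder (cross) attention heads at inference time so that those heads attend only to source tokens judged salient. The sketch below is a minimal, self-contained illustration of that idea, not the authors' released implementation; the function name masked_cross_attention, the salient_mask input, and the choice of which heads to mask are illustrative assumptions.

```python
# Minimal sketch of inference-time attention head masking on cross-attention.
# All names and shapes are illustrative, not taken from the paper's code.
import torch
import torch.nn.functional as F


def masked_cross_attention(query, key, value, salient_mask, masked_heads):
    """Cross-attention where selected heads may only attend to salient tokens.

    query:  (batch, heads, tgt_len, d_k)  decoder-side queries
    key:    (batch, heads, src_len, d_k)  encoder-side keys
    value:  (batch, heads, src_len, d_k)  encoder-side values
    salient_mask: (batch, src_len) bool, True for source tokens kept as salient
    masked_heads: iterable of head indices to which the mask is applied
    """
    d_k = query.size(-1)
    scores = torch.matmul(query, key.transpose(-2, -1)) / d_k ** 0.5

    # Additive-style mask: block non-salient source positions, but only for the
    # chosen heads; the remaining heads attend to the full source as usual.
    neg_inf = torch.finfo(scores.dtype).min
    block = (~salient_mask)[:, None, None, :]                  # (batch, 1, 1, src_len)
    head_select = torch.zeros(scores.size(1), dtype=torch.bool,
                              device=scores.device)
    head_select[list(masked_heads)] = True                     # (heads,)
    full_mask = block & head_select[None, :, None, None]       # (batch, heads, 1, src_len)
    scores = scores.masked_fill(full_mask, neg_inf)

    attn = F.softmax(scores, dim=-1)
    return torch.matmul(attn, value)


# Toy usage: 1 example, 4 heads, 3 decoder steps, 6 source tokens, d_k = 8.
q = torch.randn(1, 4, 3, 8)
k = torch.randn(1, 4, 6, 8)
v = torch.randn(1, 4, 6, 8)
salient = torch.tensor([[True, True, False, False, True, False]])
out = masked_cross_attention(q, k, v, salient, masked_heads=[0, 2])
print(out.shape)  # torch.Size([1, 4, 3, 8])
```

In the paper's setting, the salience mask would be derived from a content selection step over the source document; here it is hard-coded purely to show the masking mechanics.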
Related Papers (items [41]-[50] of 50)
  • [41] Multimodal Abstractive Summarization using bidirectional encoder representations from transformers with attention mechanism
    Argade, Dakshata
    Khairnar, Vaishali
    Vora, Deepali
    Patil, Shruti
    Kotecha, Ketan
    Alfarhood, Sultan
    [J]. HELIYON, 2024, 10 (04)
  • [42] Abstractive text summarization model combining a hierarchical attention mechanism and multiobjective reinforcement learning
    Sun, Yujia
    Platos, Jan
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 248
  • [43] A Convolution-Self Attention Abstractive Summarization Method Fusing Sequential Grammar Knowledge
    Luo, Senlin
    Wang, Ruiyi
    Wu, Qian
    Pan, Limin
    Wu, Zhouting
    [J]. Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 2021, 41 (01): : 93 - 101
  • [44] Boundary-Aware Abstractive Summarization with Entity-Augmented Attention for Enhancing Faithfulness
    Li, Jiuyi
    Liu, Junpeng
    Ma, Jianjun
    Yang, Wei
    Huang, Degen
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (04)
  • [45] Inference Time Style Control for Summarization
    Cao, Shuyang
    Wang, Lu
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5942 - 5953
  • [46] An abstractive text summarization technique using transformer model with self-attention mechanism
    Sandeep Kumar
    Arun Solanki
    [J]. Neural Computing and Applications, 2023, 35 : 18603 - 18622
  • [47] An abstractive text summarization technique using transformer model with self-attention mechanism
    Kumar, Sandeep
    Solanki, Arun
    [J]. NEURAL COMPUTING & APPLICATIONS, 2023, 35 (25): : 18603 - 18622
  • [48] A global and local information extraction model incorporating selection mechanism for abstractive text summarization
    Li, Yuanyuan
    Huang, Yuan
    Huang, Weijian
    Wang, Wei
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (2) : 4859 - 4886
  • [49] A global and local information extraction model incorporating selection mechanism for abstractive text summarization
    Yuanyuan Li
    Yuan Huang
    Weijian Huang
    Wei Wang
    [J]. Multimedia Tools and Applications, 2024, 83 : 4859 - 4886
  • [50] Order-Preserving Abstractive Summarization for Spoken Content Based on Connectionist Temporal Classification
    Lu, Bo-Ru
    Shyu, Frank
    Chen, Yun-Nung
    Lee, Hung-Yi
    Lee, Lin-Shan
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2899 - 2903