Attention Head Masking for Inference Time Content Selection in Abstractive Summarization

Cited by: 0
Authors
Cao, Shuyang [1 ]
Wang, Lu [1 ]
Affiliations
[1] Univ Michigan, Comp Sci & Engn, Ann Arbor, MI 48109 USA
Funding: National Science Foundation (US)
Keywords
DOI: Not available
CLC Classification: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
How can we effectively inform content selection in Transformer-based abstractive summarization models? In this work, we present a simple yet effective attention head masking technique, applied to encoder-decoder attentions to pinpoint salient content at inference time. Using attention head masking, we are able to reveal the relation between encoder-decoder attentions and the content selection behaviors of summarization models. We then demonstrate its effectiveness on three document summarization datasets in both in-domain and cross-domain settings. Importantly, our models outperform prior state-of-the-art models on the CNN/DailyMail and New York Times datasets. Moreover, our inference-time masking technique is also data-efficient, requiring less than 20% of the training samples to outperform BART fine-tuned on the full CNN/DailyMail dataset.
Pages: 5008-5016
Page count: 9
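
The abstract above describes masking encoder-decoder attention at inference time so that attention is steered toward salient source content. The following is a minimal, illustrative sketch of that general idea, not the authors' released implementation: the function name, tensor layout, choice of which heads to constrain, and the way salient source positions are obtained (normally from a separate content selector) are all assumptions.

```python
# Illustrative sketch (not the authors' code): restrict selected encoder-decoder
# attention heads so they can only attend to source tokens flagged as salient.
import torch
import torch.nn.functional as F


def masked_cross_attention(query, key, value, salient_positions, masked_heads):
    """Cross-attention where chosen heads may only attend to salient source tokens.

    query:  (batch, heads, tgt_len, head_dim)  decoder-side states
    key:    (batch, heads, src_len, head_dim)  encoder-side states
    value:  (batch, heads, src_len, head_dim)  encoder-side states
    salient_positions: (batch, src_len) bool, True for tokens kept by a content selector
    masked_heads: iterable of head indices to which the saliency mask is applied
    """
    d = query.size(-1)
    scores = torch.matmul(query, key.transpose(-2, -1)) / d ** 0.5  # (B, H, T, S)

    # Additive bias: -inf over non-salient source tokens, but only for the
    # heads we choose to constrain; unmasked heads attend to everything.
    bias = torch.zeros_like(scores)
    non_salient = ~salient_positions[:, None, :]                    # (B, 1, S)
    for h in masked_heads:
        bias[:, h] = bias[:, h].masked_fill(non_salient, float("-inf"))

    attn = F.softmax(scores + bias, dim=-1)
    return torch.matmul(attn, value)


if __name__ == "__main__":
    B, H, T, S, D = 1, 8, 4, 10, 64
    q = torch.randn(B, H, T, D)
    k = torch.randn(B, H, S, D)
    v = torch.randn(B, H, S, D)
    salient = torch.zeros(B, S, dtype=torch.bool)
    salient[:, :3] = True  # pretend the first 3 source tokens were selected as salient
    out = masked_cross_attention(q, k, v, salient, masked_heads=range(H))
    print(out.shape)       # torch.Size([1, 8, 4, 64])
```

In a full summarizer, a mask of this kind would be applied inside the decoder's cross-attention layers during beam search, with the salient positions supplied by whatever content-selection component the system uses.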
Related Papers (50 in total)
  • [1] Abstractive Text Summarization with Multi-Head Attention
    Li, Jinpeng
    Zhang, Chuang
    Chen, Xiaojun
    Cao, Yanan
    Liao, Pengcheng
    Zhang, Peng
    [J]. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [2] Abstractive Summarization by Neural Attention Model with Document Content Memory
    Choi, Yunseok
    Kim, Dahae
    Lee, Jee-Hyong
    [J]. PROCEEDINGS OF THE 2018 CONFERENCE ON RESEARCH IN ADAPTIVE AND CONVERGENT SYSTEMS (RACS 2018), 2018, : 11 - 16
  • [3] Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization
    Sotudeh, Sajad
    Goharian, Nazli
    Filice, Ross W.
    [J]. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 1899 - 1905
  • [4] A Few Good Sentences: Content Selection for Abstractive Text Summarization
    Srivastava, Vivek
    Bhat, Savita
    Pedanekar, Niranjan
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT IV, 2023, 14172 : 124 - 141
  • [5] A Cascade Approach to Neural Abstractive Summarization with Content Selection and Fusion
    Lebanoff, Logan
    Dernoncourt, Franck
    Kim, Doo Soon
    Chang, Walter
    Liu, Fei
    [J]. 1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020, : 529 - 535
  • [6] Attention Optimization for Abstractive Document Summarization
    Gui, Min
    Tian, Junfeng
    Wang, Rui
    Yang, Zhenglu
    [J]. 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 1222 - 1228
  • [7] Neural Abstractive Summarization with Structural Attention
    Chowdhury, Tanya
    Kumar, Sachin
    Chakraborty, Tanmoy
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3716 - 3722
  • [8] Attention based Abstractive Summarization of Malayalam Document
    Nambiar, Sindhya K.
    Peter, David S.
    Idicula, Sumam Mary
    [J]. AI IN COMPUTATIONAL LINGUISTICS, 2021, 189 : 250 - 257
  • [9] Contrastive Attention Mechanism for Abstractive Sentence Summarization
    Duan, Xiangyu
    Yu, Hongfei
    Yin, Mingming
    Zhang, Min
    Luo, Weihua
    Zhang, Yue
    [J]. 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 3044 - 3053
  • [10] Attention Temperature Matters in Abstractive Summarization Distillation
    Zhang, Shengqiang
    Zhang, Xingxing
    Bao, Hangbo
    Wei, Furu
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 127 - 141