Context-aware positional representation for self-attention networks

Cited by: 4
Authors
Chen, Kehai [1 ]
Wang, Rui [1 ]
Utiyama, Masao [1 ]
Sumita, Eiichiro [1 ]
Affiliations
[1] Natl Inst Informat & Commun Technol, Kyoto, Japan
Keywords
Positional representation; Context information; Self-attention networks; Machine translation
DOI
10.1016/j.neucom.2021.04.055
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In self-attention networks (SANs), positional embeddings model order dependencies between words in the input sentence and are added to word embeddings to form an input representation, which enables SAN-based neural models to apply self-attention functions in parallel (multi-head) and to stack them (multi-layer) to learn a representation of the input sentence. However, this input representation encodes only static order dependencies based on the discrete position indexes of words; that is, it is independent of context information, which may weaken its ability to model the input sentence. To address this issue, we propose a novel positional representation method that models order dependencies based on the n-gram context or the sentence context of the input sentence, allowing SANs to learn a more effective sentence representation. To validate the effectiveness of the proposed method, we apply it to neural machine translation, a typical application of SAN-based neural models. Experimental results on two widely used translation tasks, WMT14 English-to-German and WMT17 Chinese-to-English, show that the proposed approach significantly improves translation performance over a strong Transformer baseline. (c) 2021 Elsevier B.V. All rights reserved.
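The abstract contrasts the standard Transformer input scheme (word embeddings plus index-based positional embeddings) with a positional signal derived from n-gram or sentence context. The following minimal PyTorch sketch illustrates both ideas under stated assumptions: the ContextAwarePositionalInput class, its use of a 1-D convolution over an n-gram window, and all parameter names are illustrative assumptions for exposition, not the authors' actual formulation, which is not given in this record.

import torch
import torch.nn as nn


class StaticPositionalInput(nn.Module):
    """Baseline: the positional signal depends only on the discrete index."""

    def __init__(self, vocab_size: int, d_model: int, max_len: int = 512):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        # position embedding broadcasts over the batch dimension
        return self.word_emb(token_ids) + self.pos_emb(positions)


class ContextAwarePositionalInput(nn.Module):
    """Hypothetical variant: the positional signal is computed from an
    n-gram window of word embeddings instead of the index alone."""

    def __init__(self, vocab_size: int, d_model: int, ngram: int = 3):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_model)
        # depthwise 1-D convolution over the n-gram neighbourhood yields a
        # context-dependent position representation for each token
        self.context_pos = nn.Conv1d(d_model, d_model, kernel_size=ngram,
                                     padding=ngram // 2, groups=d_model)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        words = self.word_emb(token_ids)               # (batch, seq, d_model)
        pos = self.context_pos(words.transpose(1, 2))  # conv over sequence dim
        return words + pos.transpose(1, 2)


if __name__ == "__main__":
    ids = torch.randint(0, 1000, (2, 10))
    print(StaticPositionalInput(1000, 64)(ids).shape)         # (2, 10, 64)
    print(ContextAwarePositionalInput(1000, 64)(ids).shape)   # (2, 10, 64)

Either module produces the input representation consumed by the stacked multi-head self-attention layers; the context-aware variant differs only in how the positional term is computed.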
Pages: 46-56
Number of pages: 11
Related papers
50 records in total
  • [31] Knowledge-Aware Self-Attention Networks for Document Grounded Dialogue Generation
    Tang, Xiangru
    Hu, Po
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT II, 2019, 11776 : 400 - 411
  • [32] Self-attention with Functional Time Representation Learning
    Xu, Da
    Ruan, Chuanwei
    Kumar, Sushant
    Korpeoglu, Evren
    Achan, Kannan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [33] Enhanced Semantic Representation Learning for Sarcasm Detection by Integrating Context-Aware Attention and Fusion Network
    Hao, Shufeng
    Yao, Jikun
    Shi, Chongyang
    Zhou, Yu
    Xu, Shuang
    Li, Dengao
    Cheng, Yinghan
    ENTROPY, 2023, 25 (06)
  • [34] Self-Attention Generative Adversarial Networks
    Zhang, Han
    Goodfellow, Ian
    Metaxas, Dimitris
    Odena, Augustus
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [35] Global Context-Aware Attention LSTM Networks for 3D Action Recognition
    Liu, Jun
    Wang, Gang
    Hu, Ping
    Duan, Ling-Yu
    Kot, Alex C.
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 3671 - 3680
  • [36] Self-Attention Networks for Code Search
    Fang, Sen
    Tan, You-Shuai
    Zhang, Tao
    Liu, Yepang
    INFORMATION AND SOFTWARE TECHNOLOGY, 2021, 134
  • [37] Cascaded Semantic and Positional Self-Attention Network for Document Classification
    Jiang, Juyong
    Zhang, Jie
    Zhang, Kai
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 669 - 677
  • [38] Modeling Localness for Self-Attention Networks
    Yang, Baosong
    Tu, Zhaopeng
    Wong, Derek F.
    Meng, Fandong
    Chao, Lidia S.
    Zhang, Tong
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 4449 - 4458
  • [39] A Context-aware Attention Network for Interactive Question Answering
    Li, Huayu
    Min, Martin Renqiang
    Ge, Yong
    Kadav, Asim
    KDD'17: PROCEEDINGS OF THE 23RD ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2017, : 927 - 935
  • [40] Hierarchical Attention Network for Context-Aware Query Suggestion
    Li, Xiangsheng
    Liu, Yiqun
    Li, Xin
    Luo, Cheng
    Nie, Jian-Yun
    Zhang, Min
    Ma, Shaoping
    INFORMATION RETRIEVAL TECHNOLOGY (AIRS 2018), 2018, 11292 : 173 - 186