Context-aware positional representation for self-attention networks

Cited by: 4
Authors
Chen, Kehai [1 ]
Wang, Rui [1 ]
Utiyama, Masao [1 ]
Sumita, Eiichiro [1 ]
Affiliations
[1] Natl Inst Informat & Commun Technol, Kyoto, Japan
Keywords
Positional representation; Context information; Self-attention networks; Machine translation;
DOI
10.1016/j.neucom.2021.04.055
CLC classification number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In self-attention networks (SANs), positional embeddings model order dependencies between words in the input sentence and are added to word embeddings to form an input representation, which enables SAN-based neural models to apply self-attentive functions in parallel (multi-head) and in stacked layers (multi-layer) to learn a representation of the input sentence. However, this input representation encodes only static order dependencies based on the discrete position indexes of words; that is, it is independent of context information, which may limit its ability to model the input sentence. To address this issue, we propose a novel positional representation method that models order dependencies based on the n-gram context or sentence context of the input sentence, allowing SANs to learn a more effective sentence representation. To validate its effectiveness, the proposed method is applied to a neural machine translation model, a typical SAN-based neural model. Experimental results on two widely used translation tasks, WMT14 English-to-German and WMT17 Chinese-to-English, show that the proposed approach significantly improves translation performance over the strong Transformer baseline. (c) 2021 Elsevier B.V. All rights reserved.
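The contrast the abstract draws can be sketched in a few lines: the standard Transformer adds index-only sinusoidal positional embeddings to word embeddings, while a context-aware positional representation derives each position's signal from the surrounding words. This is a minimal illustrative sketch, not the paper's actual method: the `ngram_context_positions` function and its window-averaging are hypothetical stand-ins for the learned n-gram-context mapping described in the abstract.

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Standard index-based positional embeddings (static: they
    depend only on the discrete position index, not on the words)."""
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(d_model)[None, :]            # (1, d_model)
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    # even dimensions get sin, odd dimensions get cos
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def ngram_context_positions(word_emb, n=3):
    """Hypothetical sketch of a context-aware positional signal:
    each position is summarized from a window of n neighboring word
    embeddings (an n-gram context), so the positional representation
    changes with the actual words. The paper learns this mapping;
    here we simply average the window."""
    seq_len, _ = word_emb.shape
    half = n // 2
    padded = np.pad(word_emb, ((half, half), (0, 0)))
    return np.stack([padded[t:t + n].mean(axis=0) for t in range(seq_len)])

seq_len, d_model = 6, 8
word_emb = np.random.default_rng(0).normal(size=(seq_len, d_model))

# static input representation: word embeddings + index-only positions
static_input = word_emb + sinusoidal_positions(seq_len, d_model)
# context-aware input representation: positions depend on surrounding words
context_input = word_emb + ngram_context_positions(word_emb, n=3)
```

Both variants produce a `(seq_len, d_model)` input representation that the stacked multi-head self-attention layers then consume; only the source of the positional signal differs.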
Pages: 46 - 56
Page count: 11
Related papers
50 records total
  • [41] Context-aware pyramid attention network for crowd counting
    Lingyu Gu
    Chen Pang
    Yanjun Zheng
    Chen Lyu
    Lei Lyu
    Applied Intelligence, 2022, 52 : 6164 - 6180
  • [42] Context-Aware Attention LSTM Network for Flood Prediction
    Wu, Yirui
    Liu, Zhaoyang
    Xu, Weigang
    Feng, Jun
    Palaiahnakote, Shivakumara
    Lu, Tong
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 1301 - 1306
  • [43] Context-aware pyramid attention network for crowd counting
    Gu, Lingyu
    Pang, Chen
    Zheng, Yanjun
    Lyu, Chen
    Lyu, Lei
    APPLIED INTELLIGENCE, 2022, 52 (06) : 6164 - 6180
  • [44] CONTEXT-AWARE ATTENTION MECHANISM FOR SPEECH EMOTION RECOGNITION
    Ramet, Gaetan
    Garner, Philip N.
    Baeriswyl, Michael
    Lazaridis, Alexandros
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 126 - 131
  • [45] Integrating Representation and Interaction for Context-Aware Document Ranking
    Chen, Haonan
    Dou, Zhicheng
    Zhu, Qiannan
    Zuo, Xiaochen
    Wen, Ji-Rong
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, 41 (01)
  • [46] Temporal Context-Aware Representation Learning for Question Routing
    Zhang, Xuchao
    Cheng, Wei
    Zong, Bo
    Chen, Yuncong
    Xu, Jianwu
    Li, Ding
    Chen, Haifeng
    PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM '20), 2020, : 753 - 761
  • [47] A Logical Framework for the Representation and Verification of Context-aware Agents
    Rakib, Abdur
    Ul Haque, Hafiz Mahfooz
    MOBILE NETWORKS & APPLICATIONS, 2014, 19 (05): : 585 - 597
  • [48] Research on temporal representation and reasoning for context-aware computing
    Liu, Dong
    Meng, Xiangwu
    Chen, Junliang
    Gaojishu Tongxin/Chinese High Technology Letters, 2009, 19 (04): : 342 - 347
  • [49] Graph Attention Network for Context-Aware Visual Tracking
    Shao, Yanyan
    Guo, Dongyan
    Cui, Ying
    Wang, Zhenhua
    Zhang, Liyan
    Zhang, Jianhua
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [50] A Discriminative Convolutional Neural Network with Context-aware Attention
    Zhou, Yuxiang
    Liao, Lejian
    Gao, Yang
    Huang, Heyan
    Wei, Xiaochi
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2020, 11 (05)