Multi-Granularity Self-Attention for Neural Machine Translation

Cited by: 0
Authors
Hao, Jie [1 ,2 ]
Wang, Xing [2 ]
Shi, Shuming [2 ]
Zhang, Jinfeng [1 ]
Tu, Zhaopeng [2 ]
Affiliations
[1] Florida State Univ, Tallahassee, FL 32306 USA
[2] Tencent AI Lab, Shenzhen, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Current state-of-the-art neural machine translation (NMT) uses a deep multi-head self-attention network with no explicit phrase information. However, prior work on statistical machine translation has shown that extending the basic translation unit from words to phrases produces substantial improvements, suggesting that NMT performance could likewise benefit from explicit phrase modeling. In this work, we present multi-granularity self-attention (MG-SA): a neural network that combines multi-head self-attention with phrase modeling. Specifically, we train several attention heads to attend to phrases in either n-gram or syntactic formalism. Moreover, we exploit interactions among phrases to strengthen structure modeling, a commonly cited weakness of self-attention. Experimental results on the WMT14 English-to-German and NIST Chinese-to-English translation tasks show that the proposed approach consistently improves performance. Targeted linguistic analysis reveals that MG-SA indeed captures useful phrase information at various levels of granularity.
Pages: 887 - 897
Number of pages: 11
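For readers who want a concrete picture of the mechanism the abstract describes, below is a minimal sketch of multi-granularity self-attention in PyTorch. It is an illustration under stated assumptions, not the authors' implementation: phrases are approximated here as non-overlapping trigrams pooled by mean, and half of the attention heads attend over tokens while the other half attend over the pooled phrase representations. The class name, the even head split, the phrase size, and the pooling choice are all hypothetical simplifications of the paper's n-gram/syntactic phrase modeling.

```python
# Minimal MG-SA sketch (illustrative assumptions, not the paper's exact setup):
# half of the heads attend over tokens, half over mean-pooled n-gram phrases.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiGranularitySelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int, ngram: int = 3):
        super().__init__()
        assert d_model % n_heads == 0 and n_heads % 2 == 0
        self.d_head = d_model // n_heads
        self.n_heads = n_heads
        self.ngram = ngram
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def _split(self, x):
        # (batch, len, d_model) -> (batch, n_heads, len, d_head)
        b, t, _ = x.shape
        return x.view(b, t, self.n_heads, self.d_head).transpose(1, 2)

    def _pool_phrases(self, x):
        # Mean-pool non-overlapping n-grams: (b, t, d) -> (b, ceil(t/n), d).
        b, t, d = x.shape
        pad = (-t) % self.ngram  # right-pad so t divides evenly into n-grams
        if pad:
            x = F.pad(x, (0, 0, 0, pad))
        return x.view(b, -1, self.ngram, d).mean(dim=2)

    def forward(self, x):
        half = self.n_heads // 2
        q = self._split(self.q_proj(x))
        # Token-granularity keys/values for the first half of the heads.
        k_tok = self._split(self.k_proj(x))[:, :half]
        v_tok = self._split(self.v_proj(x))[:, :half]
        # Phrase-granularity keys/values for the remaining heads; the same
        # projections are reused for phrases, a simplifying design choice.
        phrases = self._pool_phrases(x)
        k_phr = self._split(self.k_proj(phrases))[:, half:]
        v_phr = self._split(self.v_proj(phrases))[:, half:]

        def attend(q, k, v):
            scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
            return F.softmax(scores, dim=-1) @ v

        out = torch.cat([attend(q[:, :half], k_tok, v_tok),
                         attend(q[:, half:], k_phr, v_phr)], dim=1)
        b, _, t, _ = out.shape
        return self.out_proj(out.transpose(1, 2).reshape(b, t, -1))


# Usage: 8 heads in total, 4 attending over tokens and 4 over trigram phrases.
x = torch.randn(2, 10, 512)
mgsa = MultiGranularitySelfAttention(d_model=512, n_heads=8)
print(mgsa(x).shape)  # torch.Size([2, 10, 512])
```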
Related Papers
50 records in total
  • [11] Multi-Granularity Feature Aggregation with Self-Attention and Spatial Reasoning for Fine-Grained Crop Disease Classification
    Zuo, Xin
    Chu, Jiao
    Shen, Jifeng
    Sun, Jun
    [J]. AGRICULTURE-BASEL, 2022, 12 (9)
  • [12] Enlivening Redundant Heads in Multi-head Self-attention for Machine Translation
    Zhang, Tianfu
    Huang, Heyan
    Feng, Chong
    Cao, Longbing
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3238 - 3248
  • [13] A neural machine translation method based on split graph convolutional self-attention encoding
    Wan, Fei
    Li, Ping
    [J]. PEERJ COMPUTER SCIENCE, 2024, 10
  • [14] RESA: Relation Enhanced Self-Attention for Low-Resource Neural Machine Translation
    Wu, Xing
    Shi, Shumin
    Huang, Heyan
    [J]. 2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021, : 159 - 164
  • [15] Multi-Granularity Attention Model for Group Recommendation
    Ji, Jianye
    Pei, Jiayan
    Lin, Shaochuan
    Zhou, Taotao
    He, Hengxu
    Jia, Jia
    Hu, Ning
    [J]. PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 3973 - 3977
  • [16] Self-aware Multi-inductor Net: Cross Domain Person Re-identification Based on Global Self-attention and Multi-granularity Supervision
    Zhou, YuXuan
    Zhang, XiaoHua
    Song, DeYang
    Li, LongFei
    Niu, DaoHong
    [J]. 2022 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, COMPUTER VISION AND MACHINE LEARNING (ICICML), 2022, : 276 - 284
  • [17] Enhancing Machine Translation with Dependency-Aware Self-Attention
    Bugliarello, Emanuele
    Okazaki, Naoaki
    [J]. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 1618 - 1627
  • [18] Multi-granularity bidirectional attention stream machine comprehension method for emotion cause extraction
    Diao, Yufeng
    Lin, Hongfei
    Yang, Liang
    Fan, Xiaochao
    Chu, Yonghe
    Wu, Di
    Xu, Kan
    Xu, Bo
    [J]. NEURAL COMPUTING & APPLICATIONS, 2020, 32 (12): 8401 - 8413
  • [20] Hierarchical Multi-Granularity Attention-Based Hybrid Neural Network for Text Classification
    Liu, Zhenyu
    Lu, Chaohong
    Huang, Haiwei
    Lyu, Shengfei
    Tao, Zhenchao
    [J]. IEEE ACCESS, 2020, 8: 149362 - 149371