Multi-view self-attention networks

Cited by: 9
Authors
Xu, Mingzhou [1 ]
Yang, Baosong [2 ]
Wong, Derek F. [1 ]
Chao, Lidia S. [1 ]
Affiliations
[1] Univ Macau, NLP2CT Lab, Macau, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
Keywords
Self-attention mechanism; Multi-head attention mechanism; Linguistics; Machine translation; Multi-pattern
DOI
10.1016/j.knosys.2022.108268
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Self-attention networks (SANs) have attracted considerable research attention for their outstanding performance in the machine translation community. Recent studies have shown that SANs can be further improved by exploiting different inductive biases, each of which guides SANs to learn a specific view of the input sentence, e.g., short-term dependencies, forward and backward views, and phrasal patterns. However, few studies have investigated how these inductive techniques complement one another in improving the capability of SANs, and this remains an open question. In this paper, we select five inductive biases that are simple and not over-parameterized and investigate their complementarity. We further propose multi-view self-attention networks, which jointly learn different linguistic aspects of the input sentence under a unified framework. Specifically, we propose and exploit a variety of inductive biases to regularize the conventional attention distribution. The resulting views are then aggregated by a hybrid attention mechanism that quantifies and leverages each view and its associated representation. Experiments on various translation tasks demonstrate that the different views progressively improve the performance of SANs, and that the proposed approach outperforms both the strong TRANSFORMER baseline and related models under the TRANSFORMER-BASE and TRANSFORMER-BIG settings. Extensive analyses on 10 linguistic probing tasks verify that the views indeed tend to extract distinct linguistic features, and that our method integrates them highly effectively. © 2022 Elsevier B.V. All rights reserved.
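The abstract only outlines the approach at a high level. As a rough, hypothetical sketch of the general idea (not the authors' implementation), the PyTorch code below regularizes a single shared attention distribution with three illustrative view masks, a short-term local window plus past-only and future-only directional views, and aggregates the per-view outputs with a learned softmax gate standing in for the paper's hybrid attention mechanism. All names here (`view_masks`, `MultiViewSelfAttention`, the gate) are assumptions for illustration; the paper uses five inductive biases on multi-head attention, including phrasal patterns, which are omitted for brevity.

```python
# Hypothetical sketch of multi-view self-attention (single head, for brevity).
# Several inductive-bias masks regularize one shared attention distribution,
# and a learned softmax gate mixes the per-view outputs.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def view_masks(seq_len: int, window: int = 3) -> torch.Tensor:
    """Three additive attention masks: local window, past-only, future-only."""
    idx = torch.arange(seq_len)
    dist = idx[None, :] - idx[:, None]          # signed key-minus-query distance
    zero = torch.zeros(seq_len, seq_len)
    ninf = torch.full((seq_len, seq_len), float("-inf"))
    local = torch.where(dist.abs() <= window, zero, ninf)   # short-term dependencies
    past = torch.where(dist <= 0, zero, ninf)               # backward-looking view
    future = torch.where(dist >= 0, zero, ninf)             # forward-looking view
    return torch.stack([local, past, future])               # (V, L, L)

class MultiViewSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_views: int = 3):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.gate = nn.Linear(d_model, n_views)  # per-token view-mixing logits
        self.scale = math.sqrt(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, L, d_model)
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / self.scale            # (B, L, L)
        masks = view_masks(x.size(1)).to(x.device)               # (V, L, L)
        # One attention distribution per view, regularized by its mask.
        attn = F.softmax(scores.unsqueeze(1) + masks, dim=-1)    # (B, V, L, L)
        views = attn @ v.unsqueeze(1)                            # (B, V, L, d)
        w = F.softmax(self.gate(x), dim=-1)                      # (B, L, V)
        # Weighted sum over views: each token decides how much each view matters.
        return (w.transpose(1, 2).unsqueeze(-1) * views).sum(dim=1)

out = MultiViewSelfAttention(16)(torch.randn(2, 7, 16))
print(out.shape)  # torch.Size([2, 7, 16])
```

Note that this sketch only demonstrates the mask-then-aggregate pattern suggested by the abstract; the paper's actual hybrid attention and its exact set of views may differ in detail.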
Pages: 10
Related papers
50 records in total
  • [1] Incomplete multi-view clustering via self-attention networks and feature reconstruction
    Zhang, Yong
    Jiang, Li
    Liu, Da
    Liu, Wenzhe
    [J]. APPLIED INTELLIGENCE, 2024, 54 (04) : 2998 - 3016
  • [2] Multi-View Self-Attention Based Transformer for Speaker Recognition
    Wang, Rui
    Ao, Junyi
    Zhou, Long
    Liu, Shujie
    Wei, Zhihua
    Ko, Tom
    Li, Qing
    Zhang, Yu
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6732 - 6736
  • [3] Multi-view 3D Reconstruction with Self-attention
    Qian, Qiuting
    [J]. 2021 14TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER THEORY AND ENGINEERING (ICACTE 2021), 2021, : 20 - 26
  • [4] Self-attention Multi-view Representation Learning with Diversity-promoting Complementarity
    Liu, Jian-wei
    Ding, Xi-hao
    Lu, Run-kun
    Luo, Xionglin
    [J]. PROCEEDINGS OF THE 32ND 2020 CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2020), 2020, : 3972 - 3978
  • [5] Joint long and short span self-attention network for multi-view classification
    Chen, Zhikui
    Lou, Kai
    Liu, Zhenjiao
    Li, Yue
    Luo, Yiming
    Zhao, Liang
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 235
  • [6] Multi-view self-attention for interpretable drug-target interaction prediction
    Agyemang, Brighter
    Wu, Wei-Ping
    Kpiebaareh, Michael Yelpengne
    Lei, Zhihua
    Nanor, Ebenezer
    Chen, Lei
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2020, 110
  • [7] Rumor Detection on Social Media: A Multi-view Model Using Self-attention Mechanism
    Geng, Yue
    Lin, Zheng
    Fu, Peng
    Wang, Weiping
    [J]. COMPUTATIONAL SCIENCE - ICCS 2019, PT I, 2019, 11536 : 339 - 352
  • [8] Multi-View 3D Reconstruction Method Based on Self-Attention Mechanism
    Zhu, Guangzhao
    Bo, Wei
    Yang, Afeng
    Xin, Xu
    [J]. LASER & OPTOELECTRONICS PROGRESS, 2023, 60 (16)
  • [9] Improved Multi-Head Self-Attention Classification Network for Multi-View Fetal Echocardiography Recognition
    Zhang, Yingying
    Zhu, Haogang
    Wang, Yan
    Wang, Jingyi
    He, Yihua
    [J]. 2023 45TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY, EMBC, 2023