Capsule Network Improved Multi-Head Attention for Word Sense Disambiguation

Cited: 0
Authors
Cheng, Jinfeng [1 ]
Tong, Weiqin [1 ,2 ]
Yan, Weian [1 ]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
[2] Shanghai Univ, Shanghai Inst Adv Commun & Data Sci, Shanghai 200444, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2021, Vol. 11, Issue 06
Keywords
word sense disambiguation; multi-head attention; capsule network; capsule routing;
DOI
10.3390/app11062488
CLC Number
O6 [Chemistry];
Subject Classification Code
0703;
Abstract
Word sense disambiguation (WSD) is one of the core problems in natural language processing (NLP): the task of mapping an ambiguous word to its correct meaning in a specific context. Recent studies have shown a lively interest in incorporating sense definitions (glosses) into neural networks, which has contributed greatly to improving the performance of WSD. However, disambiguating polysemous words with rare senses remains hard. In this paper, while taking glosses into consideration, we further improve the performance of the WSD system from the perspective of semantic representation. We encode the context and the sense glosses of the target polysemous word independently, using encoders with the same structure. To obtain a better representation in each encoder, we leverage a capsule network to capture the different kinds of important information contained in multi-head attention. We finally choose the gloss representation closest to the context representation of the target word as its correct sense. We conduct experiments on the English all-words WSD task. Experimental results show that our method achieves good performance and is especially effective at disambiguating words with rare senses.
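For illustration, the following is a minimal NumPy sketch of the pipeline the abstract describes: per-head attention outputs are fused by capsule-style dynamic routing, and the sense whose routed gloss representation is closest (by cosine similarity) to the routed context representation is selected. All function names, dimensions, the token pooling, and the cosine scoring here are illustrative assumptions made for this sketch; it runs with random untrained weights and is not the authors' implementation.

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, n_heads, rng):
    # Self-attention over x of shape (seq_len, d_model); returns the
    # per-head outputs, shape (n_heads, seq_len, d_head), WITHOUT the
    # usual concat-and-project step, so routing can weigh the heads.
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    heads = []
    for _ in range(n_heads):
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
                      for _ in range(3))
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        attn = softmax(q @ k.T / np.sqrt(d_head), axis=-1)
        heads.append(attn @ v)
    return np.stack(heads)

def squash(s, eps=1e-8):
    # Capsule squashing nonlinearity (Sabour et al., 2017): shrinks short
    # vectors toward zero and long vectors toward unit length.
    norm2 = np.sum(s * s, axis=-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def route_heads(head_vecs, n_iters=3):
    # Dynamic routing: treat each pooled head output as an input capsule
    # and fuse the heads into one output capsule by iterative agreement.
    n_heads, _ = head_vecs.shape
    b = np.zeros(n_heads)                      # routing logits
    for _ in range(n_iters):
        c = softmax(b)                         # coupling coefficients
        v = squash((c[:, None] * head_vecs).sum(axis=0))
        b = b + head_vecs @ v                  # heads agreeing with v gain weight
    return v

def encode(token_embs, n_heads, rng):
    # One encoder pass: multi-head attention, mean-pool tokens per head,
    # then capsule routing over the heads.
    heads = multi_head_attention(token_embs, n_heads, rng)  # (H, T, d_head)
    return route_heads(heads.mean(axis=1))                  # (d_head,)

def cosine(a, b, eps=1e-8):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)

def disambiguate(context_embs, gloss_embs_list, n_heads=4, seed=0):
    # Encode the context and every candidate gloss with structurally
    # identical encoders, then pick the gloss whose representation is
    # closest (cosine similarity) to the context representation.
    rng = np.random.default_rng(seed)
    ctx = encode(context_embs, n_heads, rng)
    scores = [cosine(ctx, encode(g, n_heads, rng)) for g in gloss_embs_list]
    return int(np.argmax(scores)), scores

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    context = rng.standard_normal((12, 64))                     # 12 tokens, d_model = 64
    glosses = [rng.standard_normal((8, 64)) for _ in range(3)]  # 3 candidate senses
    best, scores = disambiguate(context, glosses)
    print("predicted sense index:", best)

The point of the routing step in this sketch is that it replaces the fixed concatenate-and-project fusion of standard multi-head attention with data-dependent weights over heads, which is one plausible reading of how a capsule network could "capture different important information contained in multi-head attention."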
Pages: 14
Related Papers (50 in total)
  • [31] Li, Jian; Tu, Zhaopeng; Yang, Baosong; Lyu, Michael R.; Zhang, Tong. Multi-Head Attention with Disagreement Regularization. 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), 2018: 2897-2903.
  • [32] Lin, Hualing; Sun, Qiubi. Financial Volatility Forecasting: A Sparse Multi-Head Attention Neural Network. Information, 2021, 12 (10).
  • [33] Hu, Haibing; Han, Kai; Yin, Zhizhuo. Bilinear Multi-Head Attention Graph Neural Network for Traffic Prediction. ICAART: Proceedings of the 14th International Conference on Agents and Artificial Intelligence, Vol. 2, 2022: 33-43.
  • [34] Hasan, Md; Roy, Koushik; Rupty, Labiba; Hossain, Md. Sourave; Sengupta, Shirshajit; Taus, Shehzad Noor; Mohammed, Nabeel. MHASAN: Multi-Head Angular Self Attention Network for Spoof Detection. 2022 26th International Conference on Pattern Recognition (ICPR), 2022: 154-160.
  • [35] Zhang, Rui; Xue, Chengrong; Qi, Qingfu; Lin, Liyuan; Zhang, Jing; Zhang, Lun. Bimodal Fusion Network with Multi-Head Attention for Multimodal Sentiment Analysis. Applied Sciences-Basel, 2023, 13 (03).
  • [36] Liang, Miaomiao; He, Qinghua; Yu, Xiangchun; Wang, Huai; Meng, Zhe; Jiao, Licheng. A Dual Multi-Head Contextual Attention Network for Hyperspectral Image Classification. Remote Sensing, 2022, 14 (13).
  • [37] Yuan, Zhao; Jun, Sun. Siamese Network Cooperating with Multi-head Attention for Semantic Sentence Matching. 2020 19th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES 2020), 2020: 235-238.
  • [38] Zhang, Yingying; Gong, Yuxin; Zhu, Haogang; Bai, Xiao; Tang, Wenzhong. Multi-Head Enhanced Self-Attention Network for Novelty Detection. Pattern Recognition, 2020, 107.
  • [39] Zheng, Cong; Song, Yixuan. Personalized Multi-Head Self-Attention Network for News Recommendation. Neural Networks, 2025, 181.
  • [40] Wen, Zhengyao; Lin, Wenzhong; Wang, Tao; Xu, Ge. Distract Your Attention: Multi-Head Cross Attention Network for Facial Expression Recognition. Biomimetics, 2023, 8 (02).