Capsule Network Improved Multi-Head Attention for Word Sense Disambiguation

Cited: 0
Authors
Cheng, Jinfeng [1]
Tong, Weiqin [1,2]
Yan, Weian [1]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
[2] Shanghai Univ, Shanghai Inst Adv Commun & Data Sci, Shanghai 200444, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2021, Vol. 11, Iss. 06
Keywords
word sense disambiguation; multi-head attention; capsule network; capsule routing;
DOI
10.3390/app11062488
Chinese Library Classification
O6 [Chemistry];
Subject Classification Code
0703;
Abstract
Word sense disambiguation (WSD) is one of the core problems in natural language processing (NLP): mapping an ambiguous word to its correct meaning in a specific context. Recent studies have shown a lively interest in incorporating sense definitions (glosses) into neural networks, which has contributed greatly to improving WSD performance. However, disambiguating polysemes with rare senses remains hard. In this paper, while still taking glosses into consideration, we further improve the performance of the WSD system from the perspective of semantic representation. We encode the context and the sense glosses of the target polyseme independently, using encoders with the same structure. To obtain a better representation in each encoder, we leverage a capsule network to capture the different kinds of important information contained in multi-head attention. We finally choose the gloss representation closest to the context representation of the target word as its correct sense. We conduct experiments on the English all-words WSD task. Experimental results show that our method achieves good performance and is especially effective at disambiguating words with rare senses.
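The pipeline the abstract describes can be sketched in miniature: aggregate the per-head outputs of multi-head attention into a single capsule vector via routing-by-agreement, do the same for each sense gloss, and select the sense whose gloss vector is closest to the context vector. The following NumPy sketch is a simplified illustration under stated assumptions (standard dynamic routing in the style of Sabour et al., cosine distance, synthetic head outputs), not the paper's exact architecture; the function names `route_heads` and `disambiguate` are hypothetical.

```python
import numpy as np

def squash(s, eps=1e-9):
    # Capsule non-linearity: shrinks vector length into [0, 1), keeps direction.
    n2 = np.sum(s ** 2)
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)

def route_heads(head_outputs, iters=3):
    # Aggregate H multi-head attention outputs (shape H x d) into one capsule
    # vector via dynamic routing-by-agreement; a simplified sketch, not the
    # paper's exact routing variant.
    H = head_outputs.shape[0]
    b = np.zeros(H)                          # routing logits
    v = squash(head_outputs.mean(axis=0))    # initial capsule output
    for _ in range(iters):
        c = np.exp(b) / np.exp(b).sum()      # coupling coefficients (softmax)
        v = squash(c @ head_outputs)         # weighted sum -> squashed capsule
        b = b + head_outputs @ v             # agreement update
    return v

def disambiguate(context_heads, gloss_heads_per_sense):
    # Encode context and each gloss with the same routing procedure, then pick
    # the sense whose gloss vector is closest (cosine) to the context vector.
    ctx = route_heads(context_heads)
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
    scores = [cos(ctx, route_heads(g)) for g in gloss_heads_per_sense]
    return int(np.argmax(scores))
```

With synthetic head outputs, a gloss whose encoder output nearly matches the context's will be selected over an unrelated one, mirroring the nearest-gloss decision rule in the abstract.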
Pages: 14
Related Papers
50 records in total
  • [41] Word Sense Disambiguation Based on Convolution Neural Network
    Zhang C.-X.
    Zhao L.-Y.
    Gao X.-Y.
Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2019, 42 (03): 114 - 119
  • [42] Multi-Head Spatiotemporal Attention Graph Convolutional Network for Traffic Prediction
    Oluwasanmi, Ariyo
    Aftab, Muhammad Umar
    Qin, Zhiguang
    Sarfraz, Muhammad Shahzad
    Yu, Yang
    Rauf, Hafiz Tayyab
    SENSORS, 2023, 23 (08)
  • [43] Multi-Head Attention Neural Network for Smartphone Invariant Indoor Localization
    Tiku, Saideep
    Gufran, Danish
    Pasricha, Sudeep
    2022 IEEE 12TH INTERNATIONAL CONFERENCE ON INDOOR POSITIONING AND INDOOR NAVIGATION (IPIN 2022), 2022,
  • [44] Gaze Estimation Network Based on Multi-Head Attention, Fusion, and Interaction
    Li, Changli
    Li, Fangfang
    Zhang, Kao
    Chen, Nenglun
    Pan, Zhigeng
    SENSORS, 2025, 25 (06)
  • [45] Neural Network Models for Word Sense Disambiguation: An Overview
    Popov, Alexander
    CYBERNETICS AND INFORMATION TECHNOLOGIES, 2018, 18 (01) : 139 - 151
  • [46] Multi-sense embeddings through a word sense disambiguation process
    Ruas, Terry
    Grosky, William
    Aizawa, Akiko
    EXPERT SYSTEMS WITH APPLICATIONS, 2019, 136 : 288 - 303
  • [47] Lane Detection Method Based on Improved Multi-Head Self-Attention
    Ge, Zekun
    Tao, Fazhan
    Fu, Zhumu
    Song, Shuzhong
Computer Engineering and Applications, 60 (02): 264 - 271
  • [48] A Multi-scale Graph Network with Multi-head Attention for Histopathology Image Diagnosis
    Xing, Xiaodan
    Ma, Yixin
    Jin, Lei
    Sun, Tianyang
    Xue, Zhong
    Shi, Feng
    Wu, Jinsong
    Shen, Dinggang
    MICCAI WORKSHOP ON COMPUTATIONAL PATHOLOGY, VOL 156, 2021, 156 : 227 - 235
  • [49] A Hierarchical Structured Multi-Head Attention Network for Multi-Turn Response Generation
    Lin, Fei
    Zhang, Cong
    Liu, Shengqiang
    Ma, Hong
    IEEE ACCESS, 2020, 8 : 46802 - 46810
  • [50] Generating Patent Text Abstracts Based on Improved Multi-head Attention Mechanism
    Guoliang S.
    Shu Z.
    Yunfeng W.
    Chunjiang S.
    Liang L.
    Data Analysis and Knowledge Discovery, 2023, 7 (06) : 61 - 72