Unsupervised word sense disambiguation and rules extraction using non-aligned bilingual corpus

Cited by: 0
Authors
Oliveira, F [1 ]
Wong, F [1 ]
Li, YP [1 ]
Zheng, J [1 ]
Affiliation
[1] Univ Macau, Fac Sci & Technol, Macao, Peoples R China
Keywords
word sense disambiguation; natural language processing; machine translation
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a statistical word sense disambiguation (WSD) method with application to Portuguese-Chinese machine translation systems. Because Portuguese-Chinese resources such as digital corpora and annotated treebanks are scarce, the method uses unsupervised learning on a non-aligned bilingual corpus. It first identifies the words related to each ambiguous word, based on the surrounding words and their relative distance. A mathematical model then selects the most suitable sense of an ambiguous word in terms of its related words. All discovered senses are converted into a set of rules and stored in the Sense Knowledge Base for later use in disambiguation and translation. Preliminary experimental results show a 6% improvement over the baseline method in assigning the correct translation.
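The abstract does not give the mathematical model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes inverse-distance weighting for related words, a hypothetical bilingual lexicon (`lexicon`), and hypothetical target-language co-occurrence counts (`target_cooc`); all function names and the toy data are invented for illustration.

```python
from collections import Counter


def related_words(sentences, target, window=3):
    """Score context words of `target` by distance-weighted co-occurrence.

    Assumption: closer words count more (weight = 1 / distance).
    """
    scores = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    scores[tokens[j]] += 1.0 / abs(j - i)
    return scores


def choose_sense(related, candidates, lexicon, target_cooc):
    """Pick the candidate translation best supported by the related words.

    `lexicon` maps a source word to its possible translations, and
    `target_cooc` maps (sense, translation) pairs to co-occurrence counts
    taken from the target-language side of a non-aligned corpus.
    """
    def support(sense):
        return sum(
            weight * target_cooc.get((sense, trans), 0)
            for word, weight in related.items()
            for trans in lexicon.get(word, [])
        )
    return max(candidates, key=support)


def extract_rules(target, related, sense, top_n=2):
    """Store the decision as (ambiguous word, related word) -> sense rules."""
    return {(target, word): sense for word, _ in related.most_common(top_n)}


# Toy data, invented for illustration only.
sentences = [["o", "banco", "emprestou", "dinheiro"]]
lexicon = {"dinheiro": ["钱"], "sentar": ["坐"]}        # money, to sit
target_cooc = {("银行", "钱"): 5, ("长凳", "坐"): 4}     # bank/money, bench/sit

related = related_words(sentences, "banco")
sense = choose_sense(related, ["银行", "长凳"], lexicon, target_cooc)
rules = extract_rules("banco", related, sense)
```

In this toy case the Portuguese word "banco" is mapped to the "bank" sense because "dinheiro" (money) appears in its context window; a realistic system would need real corpus counts and a principled scoring model in place of the assumed weights.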
Pages: 30-35
Page count: 6