Unsupervised word sense disambiguation and rules extraction using non-aligned bilingual corpus

Cited by: 0
Authors
Oliveira, F [1]
Wong, F [1]
Li, YP [1]
Zheng, J [1]
Affiliation
[1] Univ Macau, Fac Sci & Technol, Macao, Peoples R China
Keywords
word sense disambiguation; natural language processing; machine translation
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a statistical Word Sense Disambiguation method with application in Portuguese-Chinese Machine Translation systems. Because Portuguese-Chinese resources such as digital corpora and annotated treebanks are scarce, the method relies on unsupervised learning and a non-aligned bilingual corpus. It first identifies the words related to each ambiguous word, based on the surrounding words and their relative distance. A mathematical model is then applied to identify the most suitable sense of an ambiguous word in terms of its related words. All the senses discovered are converted into a set of rules and stored in the Sense Knowledge Base for later use in the disambiguation and translation process. Preliminary experimental results show a 6% improvement over the baseline method in assigning the correct translation.
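The abstract does not give the mathematical model, so the following is only a minimal sketch of the general idea it describes: score context words by their distance from an ambiguous word, then pick the sense whose related words best match a new context. The function names, the 1/d distance weighting, the window size, and the toy Portuguese data are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def related_words(sentences, target, window=3):
    """Score words near `target` by inverse distance (assumed 1/d weighting)."""
    scores = defaultdict(float)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    # Closer words contribute more to the relatedness score.
                    scores[tokens[j]] += 1.0 / abs(j - i)
    return dict(scores)


def choose_sense(context, position, sense_profiles, window=3):
    """Return the sense whose related-word profile best matches the context.

    `sense_profiles` maps each candidate sense (here, a Chinese translation)
    to a dict of related words and weights. It is a hypothetical stand-in
    for the rules stored in a sense knowledge base.
    """
    best_sense, best_score = None, float("-inf")
    for sense, profile in sense_profiles.items():
        score = 0.0
        for j, tok in enumerate(context):
            d = abs(j - position)
            if 0 < d <= window:
                score += profile.get(tok, 0.0) / d
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Toy data (hypothetical): Portuguese "banco" can mean "bank" or "bench".
profiles = {
    "银行": {"dinheiro": 1.0, "conta": 0.5},  # bank
    "长凳": {"jardim": 1.0, "sentar": 0.5},   # bench
}
sentence = ["ele", "depositou", "dinheiro", "no", "banco"]
chosen = choose_sense(sentence, 4, profiles)
```

In this toy run, "dinheiro" lies two tokens from "banco", so only the "bank" profile receives a non-zero score. A real system would derive the profiles from corpus statistics rather than writing them by hand.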
Pages: 30-35
Page count: 6