Improving the Extraction of Bilingual Terminology from Wikipedia

Cited by: 28
Authors
Erdmann, Maike [1]
Nakayama, Kotaro [2]
Hara, Takahiro [1]
Nishio, Shojiro [1]
Affiliations
[1] Osaka Univ, Suita, Osaka 565, Japan
[2] Univ Tokyo, Tokyo 1138654, Japan
Keywords
Algorithms; Experimentation; Bilingual dictionary; Wikipedia mining; link analysis;
DOI
10.1145/1596990.1596995
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Research on the automatic construction of bilingual dictionaries has achieved impressive results. Bilingual dictionaries are usually constructed from parallel corpora, but since these corpora are available only for selected text domains and language pairs, the potential of other resources is being explored as well. In this article, we want to further pursue the idea of using Wikipedia as a corpus for bilingual terminology extraction. We propose a method that extracts term-translation pairs from different types of Wikipedia link information. After that, an SVM classifier trained on the features of manually labeled training data determines the correctness of unseen term-translation pairs.
Pages: 17
Related papers
  • [21] Hypernym Extraction From Wikipedia and Wiktionary
    Sasmaz, Emre
    Ehsani, Razieh
    Yildiz, Olcay Taner
    2017 25TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2017
  • [22] Mining Tibetan-Chinese Bilingual Entities from Wikipedia
    Jiang, Tao
    Yu, Hongzhi
    He, Xiangzhen
    Meng, Xianghe
    2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017: 9-12
  • [23] PRESENCE AND CHARACTERISTICS OF NURSING TERMINOLOGY IN WIKIPEDIA
    Sanz-Lorente, Maria
    Guardiola-Wanden-Berghe, Rocio
    Wanden-Berghe, Carmina
    Sanz-Valero, Javier
    REVISTA ROL DE ENFERMERIA, 2013, 36 (10): 654-658
  • [24] Unsupervised bilingual terminology extraction algorithm for Chinese-English parallel patents
    Sun, Maosong
    Li, Li
    Liu, Zhiyuan
    Qinghua Daxue Xuebao/Journal of Tsinghua University, 2014, 54 (10): 1339-1343
  • [25] Entity Extraction from Wikipedia List Pages
    Heist, Nicolas
    Paulheim, Heiko
    SEMANTIC WEB (ESWC 2020), 2020, 12123: 327-342
  • [26] Automatic Extraction of Semantic Relations from Wikipedia
    Arnold, Patrick
    Rahm, Erhard
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2015, 24 (02)
  • [27] Towards Accurate Relation Extraction from Wikipedia
    Gu, Yulong
    Song, Jiaxing
    Liu, Weidong
    Yao, Yuan
    Zou, Lixin
    2016 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2016), 2016: 89-96
  • [28] Extraction and analysis of tripartite relationships from Wikipedia
    Nazir, Fawad
    Takeda, Hideaki
    2008 IEEE INTERNATIONAL SYMPOSIUM ON TECHNOLOGY AND SOCIETY, 2008: 224-235
  • [29] Semantic Sense Extraction From Wikipedia Pages
    Pirrone, Roberto
    Pipitone, Arianna
    Russo, Giuseppe
    3RD INTERNATIONAL CONFERENCE ON HUMAN SYSTEM INTERACTION, 2010: 543-547
  • [30] Bilingual Dictionary of Legal terminology
    Alcaraz-Varo, Enrique
    QUADERNS-REVISTA DE TRADUCCIO, 2006, 13: 217-219