Creating a Wikipedia-based Persian-English word association dictionary

Cited by: 4
Authors
Rahimi Z. [1]
Shakery A. [1]
Affiliation
[1] School of Electrical and Computer Engineering, University of Tehran, Tehran
Keywords
Association dictionary; Cross language information retrieval; Wikipedia; Wikipedia Mining
DOI
10.1109/ISTEL.2010.5734088
Abstract
One of the most important issues in cross-language information retrieval is how to cross the language barrier between the query and the documents. Different translation resources have been studied for this purpose. In this research, we study the use of Wikipedia for query translation by constructing a Wikipedia-based bilingual association dictionary. We use the inter-language links between English and Persian Wikipedia to align related titles, and then mine word-by-word associations between the two languages from the extracted alignments. We use the mined word association dictionary to translate queries in Persian-English cross-language information retrieval. Our experimental results on the Hamshahri corpus show that the proposed method is effective in extracting word associations and that Persian Wikipedia is a promising translation resource. Using the association dictionary, we can improve on the pure dictionary-based method, where the only translation resource is a bilingual dictionary, by 33.6%, and improve its recall by 26.2%. © 2010 IEEE.
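The pipeline described in the abstract (align titles through inter-language links, then mine word-level associations and use them to translate queries) can be illustrated with a minimal sketch. The abstract does not state which association measure the authors use, so the Dice coefficient over title co-occurrences below is an assumption, not the paper's method. The function names `mine_associations` and `translate_query` and the three toy title pairs are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict


def mine_associations(aligned_titles):
    """Score Persian-English word pairs from aligned article titles.

    aligned_titles: iterable of (persian_title, english_title) pairs, e.g.
    obtained from Wikipedia inter-language links.
    The Dice coefficient used here is an assumed scoring function.
    """
    fa_count = Counter()    # number of title pairs containing each Persian word
    en_count = Counter()    # number of title pairs containing each English word
    pair_count = Counter()  # number of title pairs containing both words

    for fa_title, en_title in aligned_titles:
        fa_words = set(fa_title.lower().split())
        en_words = set(en_title.lower().split())
        fa_count.update(fa_words)
        en_count.update(en_words)
        for f in fa_words:
            for e in en_words:
                pair_count[(f, e)] += 1

    assoc = defaultdict(dict)
    for (f, e), c in pair_count.items():
        # Dice coefficient: 2 * |both| / (|f| + |e|)
        assoc[f][e] = 2.0 * c / (fa_count[f] + en_count[e])
    return assoc


def translate_query(query, assoc, top_k=1):
    """Replace each query word with its top_k highest-scoring associations."""
    translated = []
    for word in query.lower().split():
        candidates = sorted(
            assoc.get(word, {}).items(),
            key=lambda kv: (-kv[1], kv[0]),  # best score first, ties by word
        )
        translated.extend(e for e, _ in candidates[:top_k])
    return translated


# Toy aligned titles (illustrative only, not data from the paper).
toy_pairs = [
    ("تهران", "Tehran"),
    ("دانشگاه تهران", "University of Tehran"),
    ("دانشگاه", "University"),
]
toy_assoc = mine_associations(toy_pairs)
toy_translation = translate_query("دانشگاه تهران", toy_assoc)
```

With these three toy pairs, each Persian word co-occurs most consistently with its English counterpart, so the query words map to `university` and `tehran`. Real title alignments are far noisier, and a practical system would also need tokenization, stop-word handling, and score thresholds suited to Persian text.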
Pages: 562-567
Page count: 5