Multilingual information retrieval using machine translation, relevance feedback and decompounding

Cited by: 31
Authors
Chen, A [1]
Gey, FC
Affiliations
[1] Univ Calif Berkeley, Sch Informat Management & Syst, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, UC Data Arch & Tech Assistance UC DATA, Berkeley, CA 94720 USA
Source
INFORMATION RETRIEVAL | 2004 / Vol. 7 / Issue 1-2
Keywords
multilingual information retrieval; cross-language information retrieval; relevance feedback; decompounding; results merging
DOI
10.1023/B:INRT.0000009444.89549.90
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Multilingual retrieval (querying of multiple document collections, each in a different language) can be achieved by combining several individual techniques which enhance retrieval: machine translation to cross the language barrier, relevance feedback to add words to the initial query, decompounding for languages with complex term structure, and data fusion to combine monolingual retrieval results from different languages. Using the CLEF 2001 and CLEF 2002 topics and document collections, this paper evaluates these techniques within the context of a monolingual document ranking formula based upon logistic regression. Each individual technique yields improved performance over runs which do not utilize that technique. Moreover, the techniques are complementary, in that combining the best techniques outperforms individual technique performance. An approximate but fast document translation using bilingual wordlists created from machine translation systems is presented and evaluated. The fast document translation is as effective as query translation in multilingual retrieval. Furthermore, when fast document translation is combined with query translation in multilingual retrieval, the performance is significantly better than that of query translation or fast document translation.
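As a rough illustration of two ideas named in the abstract, the sketch below shows word-by-word document translation from a bilingual wordlist and a simple merge of per-language ranked lists. It is not the authors' implementation: the function names, the toy wordlist, the rule of leaving unknown tokens untranslated, and the min-max score normalization used for merging are all assumptions made for this example, since the abstract does not specify those details.

```python
def translate_document(tokens, wordlist):
    """Approximate document translation by wordlist lookup.

    Assumption: tokens missing from the wordlist are kept unchanged.
    """
    return [wordlist.get(tok.lower(), tok) for tok in tokens]


def merge_ranked_lists(runs, k=10):
    """Merge monolingual ranked lists into one multilingual list.

    Each run is a list of (doc_id, score) pairs for one language.
    Assumption: scores are min-max normalized per run before merging,
    which is one common fusion choice, not necessarily the paper's.
    """
    merged = []
    for run in runs:
        if not run:
            continue
        scores = [score for _, score in run]
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0  # avoid division by zero for constant scores
        merged.extend((doc, (score - lo) / span) for doc, score in run)
    # Python's sort is stable, so ties keep their input order.
    merged.sort(key=lambda pair: pair[1], reverse=True)
    return merged[:k]


# Hypothetical toy data for illustration only.
wordlist = {"haus": "house", "und": "and"}
translated = translate_document(["Haus", "und", "Garten"], wordlist)

german_run = [("de1", 4.0), ("de2", 2.0)]
french_run = [("fr1", 0.9), ("fr2", 0.6), ("fr3", 0.3)]
top3 = merge_ranked_lists([german_run, french_run], k=3)
```

Normalizing per run keeps a collection whose raw scores happen to be on a larger scale from dominating the merged list; other fusion rules could be substituted without changing the overall structure.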
Pages: 149-182
Page count: 34
Related papers
50 in total
  • [31] Smoothing functions for automatic relevance feedback in information retrieval
    Amo, P
    Ferreras, FL
    Cruz, F
    Rosa, M
    11TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATION, PROCEEDINGS, 2000, : 115 - 119
  • [32] Multilingual Test Sets for Machine Translation of Search Queries for Cross-Lingual Information Retrieval in the Medical Domain
    Uresova, Zdenka
    Dusek, Ondrej
    Hajic, Jan
    Pecina, Pavel
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 3244 - 3247
  • [33] Relevance feedback learning for web image retrieval using soft support vector machine
    School of Information Science and Engineering Northeastern University, Shenyang
    110004, China
    Lect. Notes Comput. Sci., 2008, (201-209):
  • [34] Relevance Feedback Learning for Web Image Retrieval Using Soft Support Vector Machine
    Zhang, Yifei
    Wang, Daling
    Yu, Ge
    ADVANCED WEB AND NETWORK TECHNOLOGIES, AND APPLICATIONS, 2008, 4977 : 201 - 209
  • [35] Image Retrieval Using ESNs and Relevance Feedback
    Yang, Yuan-feng
    Wu, Jian
    Fang, Jing
    Cui, Zhi-ming
    2012 11TH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS TO BUSINESS, ENGINEERING & SCIENCE (DCABES), 2012, : 383 - 387
  • [36] Biased support vector machine for relevance feedback in image retrieval
    Hoi, CH
    Chan, CH
    Huang, KH
    Lyu, MR
    King, I
    2004 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2004, : 3189 - 3194
  • [37] Exeter at CLEF 2003: Experiments with machine translation for monolingual, bilingual and multilingual retrieval
    Lam-Adesina, AM
    Jones, GJF
    COMPARATIVE EVALUATION OF MULTILINGUAL INFORMATION ACCESS SYSTEMS, 2003, 3237 : 271 - 285
  • [38] Facilitating cross-language retrieval and machine translation by multilingual domain ontologies
    Knoth, Petr
    Collins, Trevor
    Sklavounou, Elsa
    Zdrahal, Zdenek
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : B51 - B55
  • [39] Multilingual Agreement for Multilingual Neural Machine Translation
    Yang, Jian
    Yin, Yuwei
    Ma, Shuming
    Huang, Haoyang
    Zhang, Dongdong
    Li, Zhoujun
    Wei, Furu
    ACL-IJCNLP 2021: THE 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 2, 2021, : 233 - 239
  • [40] Multilingual Semantic Relatedness using lightweight machine translation
    Barzegar, Siamak
    Davis, Brian
    Handschuh, Siegfried
    Freitas, Andre
    2018 IEEE 12TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2018, : 108 - 114