Using Corpus-Based Approaches in a System for Multilingual Information Retrieval

Authors
Martin Braschler
Peter Schäuble
Affiliations
[1] Eurospider Information Technology AG
[2] Eurospider Information Technology AG
Source
Information Retrieval | 2000 / Volume 3
Keywords
multilingual information retrieval; cross-language information retrieval; corpus-based approaches; document alignments
DOI
Not available
Abstract
We present a system for multilingual information retrieval that lets users formulate queries in their preferred language and retrieve relevant information from a collection containing documents in multiple languages. The system is based on document-level alignment, in which documents in different languages are paired according to their similarity. The resulting mapping yields a multilingual comparable corpus, which has several applications. It allows us to build a data structure for query translation in cross-language information retrieval (CLIR). We also perform pseudo relevance feedback on the alignments to improve retrieval results. Finally, multiple retrieval runs can be merged into one unified result list. The resulting system is inexpensive, adaptable to domain-specific collections and new languages, and performed very well in the TREC-7 CLIR system comparison.
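The document-level alignment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature choice (numbers and capitalized words as language-independent cues), the Jaccard similarity, the threshold value, and the names `indicators`, `jaccard`, and `align` are all assumptions made for illustration only.

```python
import re

def indicators(text):
    """Extract assumed language-independent cues: numbers and capitalized words."""
    return set(re.findall(r"\b(?:\d+|[A-Z][a-z]+)\b", text))

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def align(source_docs, target_docs, threshold=0.2):
    """Pair each source document with its most similar target document.

    Returns {source_id: (target_id, score)} for pairs at or above the
    (hypothetical) similarity threshold.
    """
    target_feats = {tid: indicators(t) for tid, t in target_docs.items()}
    pairs = {}
    for sid, text in source_docs.items():
        feats = indicators(text)
        best_id, best_score = None, 0.0
        for tid, tfeats in target_feats.items():
            score = jaccard(feats, tfeats)
            if score > best_score:
                best_id, best_score = tid, score
        if best_id is not None and best_score >= threshold:
            pairs[sid] = (best_id, best_score)
    return pairs

# Toy collections in two languages (invented example data).
german = {
    "de1": "Swissair meldet 1998 einen Rekord in Basel",
    "de2": "Das Wetter in Bern bleibt kalt",
}
english = {
    "en1": "Swissair reports a record year in 1998 in Basel",
    "en2": "Cold weather continues in Bern",
}

alignments = align(german, english)
for sid, (tid, score) in sorted(alignments.items()):
    print(sid, "->", tid, round(score, 2))
```

The aligned pairs form a small comparable corpus. In a setup like the one the abstract outlines, such pairs could then feed query translation or feedback across languages, though how the paper does this is not specified here.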
Pages: 273-284
Number of pages: 11