Using Corpus-Based Approaches in a System for Multilingual Information Retrieval

Authors
Martin Braschler
Peter Schäuble
Affiliations
[1] Eurospider Information Technology AG
[2] Eurospider Information Technology AG
Source
Information Retrieval | 2000 / Vol. 3
Keywords
multilingual information retrieval; cross-language information retrieval; corpus-based approaches; document alignments
DOI
Not available
Abstract
We present a system for multilingual information retrieval that allows users to formulate queries in their preferred language and retrieve relevant information from a collection containing documents in multiple languages. The system is based on a process of document-level alignment, in which documents in different languages are paired according to their similarity. The resulting mapping allows us to produce a multilingual comparable corpus, which has several applications. First, it allows us to build a data structure for query translation in cross-language information retrieval (CLIR). Second, we perform pseudo relevance feedback on the alignments to improve our retrieval results. Finally, multiple retrieval runs can be merged into one unified result list. The resulting system is inexpensive, adaptable to domain-specific collections and new languages, and performed very well in the CLIR system comparison at the TREC-7 conference.
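The document-level alignment step described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that documents of different languages are paired by similarity; everything else here is an assumption made for illustration and not the authors' method: the choice of numbers and capitalized tokens as language-independent features, the cosine measure, the `threshold` value, and the hypothetical function names `indicators`, `cosine`, and `align`.

```python
import math
import re
from collections import Counter


def indicators(text):
    """Assumed language-independent features: numbers and capitalized tokens."""
    tokens = re.findall(r"\w+", text)
    return Counter(t for t in tokens if t.isdigit() or t[0].isupper())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(source_docs, target_docs, threshold=0.3):
    """Pair each source document with its most similar target document.

    Both arguments map document ids to raw text. Pairs whose best score
    falls below the (assumed) threshold are dropped.
    """
    target_vecs = {tid: indicators(text) for tid, text in target_docs.items()}
    pairs = {}
    for sid, text in source_docs.items():
        vec = indicators(text)
        best_id, best_score = None, 0.0
        for tid, tvec in target_vecs.items():
            score = cosine(vec, tvec)
            if score > best_score:
                best_id, best_score = tid, score
        if best_id is not None and best_score >= threshold:
            pairs[sid] = (best_id, best_score)
    return pairs


if __name__ == "__main__":
    german = {"de1": "Clinton reiste 1998 nach Paris."}
    english = {"en1": "Clinton visited Paris in 1998."}
    print(align(german, english))
```

The aligned pairs form the kind of mapping from which a comparable corpus could be assembled; a real system would need richer similarity evidence than this sketch uses.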
Pages: 273-284
Page count: 11