Using Corpus-Based Approaches in a System for Multilingual Information Retrieval

Authors
Martin Braschler
Peter Schäuble
Affiliations
[1] Eurospider Information Technology AG
[2] Eurospider Information Technology AG
Source
Information Retrieval | 2000 / Vol. 3
Keywords
multilingual information retrieval; cross-language information retrieval; corpus-based approaches; document alignments
DOI
Not available
Abstract
We present a system for multilingual information retrieval that allows users to formulate queries in their preferred language and retrieve relevant information from a collection containing documents in multiple languages. The system is based on document-level alignment, a process in which documents in different languages are paired according to their similarity. The resulting mapping allows us to produce a multilingual comparable corpus. Such a corpus has several applications. It allows us to build a data structure for query translation in cross-language information retrieval (CLIR). We also perform pseudo relevance feedback on the alignments to improve our retrieval results. Finally, multiple retrieval runs can be merged into one unified result list. The resulting system is inexpensive, adaptable to domain-specific collections and new languages, and performed very well in the CLIR system comparison at the TREC-7 conference.
Pages: 273-284
Page count: 11
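
The document-level alignment mentioned in the abstract, pairing documents across languages by similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes, purely for illustration, that similarity is the cosine between bag-of-words vectors and that some tokens (numbers, proper names) appear unchanged in both languages. The function names `cosine` and `align`, the `threshold` parameter, and the sample documents are all hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def align(source_docs: dict, target_docs: dict, threshold: float = 0.2) -> dict:
    """Pair each source document with its most similar target document.

    Assumption (illustrative only): documents are compared on raw
    lowercased tokens, so only tokens shared across languages
    (numbers, names) contribute to the similarity.
    Pairs scoring at or below `threshold` are dropped.
    """
    target_vecs = {tid: Counter(text.lower().split())
                   for tid, text in target_docs.items()}
    pairs = {}
    for sid, text in source_docs.items():
        vec = Counter(text.lower().split())
        best_id, best_score = None, threshold
        for tid, tvec in target_vecs.items():
            score = cosine(vec, tvec)
            if score > best_score:
                best_id, best_score = tid, score
        if best_id is not None:
            pairs[sid] = (best_id, best_score)
    return pairs


# Hypothetical sample data: two German and two English news snippets.
german = {
    "de1": "Zürich 1998 Swissair Flug 111 Absturz",
    "de2": "Bundesrat Bern Wahl 1999",
}
english = {
    "en1": "Swissair flight 111 crash 1998",
    "en2": "election in Bern 1999 federal council",
}

alignments = align(german, english)
for src, (tgt, score) in alignments.items():
    print(src, "->", tgt, round(score, 3))
```

The aligned pairs form a small comparable corpus. In a system of the kind the abstract describes, such pairs could then feed query translation or pseudo relevance feedback; those steps are not shown here.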