Transliteration Retrieval Model for Cross Lingual Information Retrieval

Cited by: 0
Authors
Jan, Ea-Ee [1]
Lin, Shih-Hsiang [1, 2]
Chen, Berlin [2]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
[2] Natl Taiwan Normal Univ, Comp Sci & Informat Engn, Taipei, Taiwan
Keywords
cross lingual information retrieval (CLIR); transliteration; retrieval model; statistical machine translation (SMT); NTCIR; alignment
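DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812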
Abstract
Transliteration quality from a source language to a target language lays the groundwork for proper-name Cross Lingual Information Retrieval (CLIR). Traditionally, this task is accomplished by two separate modules: transliteration and retrieval. Queries are first transliterated into the target language using one or multiple hypotheses, and retrieval is then carried out on the transliterated queries. Transliteration often yields 30-50% errors with the top-1 hypothesis, leading to significant performance degradation in CLIR. Therefore, we propose a unified transliteration retrieval model that incorporates the transliteration similarity measurement into the relevance scoring function. In addition, we present an efficient and robust method for measuring the similarity of a given proper name pair using Hidden Markov Model (HMM) based alignment and a Statistical Machine Translation (SMT) framework. Experimental data showed significant results with the proposed integrated method on the NTCIR-7 IR4QA task, which demonstrated greater flexibility and acceptance in transliteration.
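The contrast between the two-module pipeline and a unified scoring function can be illustrated with a toy sketch. It is not the authors' model: the similarity function here is a generic string-similarity ratio from Python's standard library, standing in for the HMM-alignment and SMT-based measurement described in the abstract. The function names, the romanized example strings, and the 0.6 threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher
from typing import Sequence


def translit_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1].

    Assumption: a plain character-level ratio replaces the HMM/SMT-based
    similarity measurement that the abstract describes.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pipeline_score(top1_hypothesis: str, doc_terms: Sequence[str]) -> float:
    """Two-module baseline: count exact matches of the top-1 hypothesis."""
    return float(sum(1 for term in doc_terms if term == top1_hypothesis))


def unified_score(query_name: str, doc_terms: Sequence[str],
                  threshold: float = 0.6) -> float:
    """Soft matching: each sufficiently similar document term contributes
    its similarity value to the relevance score (threshold is hypothetical)."""
    score = 0.0
    for term in doc_terms:
        sim = translit_similarity(query_name, term)
        if sim >= threshold:
            score += sim
    return score


# Hypothetical case: the top-1 transliteration is slightly wrong.
doc = ["gorbachev", "visited", "moscow"]
hard = pipeline_score("gorbachov", doc)   # exact matching finds nothing
soft = unified_score("gorbachov", doc)    # soft matching still credits the near match
print(hard, soft)
```

In this toy case the exact-match baseline gives the document no credit because the hypothesis differs from the document spelling by one letter, while the similarity-weighted score remains positive. That is the kind of tolerance to transliteration error the abstract attributes to the integrated approach.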
Pages: 183 / +
Page count: 3
Related papers
50 in total
  • [1] Semantic Cross-Lingual Information Retrieval
    Pourmahmoud, Solmaz
    Shamsfard, Mehrnoush
    [J]. 23RD INTERNATIONAL SYMPOSIUM ON COMPUTER AND INFORMATION SCIENCES, 2008, : 80 - +
  • [2] An ensemble of transliteration models for information retrieval
    Oh, JH
    Choi, KS
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2006, 42 (04) : 980 - 1002
  • [3] Cross-lingual information retrieval by feature vectors
    Lilleng, Jeanine
    Tomassen, Stein L.
    [J]. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PROCEEDINGS, 2007, 4592 : 229 - +
  • [4] Dictionary methods for cross-lingual information retrieval
    Ballesteros, L
    Croft, B
    [J]. DATABASE AND EXPERT SYSTEMS APPLICATIONS, 1996, 1134 : 791 - 801
  • [5] A system for supporting cross-lingual information retrieval
    Capstick, J
    Diagne, AK
    Erbach, G
    Uszkoreit, H
    Leisenberg, A
    Leisenberg, M
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2000, 36 (02) : 275 - 289
  • [6] Cross-lingual information retrieval model based on bilingual topic correlation
    Luo, Yuansheng
    Le, Zhongjian
    Wang, Mingwen
    [J]. Journal of Computational Information Systems, 2013, 9 (06) : 2433 - 2440
  • [7] Cross-lingual Language Model Pretraining for Retrieval
    Yu, Puxuan
    Fei, Hongliang
    Li, Ping
    [J]. PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE 2021 (WWW 2021), 2021, : 1029 - 1039
  • [8] CrossMath: Towards Cross-lingual Math Information Retrieval
    Gore, James
    Polletta, Joseph
    Mansouri, Behrooz
    [J]. PROCEEDINGS OF THE 2024 ACM SIGIR INTERNATIONAL CONFERENCE ON THE THEORY OF INFORMATION RETRIEVAL, ICTIR 2024, 2024, : 101 - 105
  • [9] A method of cross-lingual consumer health information retrieval
    Neveol, Aurelie
    Pereira, Suzanne
    Soualmia, Lina F.
    Thirion, Benoit
    Darmoni, Stefan J.
    [J]. UBIQUITY: TECHNOLOGIES FOR BETTER HEALTH IN AGING SOCIETIES, 2006, 124 : 601 - 608
  • [10] Cross-Lingual Information Retrieval System for Indian Languages
    Jagarlamudi, Jagadeesh
    Kumaran, A.
    [J]. ADVANCES IN MULTILINGUAL AND MULTIMODAL INFORMATION RETRIEVAL, 2008, 5152 : 80 - 87