A Learning to rank framework based on cross-lingual loss function for cross-lingual information retrieval

Cited by: 4
Authors
Ghanbari, Elham [1, 2]
Shakery, Azadeh [1, 3]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Elect & Comp Engn, Tehran, Iran
[2] Islamic Azad Univ, Dept Comp Engn, Yadegar E Imam Khomeini RAH Shahre Rey Branch, Tehran, Iran
[3] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran, Iran
Keywords
Learning to rank (LTR); Cross-lingual information retrieval (CLIR); Cross-lingual features
DOI
10.1007/s10489-021-02592-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Learning to Rank (LTR) techniques use machine learning to rank documents. In this paper, we propose a new LTR-based framework for cross-lingual information retrieval (CLIR). The core idea is to use the knowledge in training queries in the target language, in addition to the training queries in the source language, both to extract features and to construct the ranking model, rather than relying only on the training queries in the source language. The framework has two main components. The first component extracts monolingual and cross-lingual features from the queries and the documents. To extract the cross-lingual features, we introduce a general approach based on translation probabilities: translation knowledge, built by combining a probabilistic dictionary extracted from translation resources with the translation knowledge available in the target-language queries, is used to bridge the gap between the documents and the queries. The second component trains a ranking model that optimizes the proposed loss function, given an input LTR algorithm and the extracted features. The new loss function can be used with any listwise LTR algorithm to construct a ranking model for CLIR. To this end, the loss function of the LTR algorithm is calculated on both the training data in the target language and the training data in the source language. The new loss function is a linear interpolation of the harmonic mean of these two loss functions (monolingual and cross-lingual) and their ratio. The output of the framework is a cross-lingual ranking model trained to minimize the proposed loss function. Experimental results show that the proposed framework outperforms the baseline information retrieval methods and other LTR ranking models in terms of Mean Average Precision (MAP). The findings also indicate that the use of cross-lingual features considerably increases the effectiveness of the framework in terms of MAP and Normalized Discounted Cumulative Gain (NDCG).
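The combined loss described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not state the interpolation weight, the orientation of the ratio, or how a zero loss is handled, so the names `combined_loss`, `alpha`, and `eps`, and the choice of cross-lingual over monolingual for the ratio, are all assumptions made for illustration. The two input losses are assumed to be non-negative scalars already produced by some listwise LTR algorithm.

```python
def harmonic_mean(a: float, b: float, eps: float = 1e-12) -> float:
    """Harmonic mean of two non-negative losses; eps guards against 0/0."""
    return 2.0 * a * b / (a + b + eps)


def combined_loss(loss_mono: float, loss_cross: float,
                  alpha: float = 0.5, eps: float = 1e-12) -> float:
    """Linear interpolation of the harmonic mean and the ratio of two losses.

    loss_mono:  listwise loss on the monolingual training data
    loss_cross: listwise loss on the cross-lingual training data
    alpha:      interpolation weight in [0, 1] (hypothetical parameter name)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    hm = harmonic_mean(loss_mono, loss_cross, eps)
    # ASSUMPTION: the abstract does not say which loss is the numerator.
    ratio = loss_cross / (loss_mono + eps)
    return alpha * hm + (1.0 - alpha) * ratio


if __name__ == "__main__":
    # Hypothetical loss values, chosen only to exercise the function.
    print(combined_loss(1.0, 3.0, alpha=0.5))
```

With `alpha = 1` only the harmonic mean is minimized, and with `alpha = 0` only the ratio is. In an actual training loop the same expression would be applied to differentiable loss tensors rather than Python floats.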
Pages: 3156-3174
Page count: 19