A Learning to rank framework based on cross-lingual loss function for cross-lingual information retrieval

Cited: 4
|
Authors
Ghanbari, Elham [1 ,2 ]
Shakery, Azadeh [1 ,3 ]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Elect & Comp Engn, Tehran, Iran
[2] Islamic Azad Univ, Dept Comp Engn, Yadegar E Imam Khomeini RAH Shahre Rey Branch, Tehran, Iran
[3] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran, Iran
Keywords
Learning to rank (LTR); Cross-lingual information retrieval (CLIR); Cross-lingual features
DOI
10.1007/s10489-021-02592-z
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification
081104; 0812; 0835; 1405
Abstract
Learning to Rank (LTR) techniques use machine learning to rank documents. In this paper, we propose a new LTR-based framework for cross-lingual information retrieval (CLIR). The core idea is to exploit training queries in the target language in addition to training queries in the source language, both for extracting features and for constructing the ranking model, rather than relying on source-language training queries alone. The framework consists of two main components. The first component extracts monolingual and cross-lingual features from the queries and the documents. To extract the cross-lingual features, we introduce a general approach based on translation probabilities: translation knowledge, built by combining a probabilistic dictionary extracted from translation resources with the translation knowledge available in the target-language queries, is used to bridge the gap between the documents and the queries. The second component trains a ranking model on these features by optimizing the proposed loss function for a given LTR algorithm. The new loss function can be applied to any listwise LTR algorithm to construct a ranking model for CLIR. To this end, the loss of the LTR algorithm is computed on both the target-language and the source-language training data, and the new loss is defined as a linear interpolation of the harmonic mean of the two losses (monolingual and cross-lingual) and the ratio of these two losses. The output of the framework is a cross-lingual ranking model trained to minimize this loss. Experimental results show that the proposed framework outperforms baseline information retrieval methods and other LTR ranking models in terms of Mean Average Precision (MAP). The findings also indicate that the cross-lingual features considerably improve the effectiveness of the framework in terms of MAP and Normalized Discounted Cumulative Gain (NDCG).
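As a rough illustration of the two ideas summarized in the abstract, the following Python sketch shows (i) projecting source-language query term weights into the document language with a probabilistic dictionary and (ii) combining the monolingual and cross-lingual listwise losses by interpolating their harmonic mean with their ratio. All names (translate_query_weights, combined_loss, alpha) and the direction of the ratio are assumptions made for exposition, not the authors' implementation.

    def translate_query_weights(query_terms, translation_probs):
        """Project source-language query term weights into the target (document)
        language using a probabilistic dictionary p(target_term | source_term).
        query_terms: {source_term: weight}; translation_probs: {source_term: {target_term: prob}}."""
        translated = {}
        for s_term, s_weight in query_terms.items():
            for t_term, prob in translation_probs.get(s_term, {}).items():
                translated[t_term] = translated.get(t_term, 0.0) + s_weight * prob
        return translated

    def combined_loss(loss_mono, loss_cross, alpha=0.5):
        """Linear interpolation of the harmonic mean of the monolingual and
        cross-lingual listwise losses and their ratio, following the abstract.
        alpha and the ratio's orientation are assumed, not taken from the paper."""
        harmonic = 2.0 * loss_mono * loss_cross / (loss_mono + loss_cross)
        ratio = loss_cross / loss_mono
        return alpha * harmonic + (1.0 - alpha) * ratio

    # Example: combine the listwise losses computed on target-language and
    # source-language training queries for one optimization step.
    # loss = combined_loss(loss_mono=0.42, loss_cross=0.58, alpha=0.5)

In such a scheme, the translated query-term weights would feed the cross-lingual feature extractor, while combined_loss would replace the listwise loss of the chosen LTR algorithm during training.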
Pages: 3156-3174 (19 pages)