Cross-Lingual Training of Neural Models for Document Ranking

Authors
Shi, Peng [1]
Bai, He [1]
Lin, Jimmy [1]
Affiliations
[1] Univ Waterloo, David R Cheriton Sch Comp Sci, Waterloo, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We tackle the challenge of cross-lingual training of neural document ranking models for mono-lingual retrieval, specifically leveraging relevance judgments in English to improve search in non-English languages. Our work successfully applies multi-lingual BERT (mBERT) to document ranking and additionally compares against a number of alternatives: translating the training data, translating documents, multi-stage hybrids, and ensembles. Experiments on test collections in six different languages from diverse language families reveal many interesting findings: model-based relevance transfer using mBERT can significantly improve search quality in (non-English) mono-lingual retrieval, but other "low resource" approaches are competitive as well.
Pages: 2768-2773
Page count: 6