Cross-Lingual Training of Neural Models for Document Ranking

Authors
Shi, Peng [1]
Bai, He [1]
Lin, Jimmy [1]
Affiliations
[1] Univ Waterloo, David R Cheriton Sch Comp Sci, Waterloo, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We tackle the challenge of cross-lingual training of neural document ranking models for mono-lingual retrieval, specifically leveraging relevance judgments in English to improve search in non-English languages. Our work successfully applies multi-lingual BERT (mBERT) to document ranking and additionally compares against a number of alternatives: translating the training data, translating documents, multi-stage hybrids, and ensembles. Experiments on test collections in six different languages from diverse language families reveal many interesting findings: model-based relevance transfer using mBERT can significantly improve search quality in (non-English) mono-lingual retrieval, but other "low resource" approaches are competitive as well.
Pages: 2768-2773
Page count: 6