C2LIR: Continual Cross-Lingual Transfer for Low-Resource Information Retrieval

被引:0
|
作者
Lee, Jaeseong [1 ]
Lee, Dohyeon [1 ]
Kim, Jongho [1 ]
Hwang, Seung-Won [1 ]
机构
[1] Seoul Natl Univ, Seoul, South Korea
关键词
Low-resource language; Neural IR; Cross-lingual transfer;
D O I
10.1007/978-3-031-28238-6_37
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
This paper proposes a method to train information retrieval (IR) model for a low-resource language with a small corpus and no parallel sentences. Although neural IR models based on pretrained language models (PLMs) have shown high performance in high-resource languages (HRLs), building PLM for LRLs is challenging. We propose (CLIR)-L-2, a method to build a high-performing neural IR model for LRL, with dictionary-based pretraining objectives for cross-lingual transfer from HRL. Experiments on the monolingual and cross-lingual IR in diverse low-resource scenarios show the effectiveness and data efficiency of (CLIR)-L-2.
引用
收藏
页码:466 / 474
页数:9
相关论文
共 50 条
  • [1] Cross-lingual embedding for cross-lingual question retrieval in low-resource community question answering
    HajiAminShirazi, Shahrzad
    Momtazi, Saeedeh
    [J]. MACHINE TRANSLATION, 2020, 34 (04) : 287 - 303
  • [2] Cross-Lingual Morphological Tagging for Low-Resource Languages
    Buys, Jan
    Botha, Jan A.
    [J]. PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1954 - 1964
  • [3] Adversarial Cross-Lingual Transfer Learning for Slot Tagging of Low-Resource Languages
    He, Keqing
    Yan, Yuanmeng
    Xu, Weiran
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [4] Improving Low-Resource Cross-lingual Document Retrieval by Reranking with Deep Bilingual Representations
    Zhang, Rui
    Westerfield, Caitlin
    Shim, Sungrok
    Bingham, Garrett
    Fabbri, Alexander
    Hu, William
    Verma, Neha
    Radev, Dragomir
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3173 - 3179
  • [5] Cross-lingual offensive speech identification with transfer learning for low-resource languages
    Shi, Xiayang
    Liu, Xinyi
    Xu, Chun
    Huang, Yuanyuan
    Chen, Fang
    Zhu, Shaolin
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2022, 101
  • [6] Cross-Lingual Language Modeling for Low-Resource Speech Recognition
    Xu, Ping
    Fung, Pascale
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (06): : 1134 - 1144
  • [7] Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition
    Hou, Wenxin
    Zhu, Han
    Wang, Yidong
    Wang, Jindong
    Qin, Tao
    Xu, Renju
    Shinozaki, Takahiro
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 317 - 329
  • [8] LEARNING CROSS-LINGUAL INFORMATION WITH MULTILINGUAL BLSTM FOR SPEECH SYNTHESIS OF LOW-RESOURCE LANGUAGES
    Yu, Quanjie
    Liu, Peng
    Wu, Zhiyong
    Kang, Shiyin
    Meng, Helen
    Cai, Lianhong
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5545 - 5549
  • [9] Design Challenges in Low-resource Cross-lingual Entity Linking
    Fu, Xingyu
    Shi, Weijia
    Yu, Xiaodong
    Zhao, Zian
    Roth, Dan
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 6418 - 6432
  • [10] SARAL: A Low-Resource Cross-Lingual Domain-Focused Information Retrieval System for Effective Rapid Document Triage
    Boschee, Elizabeth
    Barry, Joel
    Billa, Jayadev
    Freedman, Marjorie
    Gowda, Thamme
    Lignos, Constantine
    Palen-Michel, Chester
    Pust, Michael
    Khonglah, Banriskhem K.
    Madikeri, Srikanth
    May, Jonathan
    Miller, Scott
    [J]. PROCEEDINGS OF THE 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: SYSTEM DEMONSTRATIONS, (ACL 2019), 2019, : 19 - 24