mCLIP: Multilingual CLIP via Cross-lingual Transfer

Cited by: 0
Authors
Chen, Guanhua [1]
Hou, Lu [2]
Chen, Yun [3]
Dai, Wenliang [5]
Shang, Lifeng [2]
Jiang, Xin [2]
Liu, Qun [2]
Pan, Jia [4]
Wang, Wenping [6]
Affiliations
[1] Southern Univ Sci & Technol, Shenzhen, Peoples R China
[2] Huawei Noahs Ark Lab, Montreal, PQ, Canada
[3] Shanghai Univ Finance & Econ, Shanghai, Peoples R China
[4] Univ Hong Kong, Hong Kong, Peoples R China
[5] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[6] Texas A&M Univ, College Stn, TX USA
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Large-scale vision-language pretrained (VLP) models like CLIP have shown remarkable performance on various downstream cross-modal tasks. However, they are usually biased towards English due to the lack of sufficient non-English image-text pairs. Existing multilingual VLP methods often learn retrieval-inefficient single-stream models from translation-augmented non-English image-text pairs. In this paper, we introduce mCLIP, a retrieval-efficient dual-stream multilingual VLP model, trained by aligning the CLIP model and a Multilingual Text Encoder (MTE) through a novel Triangle Cross-modal Knowledge Distillation (TriKD) method. It is parameter-efficient, as only two light projectors on top of them are updated during distillation. Furthermore, to enhance the token- and sentence-level multilingual representation of the MTE, we propose to train it jointly with machine translation and contrastive learning before TriKD to provide a better initialization. Empirical results show that mCLIP achieves new state-of-the-art performance on both zero-shot and finetuned multilingual image-text retrieval tasks.
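The abstract does not spell out the TriKD objective, so the following is only a rough sketch of the general idea it describes: frozen encoders, two small trainable projectors, and a contrastive alignment that ties multilingual text embeddings to both CLIP image and CLIP text embeddings. Everything here is an assumption made for illustration, including the function names (symmetric_contrastive_loss, triangle_loss), the placement of the projectors (w_clip on the CLIP text output, w_mte on the MTE output), the choice of a symmetric InfoNCE loss, and the temperature value. It should not be read as the authors' actual loss.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def log_softmax(logits, axis=-1):
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(a, b, temperature=0.07):
    # Symmetric InfoNCE: row i of `a` should match row i of `b`
    # and no other row in the batch. The temperature is an assumed value.
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature
    idx = np.arange(len(a))
    loss_ab = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_ba = -log_softmax(logits.T, axis=1)[idx, idx].mean()
    return 0.5 * (loss_ab + loss_ba)

def triangle_loss(clip_image, clip_text, mte_text, w_clip, w_mte):
    # Hypothetical "triangle": the projected multilingual text embedding is
    # aligned with the CLIP image embedding and with the projected CLIP
    # text embedding. Encoder outputs are treated as fixed inputs; only the
    # two linear projectors (w_clip, w_mte) would be trained.
    student = mte_text @ w_mte        # MTE output -> shared space
    teacher_text = clip_text @ w_clip # CLIP text output -> shared space
    return (symmetric_contrastive_loss(student, clip_image)
            + symmetric_contrastive_loss(student, teacher_text))
```

In a real training loop the encoder outputs would come from the frozen CLIP and MTE models, and only w_clip and w_mte would receive gradient updates, which is what makes the approach parameter-efficient in the sense the abstract describes.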
Pages: 13028-13043
Page count: 16