LAMM: Language Aware Active Learning for Multilingual Models

Authors
Ye, Ze [1]
Liu, Dantong [2]
Pavani, Kaushik [1]
Dasgupta, Sunny [1]
Affiliations
[1] Amazon Com Inc, Seattle, WA 98109 USA
[2] Amazon Com Inc, Sunnyvale, CA USA
DOI
10.1145/3583780.3615507
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In industrial settings, it is often necessary to meet accuracy targets at the level of individual languages. For example, Amazon business teams need to build multilingual product classifiers that operate accurately in all European languages. It is unacceptable for product classification accuracy to meet the target in one language (e.g., English) while falling below the target in other languages (e.g., Portuguese). To address this issue, we propose Language Aware Active Learning for Multilingual Models (LAMM), an active learning strategy that enables a classifier to learn from a small amount of labeled data in a targeted manner, improving accuracy on low-resource languages (LRLs), for which only limited training data is available. Our empirical results on two open-source datasets and two proprietary product classification datasets demonstrate that LAMM improves LRL performance by 4%-11% compared to strong baselines.
Pages: 5255-5256
Page count: 2