Multi-lingual and Multi-cultural Figurative Language Understanding

Authors
Kabra, Anubha [1 ]
Liu, Emmy [1 ]
Khanuja, Simran [1 ]
Aji, Alham Fikri [2 ]
Winata, Genta Indra [3 ]
Cahyawijaya, Samuel [4 ]
Aremu, Anuoluwapo [5 ]
Ogayo, Perez [1 ]
Neubig, Graham [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] MBZUAI, Abu Dhabi, United Arab Emirates
[3] Bloomberg, New York, NY USA
[4] HKUST, Hong Kong, People's Republic of China
[5] Masakhane, Pretoria, South Africa
Abstract
Figurative language permeates human communication, but at the same time is relatively understudied in NLP. Datasets have been created in English to accelerate progress towards measuring and improving figurative language processing in language models (LMs). However, the use of figurative language is an expression of our cultural and societal experiences, making it difficult for these phrases to be universally applicable. In this work, we create a figurative language inference dataset, MABL, for seven diverse languages associated with a variety of cultures: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili and Yoruba. Our dataset reveals that each language relies on cultural and regional concepts for figurative expressions, with the highest overlap between languages originating from the same region. We assess multilingual LMs' abilities to interpret figurative language in zero-shot and few-shot settings. All languages exhibit a significant deficiency compared to English, with variations in performance reflecting the availability of pre-training and fine-tuning data, emphasizing the need for LMs to be exposed to a broader range of linguistic and cultural variation during training.
Pages: 8269-8284
Page count: 16