Multi-lingual and Multi-cultural Figurative Language Understanding

Authors
Kabra, Anubha [1 ]
Liu, Emmy [1 ]
Khanuja, Simran [1 ]
Aji, Alham Fikri [2 ]
Winata, Genta Indra [3 ]
Cahyawijaya, Samuel [4 ]
Aremu, Anuoluwapo [5 ]
Ogayo, Perez [1 ]
Neubig, Graham [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] MBZUAI, Abu Dhabi, U Arab Emirates
[3] Bloomberg, New York, NY USA
[4] HKUST, Hong Kong, Peoples R China
[5] Masakhane, Pretoria, South Africa
Abstract
Figurative language permeates human communication, but at the same time is relatively understudied in NLP. Datasets have been created in English to accelerate progress towards measuring and improving figurative language processing in language models (LMs). However, the use of figurative language is an expression of our cultural and societal experiences, making it difficult for these phrases to be universally applicable. In this work, we create a figurative language inference dataset, MABL, for seven diverse languages associated with a variety of cultures: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili and Yoruba. Our dataset reveals that each language relies on cultural and regional concepts for figurative expressions, with the highest overlap between languages originating from the same region. We assess multilingual LMs' abilities to interpret figurative language in zero-shot and few-shot settings. All languages exhibit a significant deficiency compared to English, with variations in performance reflecting the availability of pre-training and fine-tuning data, emphasizing the need for LMs to be exposed to a broader range of linguistic and cultural variation during training.
Pages: 8269-8284
Page count: 16