Hybrid Emoji-Based Masked Language Models for Zero-Shot Abusive Language Detection

Cited by: 0
Authors
Corazza, Michele [1]
Menini, Stefano [2]
Cabrio, Elena [3]
Tonelli, Sara [2]
Villata, Serena [3]
Affiliations
[1] Univ Bologna, Bologna, Italy
[2] Fdn Bruno Kessler, Trento, Italy
[3] Univ Cote Azur, CNRS, INRIA, I3S, Nice, France
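Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract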
Recent studies have demonstrated the effectiveness of cross-lingual language model pretraining on different NLP tasks, such as natural language inference and machine translation. In our work, we test this approach on social media data, which are particularly challenging to process within this framework because the limited length of the messages and the irregularity of the language make it harder to learn meaningful encodings. More specifically, we propose a hybrid emoji-based Masked Language Model (MLM) that leverages the common information conveyed by emojis across different languages to improve the learned cross-lingual representation of short text messages, with the goal of performing zero-shot abusive language detection. We compare the results obtained with the original MLM to those obtained with our method, showing improved performance on German, Italian and Spanish.
Pages: 943-949
Page count: 7