A joint learning approach with knowledge injection for zero-shot cross-lingual hate speech detection

Cited by: 43

Authors:

Pamungkas, Endang Wahyu [1]

Basile, Valerio [1]

Patti, Viviana [1]

Affiliations:

[1] Univ Turin, Dept Comp Sci, Turin, Italy

Source:

INFORMATION PROCESSING & MANAGEMENT | 2021 / Vol. 58 / Issue 04

Keywords:

Hate speech detection; Cross-lingual classification; Social media; Transfer learning; Zero-shot learning; Ethnophaulisms

DOI:

10.1016/j.ipm.2021.102544

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Hate speech is an increasingly important societal issue in the era of digital communication. Hateful expressions often make use of figurative language and, although they represent, in some sense, the dark side of language, they are also often prime examples of creative use of language. While hate speech is a global phenomenon, current studies on automatic hate speech detection are typically framed in a monolingual setting. In this work, we explore hate speech detection in low-resource languages by transferring knowledge from a resource-rich language, English, in a zero-shot learning fashion. We experiment with traditional and recent neural architectures, and propose two joint-learning models, using different multilingual language representations to transfer knowledge between pairs of languages. We also evaluate the impact of additional knowledge in our experiment, by incorporating information from a multilingual lexicon of abusive words. The results show that our joint-learning models achieve the best performance on most languages. However, a simple approach that uses machine translation and a pre-trained English language model achieves a robust performance. In contrast, Multilingual BERT fails to obtain a good performance in cross-lingual hate speech detection. We also experimentally found that the external knowledge from a multilingual abusive lexicon is able to improve the models' performance, specifically in detecting the positive class. The results of our experimental evaluation highlight a number of challenges and issues in this particular task. One of the main challenges is related to the issue of current benchmarks for hate speech detection, in particular how bias related to the topical focus in the datasets influences the classification performance. The insufficient ability of current multilingual language models to transfer knowledge between languages in the specific hate speech detection task also remains an open problem. However, our experimental evaluation and our qualitative analysis show how the explicit integration of linguistic knowledge from a structured abusive language lexicon helps to alleviate this issue.


Pages: 19

