Generalizing Hate Speech Detection Using Multi-Task Learning: A Case Study of Political Public Figures

Cited by: 2
Authors
Yuan, Lanqin [1]
Rizoiu, Marian-Andrei [1]
Affiliations
[1] Univ Technol Sydney, 15 Broadway, Sydney, NSW 2007, Australia
Source
Keywords
Hate speech; Abusive speech; Multi-task learning; Public political figures; Transfer learning;
DOI
10.1016/j.csl.2024.101690
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automatic identification of hateful and abusive content is vital in combating the spread of harmful online content and its damaging effects. Most existing works evaluate models by examining the generalization error on train-test splits of hate speech datasets. These datasets often differ in their definitions and labeling criteria, leading to poor generalization performance when predicting across new domains and datasets. This work proposes a new Multi-task Learning (MTL) pipeline that trains simultaneously across multiple hate speech datasets to construct a more encompassing classification model. Using a dataset-level leave-one-out evaluation (designating one dataset for testing and jointly training on all others), we trial the MTL detection on new, previously unseen datasets. Our results consistently outperform a large sample of existing work. We show strong results when examining the generalization error in train-test splits and substantial improvements when predicting on previously unseen datasets. Furthermore, we assemble a novel dataset, dubbed PUBFIGS, focusing on the problematic speech of American Public Political Figures. We crowdsource labels for more than 20,000 tweets using Amazon MTurk and machine-label problematic speech in all 305,235 tweets in PUBFIGS. We find that abusive and hateful tweeting mainly originates from right-leaning figures and relates to six topics, including Islam, women, ethnicity, and immigrants. We show that MTL builds embeddings that can simultaneously separate abusive from hate speech and identify its topics.
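The dataset-level leave-one-out evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `leave_one_dataset_out`, the `Example` type, and the toy dataset names and texts are all hypothetical, and the model training step is omitted. The sketch only shows how each dataset is held out in turn while the remaining datasets are pooled for joint training, with the source dataset kept as a task identifier.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical record type: (text, binary label). Not taken from the paper.
Example = Tuple[str, int]


def leave_one_dataset_out(
    datasets: Dict[str, List[Example]],
) -> Iterator[Tuple[str, List[Tuple[str, Example]], List[Example]]]:
    """Yield (held_out_name, train, test) once per dataset.

    `train` pools every example from all other datasets and keeps the
    source dataset name alongside each example, so a multi-task model
    could route it to a dataset-specific head. `test` is the held-out
    dataset, which is never seen during training in that fold.
    """
    for held_out in datasets:
        train = [
            (name, example)
            for name, examples in datasets.items()
            if name != held_out
            for example in examples
        ]
        yield held_out, train, datasets[held_out]


# Toy placeholder data (assumed for illustration only).
toy = {
    "dataset_a": [("text a1", 0), ("text a2", 1)],
    "dataset_b": [("text b1", 1), ("text b2", 0)],
    "dataset_c": [("text c1", 0)],
}

folds = list(leave_one_dataset_out(toy))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

Each fold would then train one model on `train` and report its error on `test`, which measures generalization to a dataset whose definitions and labeling criteria were not seen during training.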
Pages: 20