SSL-GAN-RoBERTa: A robust semi-supervised model for detecting Anti-Asian COVID-19 hate speech on social media

Cited by: 4
Authors
Su, Xuanyu [1]
Li, Yansong [1]
Branco, Paula [1]
Inkpen, Diana [1]
Affiliations
[1] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON, Canada
Keywords
Hate speech detection; Deep learning; Semi-supervised learning
DOI
10.1017/S1351324923000396
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Anti-Asian speech during the COVID-19 pandemic has been a serious problem with severe consequences. A hate speech wave swept social media platforms. The timely detection of Anti-Asian COVID-19-related hate speech is of utmost importance, not only to allow the application of preventive mechanisms but also to anticipate and possibly prevent other similar discriminatory situations. In this paper, we address the problem of detecting Anti-Asian COVID-19-related hate speech from social media data. Previous approaches that tackled this problem used a transformer-based model, BERT/RoBERTa, trained on the homologous annotated dataset and achieved good performance on this task. However, this requires extensive and annotated datasets with a strong connection to the topic. Both goals are difficult to meet without employing reliable, vast, and costly resources. In this paper, we propose a robust semi-supervised model, SSL-GAN-RoBERTa, that learns from a limited heterogeneous dataset and whose performance is further enhanced by using vast amounts of unlabeled data from another related domain. Compared with the RoBERTa baseline model, the experimental results show that the model has substantial performance gains in terms of Accuracy and Macro-F1 score in different scenarios that use data from different domains. Our proposed model achieves state-of-the-art performance results while efficiently using unlabeled data, showing promising applicability to other complex classification tasks where large amounts of labeled examples are difficult to obtain.
Pages: 1161 - 1180
Number of pages: 20
Related papers
9 in total
  • [1] Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis
    He, Bing
    Ziems, Caleb
    Soni, Sandeep
    Ramakrishnan, Naren
    Yang, Diyi
    Kumar, Srijan
    PROCEEDINGS OF THE 2021 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING, ASONAM 2021, 2021: 90 - 94
  • [2] Semi-Supervised Machine Learning for Analyzing COVID-19 Related Twitter Data for Asian Hate Speech
    Richardson, Caitlin
    Shah, Sandeep
    Yuan, Xiaohong
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022: 1643 - 1648
  • [3] Anti-Asian Media Labeling in the COVID-19 Pandemic: The Role of Social Identity and Information Accuracy
    Sadri, Sean R.
    Billings, Andrew C.
    Hakim, Samuel D.
    HOWARD JOURNAL OF COMMUNICATIONS, 2024, 35 (02) : 233 - 252
  • [4] Model Minority Mutiny: addressing anti-Asian racism during the COVID-19 pandemic in social work
    Maglalang, Dale Dagar
    Rao, Smitha
    Woo, Bongki
    Wang, Kaipeng
    JOURNAL OF ETHNIC & CULTURAL DIVERSITY IN SOCIAL WORK, 2022, 31 (3-5) : 292 - 301
  • [5] Visible violence, invisible voices: media frameworks of anti-Asian hate in San Francisco and St. Louis during the COVID-19 pandemic
    Ramesh, Nithila
    SOCIOLOGICAL SPECTRUM, 2024, 44 : S27 - S27
  • [6] GIS-based analysis of anti-Asian hate speech and its socioeconomic and ideological drivers in the United States during the early COVID-19 pandemic
    Chia-Yu Wu
    Shao-Yun Chang
    Li-Yin Liu
    Alexander Hohl
    Wu, Chia-Yu (cwu001@udayton.edu), 2025, 90 (01)
  • [7] Exploring Anti-Asian Racism Activism on Twitter during the Early Era of COVID-19 Hate Crimes: Implications for Marketers' Social Purpose Communication Strategy
    Lee, Yoon-Joo
    Haley, Eric
    Shang, Yuanyuan
    JOURNAL OF CURRENT ISSUES AND RESEARCH IN ADVERTISING, 2024, 45 (01) : 88 - 111
  • [8] Progressive domain adaptation for detecting hate speech on social media with small training set and its application to COVID-19 concerned posts
    Md Abul Bashar
    Richi Nayak
    Khanh Luong
    Thirunavukarasu Balasubramaniam
    Social Network Analysis and Mining, 2021, 11
  • [9] Progressive domain adaptation for detecting hate speech on social media with small training set and its application to COVID-19 concerned posts
    Abul Bashar, Md
    Nayak, Richi
    Luong, Khanh
    Balasubramaniam, Thirunavukarasu
    SOCIAL NETWORK ANALYSIS AND MINING, 2021, 11 (01)