Abusive Language Detection in Online User Content

Cited by: 420
Authors
Nobata, Chikashi [1]
Tetreault, Joel [1]
Thomas, Achint [1, 2]
Mehdad, Yashar [1]
Chang, Yi [1]
Affiliations
[1] Yahoo Labs, Sunnyvale, CA USA
[2] Embibe, Bangalore, Karnataka, India
Keywords
NLP; Hate Speech; Abusive Language; Stylistic Classification; Discourse Classification;
DOI
10.1145/2872427.2883062
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Detection of abusive language in user-generated online content has become an issue of increasing importance in recent years. Most current commercial methods make use of blacklists and regular expressions; however, these measures fall short when contending with more subtle, less ham-fisted examples of hate speech. In this work, we develop a machine-learning-based method to detect hate speech in online user comments from two domains that outperforms a state-of-the-art deep learning approach. We also develop a corpus of user comments annotated for abusive language, the first of its kind. Finally, we use our detection tool to analyze abusive language over time and in different settings to further enhance our knowledge of this behavior.
Pages: 145-153
Page count: 9