Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter

Cited by: 0
Authors
Ibrohim, Muhammad Okky [1]
Budi, Indra [1]
Affiliations
[1] Univ Indonesia, Fac Comp Sci, Kampus UI, Depok 16424, Indonesia
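Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract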
Hate speech and abusive language spreading on social media need to be detected automatically to avoid conflicts between citizens. Moreover, hate speech has a target, category, and level that also need to be detected to help the authorities prioritize which hate speech must be addressed immediately. This research discusses multi-label text classification for abusive language and hate speech detection, including detection of the target, category, and level of hate speech, in Indonesian Twitter. The machine learning approaches use Support Vector Machine (SVM), Naive Bayes (NB), and Random Forest Decision Tree (RFDT) classifiers, with Binary Relevance (BR), Label Power-set (LP), and Classifier Chains (CC) as the data transformation methods. We used several types of features: term frequency, orthography, and lexicon features. Our experimental results show that, in general, the RFDT classifier using LP as the transformation method gives the best accuracy with fast computational time.
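The three data transformation methods named in the abstract can be illustrated on a toy label matrix. The following is a minimal sketch in plain Python, not the authors' implementation: the label names, the matrix `Y`, the feature rows `X`, and all function names are hypothetical and chosen only for illustration. It shows how each method reshapes the multi-label targets before an ordinary classifier (such as SVM, NB, or a random forest) is trained on them.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical label names and toy data (not taken from the paper's dataset).
LABELS = ["hate_speech", "abusive", "target_individual"]
X = [[0.5], [0.1], [0.9], [0.0]]          # one toy feature per tweet
Y = [(1, 1, 1), (0, 1, 0), (1, 1, 1), (0, 0, 0)]


def binary_relevance(y: Sequence[Tuple[int, ...]]) -> List[List[int]]:
    """Binary Relevance: one independent binary target vector per label."""
    n_labels = len(y[0])
    return [[row[j] for row in y] for j in range(n_labels)]


def label_powerset(y: Sequence[Tuple[int, ...]]) -> Tuple[List[int], Dict[Tuple[int, ...], int]]:
    """Label Power-set: each distinct label combination becomes one class id."""
    class_of: Dict[Tuple[int, ...], int] = {}
    targets: List[int] = []
    for row in y:
        key = tuple(row)
        if key not in class_of:
            class_of[key] = len(class_of)  # ids assigned in order of first appearance
        targets.append(class_of[key])
    return targets, class_of


def classifier_chain_inputs(
    x: Sequence[Sequence[float]], y: Sequence[Tuple[int, ...]]
) -> List[List[List[float]]]:
    """Classifier Chains (training time): the model for label j sees the
    original features plus the true values of labels 0..j-1."""
    n_labels = len(y[0])
    return [
        [list(xi) + list(yi[:j]) for xi, yi in zip(x, y)]
        for j in range(n_labels)
    ]


br_targets = binary_relevance(Y)
lp_targets, lp_classes = label_powerset(Y)
cc_inputs = classifier_chain_inputs(X, Y)
```

In this sketch, Binary Relevance yields three separate binary problems, Label Power-set yields a single multi-class problem with one class per observed label combination, and Classifier Chains yields three binary problems whose inputs grow by one column per earlier label in the chain.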
Pages: 46-57
Page count: 12
Related papers
50 in total
  • [41] Asian hate speech detection on Twitter during COVID-19
    Toliyat, Amir
    Levitan, Sarah Ita
    Peng, Zheng
    Etemadpour, Ronak
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2022, 5
  • [42] Hate Speech Detection in Indonesian Language on Instagram Comment Section Using Deep Neural Network Classification Method
    Perdana, Sakti Putra B. B.
    Irawan, Budhi
    Setianingsih, Casi
    2019 IEEE ASIA PACIFIC CONFERENCE ON WIRELESS AND MOBILE (APWIMOB), 2019, : 143 - 149
  • [43] Hate Speech Detection on Indonesian Instagram Comments using FastText Approach
    Pratiwi, Nur Indah
    Budi, Indra
    Alfina, Ika
    2018 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS), 2018, : 447 - 450
  • [44] A comparison of text preprocessing techniques for hate and offensive speech detection in Twitter
    Glazkova, Anna
    Social Network Analysis and Mining, 13
  • [45] Corpus Building for Hate Speech Detection of Gujarati Language
    Vadesara, Abhilasha
    Tanna, Purna
    SOFT COMPUTING AND ITS ENGINEERING APPLICATIONS, ICSOFTCOMP 2022, 2023, 1788 : 382 - 395
  • [46] Analyzing Travel Behavior Using Multi-label Classification From Twitter
    Takahashi, Kazuki
    Kato, Daiju
    Endo, Masaki
    Araki, Tetsuya
    Hirota, Masaharu
    Ishikawa, Hiroshi
    9TH INTERNATIONAL CONFERENCE ON MANAGEMENT OF EMERGENT DIGITAL ECOSYSTEMS (MEDES 2017), 2017, : 50 - 56
  • [47] Hate Speech Detection in Social Media for the Kurdish Language
    Saeed, Ari M.
    Ismael, Aso N.
    Rasul, Danya L.
    Majeed, Rayan S.
    Rashid, Tarik A.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INNOVATIONS IN COMPUTING RESEARCH (ICR'22), 2022, 1431 : 253 - 260
  • [48] Hate speech detection in the Bengali language: a comprehensive survey
    Al Maruf, Abdullah
    Abidin, Ahmad Jainul
    Haque, Md. Mahmudul
    Jiyad, Zakaria Masud
    Golder, Aditi
    Alubady, Raaid
    Aung, Zeyar
    JOURNAL OF BIG DATA, 2024, 11 (01)
  • [49] Exploring Multi-label Stacking in Natural Language Processing
    Nunes, Rodrigo Mansueli
    Domingues, Marcos Aurelio
    Feltrim, Valeria Delisandra
    PROGRESS IN ARTIFICIAL INTELLIGENCE, PT II, 2019, 11805 : 708 - 718
  • [50] Multi-modal Multi-label Emotion Detection with Modality and Label Dependence
    Zhang, Dong
    Ju, Xincheng
    Li, Junhui
    Li, Shoushan
    Zhu, Qiaoming
    Zhou, Guodong
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 3584 - 3593