Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter

Authors
Ibrohim, Muhammad Okky [1]
Budi, Indra [1]
Affiliations
[1] Univ Indonesia, Fac Comp Sci, Kampus UI, Depok 16424, Indonesia
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hate speech and abusive language spread on social media must be detected automatically to prevent conflict between citizens. Hate speech also has a target, category, and level, which need to be detected to help the authorities prioritize which cases must be addressed immediately. This research discusses multi-label text classification for abusive language and hate speech detection, including detection of the target, category, and level of hate speech, in Indonesian Twitter. We use machine learning approaches with Support Vector Machine (SVM), Naive Bayes (NB), and Random Forest Decision Tree (RFDT) classifiers, and Binary Relevance (BR), Label Power-set (LP), and Classifier Chains (CC) as the data transformation methods. We used several kinds of features, namely term frequency, orthography, and lexicon features. Our experimental results show that, in general, the RFDT classifier with LP as the transformation method gives the best accuracy with fast computational time.
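Note: the data transformation methods named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation. The label names, the toy label matrix, and all function names are hypothetical and chosen only for illustration. It shows how Label Power-set turns each distinct label combination into a single class, and how Binary Relevance instead splits the label matrix into one binary target per label.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical label names, loosely inspired by the abstract
# (not the label set actually used in the paper).
LABELS = ["hate_speech", "abusive", "target_individual", "target_group"]


def label_powerset_fit(Y: Sequence[Sequence[int]]) -> Dict[Tuple[int, ...], int]:
    """Assign one class id to each distinct label combination (Label Power-set)."""
    combo_to_class: Dict[Tuple[int, ...], int] = {}
    for row in Y:
        # Ids are handed out in order of first appearance.
        combo_to_class.setdefault(tuple(row), len(combo_to_class))
    return combo_to_class


def label_powerset_transform(
    Y: Sequence[Sequence[int]], combo_to_class: Dict[Tuple[int, ...], int]
) -> List[int]:
    """Turn a multi-label matrix into a single multiclass target vector."""
    return [combo_to_class[tuple(row)] for row in Y]


def label_powerset_inverse(
    classes: Sequence[int], combo_to_class: Dict[Tuple[int, ...], int]
) -> List[List[int]]:
    """Map predicted class ids back to label vectors."""
    class_to_combo = {c: combo for combo, c in combo_to_class.items()}
    return [list(class_to_combo[c]) for c in classes]


def binary_relevance_targets(Y: Sequence[Sequence[int]]) -> List[List[int]]:
    """Split the label matrix into one binary target vector per label."""
    return [list(column) for column in zip(*Y)]


# Toy label matrix: one row per tweet, one column per entry in LABELS.
Y = [
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]

mapping = label_powerset_fit(Y)
lp_classes = label_powerset_transform(Y, mapping)  # one class id per tweet
br_targets = binary_relevance_targets(Y)           # one target vector per label
```

Under this sketch, any single-label multiclass classifier could be trained on `lp_classes`, while Binary Relevance would train one binary classifier per entry of `br_targets`. Classifier Chains, not shown here, would also train one classifier per label but feed earlier label predictions to later classifiers as extra features.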
Pages: 46-57
Number of pages: 12