Detecting Arabic Offensive Language in Microblogs Using Domain-Specific Word Embeddings and Deep Learning

Cited by: 3
Authors
Aljuhani, Khulood O. [1]
Alyoubi, Khaled H. [1]
Alotaibi, Fahd S. [1]
Affiliation
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Informat Syst Dept, Jeddah, Saudi Arabia
Source
TEHNICKI GLASNIK-TECHNICAL JOURNAL | 2022 / Vol. 16 / Issue 03
Keywords
Arabic Natural Language Processing; Arabic Tweets; Offensive Language Detection; Offensive Language; Word Embeddings;
DOI
10.31803/tg-20220305120018
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
In recent years, social media networks have emerged as key platforms for expressing opinions, communicating, and distributing content. However, users often take advantage of perceived anonymity on these platforms to share offensive or hateful content. Offensive language has therefore become a significant problem as online communication and social media use have grown, and considerable attention has gone into devising methods that detect offensive content and prevent its spread on online social networks. This paper aims to develop an effective Arabic offensive language detection model by employing deep learning together with semantic and contextual features. It proposes a deep learning approach that uses a bidirectional long short-term memory (BiLSTM) model and domain-specific word embeddings extracted from an Arabic offensive dataset. The detection approach was evaluated on an Arabic dataset collected from Twitter. The highest accuracy, 0.93, was obtained with the BiLSTM model trained using a combination of domain-specific and domain-agnostic word embeddings.
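The abstract does not say how the domain-specific and domain-agnostic embeddings are combined. One common option, assumed here purely for illustration, is to concatenate the two vectors for each token and use zeros when a token is missing from one vocabulary. The function name, the toy vocabularies, and the dimensions below are hypothetical and are not taken from the paper; the resulting per-token vectors are the kind of input a BiLSTM classifier could consume.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def combine_embeddings(
    tokens: Sequence[str],
    domain_vecs: Dict[str, Vector],
    general_vecs: Dict[str, Vector],
    domain_dim: int,
    general_dim: int,
) -> List[Vector]:
    """Concatenate domain-specific and domain-agnostic vectors per token.

    Assumption: combination by concatenation, with a zero vector standing in
    for any token that is out of vocabulary in one of the two embedding sets.
    """
    combined: List[Vector] = []
    for tok in tokens:
        d = domain_vecs.get(tok, [0.0] * domain_dim)
        g = general_vecs.get(tok, [0.0] * general_dim)
        combined.append(list(d) + list(g))
    return combined


# Toy vocabularies (hypothetical values, not real trained embeddings).
domain_vecs = {"word_a": [0.1, 0.2], "word_b": [0.3, 0.4]}
general_vecs = {"word_a": [1.0, 0.0, 0.5], "word_c": [0.2, 0.2, 0.2]}

features = combine_embeddings(
    ["word_a", "word_b", "word_c"], domain_vecs, general_vecs,
    domain_dim=2, general_dim=3,
)
# Each token now maps to a 5-dimensional vector (2 domain + 3 general).
```

Concatenation keeps both sources of information intact and leaves it to the downstream model to weight them; averaging or learned gating would be alternative choices under the same assumption.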
Pages: 394-400
Page count: 7