Spam detection on social networks using deep contextualized word representation

Cited by: 8
Authors
Ghanem, Razan [1 ]
Erbay, Hasan [2 ]
Affiliations
[1] Kirikkale Univ, Dept Comp Engn, Kirikkale, Turkey
[2] Univ Turkish Aeronaut Assoc, Dept Comp Engn, Ankara, Turkey
Keywords
Spam detection; Deep learning; Word embedding; Recurrent neural network; Embedding from language model; ACCOUNTS;
DOI
10.1007/s11042-022-13397-8
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Spam detection on social networks, treated as a short-text classification problem, is a challenging task in natural language processing because the text is sparse and ambiguous. A powerful text representation is key to addressing this problem. Traditional word embedding models solve the data sparsity problem by representing words with dense vectors, but they have limitations. The most common is the "out of vocabulary" problem, in which a model fails to provide any vector representation for words that are not in its dictionary. Another is context independence: the model outputs a single vector for each word regardless of the word's position in the sentence. To overcome these problems, we propose a model based on deep contextualized word representation. In this study, we develop CBLSTM (Contextualized Bi-directional Long Short Term Memory neural network), a novel deep learning architecture that combines a bidirectional long short-term memory network with embeddings from language models (ELMo), to address spam text on social networks. Experimental results on three benchmark datasets show that the proposed method achieves high accuracy and outperforms existing state-of-the-art methods for detecting spam on social networks.
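The two limitations of static word embeddings named in the abstract can be illustrated with a minimal sketch. This is not the authors' CBLSTM or an ELMo implementation; the toy vector table, the function names, and the hashed character n-gram fallback are all assumptions made for illustration only.

```python
import hashlib

# Hypothetical static embedding table (values invented for illustration).
STATIC_VECTORS = {
    "free": [0.9, 0.1],
    "prize": [0.8, 0.3],
    "meeting": [0.1, 0.9],
}


def static_lookup(word):
    """Return the one fixed vector for a word, or None if it is out of vocabulary."""
    return STATIC_VECTORS.get(word.lower())


def embed_static(tokens):
    """Embed each token independently; surrounding words have no effect."""
    return [static_lookup(t) for t in tokens]


def char_ngram_vector(word, dim=8, n=3):
    """Count hashed character n-grams so that any string gets some vector.

    A simple stand-in for character-level input, assumed here only to show
    how the out-of-vocabulary problem can be avoided in principle.
    """
    padded = f"<{word.lower()}>"
    grams = [padded[i:i + n] for i in range(max(1, len(padded) - n + 1))]
    vec = [0.0] * dim
    for gram in grams:
        bucket = int(hashlib.md5(gram.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


# Out of vocabulary: the obfuscated spelling has no entry in the table.
oov = static_lookup("fr33")

# Context independence: "free" gets the same vector in both sentences.
same = embed_static(["free", "prize"])[0] == embed_static(["free", "meeting"])[0]

# The character-level fallback still produces a vector for the unseen word.
fallback = char_ngram_vector("fr33")
```

In this sketch `oov` is `None` and `same` is `True`, which is the behavior a contextualized, character-aware representation is meant to avoid.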
Pages: 3697 - 3712
Page count: 16
Related papers
50 in total
  • [1] Spam detection on social networks using deep contextualized word representation
    Razan Ghanem
    Hasan Erbay
    [J]. Multimedia Tools and Applications, 2023, 82 : 3697 - 3712
  • [2] Spam detection in online social networks by deep learning
    Ameen, Aso Khaleel
    Kaya, Buket
    [J]. 2018 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND DATA PROCESSING (IDAP), 2018
  • [3] Beyond Word-Based Model Embeddings: Contextualized Representations for Enhanced Social Media Spam Detection
    Alshattnawi, Sawsan
    Shatnawi, Amani
    AlSobeh, Anas M. R.
    Magableh, Aws A.
    [J]. APPLIED SCIENCES-BASEL, 2024, 14 (06)
  • [4] Using Word Embeddings and Deep Learning for Supervised Topic Detection in Social Networks
    Gutierrez-Batista, Karel
    Campana, Jesus R.
    Vila, Maria-Amparo
    Martin-Bautista, Maria J.
    [J]. FLEXIBLE QUERY ANSWERING SYSTEMS, 2019, 11529 : 155 - 165
  • [5] Investigating Maps of Science Using Contextual Proximity of Citations Based on Deep Contextualized Word Representation
    Roman, Muhammad
    Shahid, Abdul
    Khan, Shafiullah
    Yu, Lisu
    Asif, Muhammad
    Ghadi, Yazeed Yasin
    [J]. IEEE ACCESS, 2022, 10 : 31397 - 31419
  • [6] Spam Detection In Social Networks: A Review
    Eshraqi, Nasim
    Jalali, Mehrdad
    Moattar, Mohammad Hossein
    [J]. SECOND INTERNATIONAL CONGRESS ON TECHNOLOGY, COMMUNICATION AND KNOWLEDGE (ICTCK 2015), 2015 : 148 - 152
  • [7] Semantic Representation Based on Deep Learning for Spam Detection
    Saidani, Nadjate
    Adi, Kamel
    Allili, Mohand Said
    [J]. FOUNDATIONS AND PRACTICE OF SECURITY, FPS 2019, 2020, 12056 : 72 - 81
  • [8] Deep Learning Empowered Cybersecurity Spam Bot Detection for Online Social Networks
    Al Duhayyim, Mesfer
    Alshahrani, Haya Mesfer
    Al-Wesabi, Fahd N.
    Alamgeer, Mohammed
    Hilal, Anwer Mustafa
    Rizwanullah, Mohammed
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 70 (03) : 6257 - 6270
  • [9] Multistage and Elastic Spam Detection in Mobile Social Networks through Deep Learning
    Feng, Bo
    Fu, Qiang
    Dong, Mianxiong
    Guo, Dong
    Li, Qiang
    [J]. IEEE NETWORK, 2018, 32 (04) : 15 - 21
  • [10] IMPROVING SPOKEN QUESTION ANSWERING USING CONTEXTUALIZED WORD REPRESENTATION
    Su, Dan
    Fung, Pascale
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020 : 8004 - 8008