Gender identification for Egyptian Arabic dialect in twitter using deep learning models

Cited by: 9
Authors
ElSayed, Shereen [1]
Farouk, Mona [1]
Affiliations
[1] Cairo Univ, Fac Engn, Giza, Egypt
Keywords
Gender identification; Egyptian Arabic text classification; Deep learning; Natural language processing; Social Media analysis and mining;
DOI
10.1016/j.eij.2020.04.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although the number of Arabic language writers in social media is increasing, the research work targeting Author Profiling (AP) is at the initial development phase. This paper investigates Gender Identification (GI) (male or female) of authors posting Egyptian dialect tweets using Neural Networks (NN) models. Various architectures of NN are explored with extensive parameters' selection such as simple Artificial Neural Network (ANN), Convolutional Neural Network (CNN), Long-Short Term Memory (LSTM), Convolutional Bidirectional Long-Short Term Memory (C-Bi-LSTM) and Convolutional Bidirectional Gated Recurrent Units (C-Bi-GRU) NN which is tuned for the GI problem at hand. The best acquired GI accuracy using C-Bi-GRU multichannel model is 91.37%. It is worth noting that the presence of the bidirectional layer as well as the convolutional layer in the NN models has significantly enhanced the GI accuracy. © 2020 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Artificial Intelligence, Cairo University.
Pages: 159-167
Page count: 9
Related papers
50 in total
  • [41] Twitter Arabic Sentiment Analysis to Detect Depression Using Machine Learning
    Musleh, Dhiaa A.
    Alkhales, Taef A.
    Almakki, Reem A.
    Alnajim, Shahad E.
    Almarshad, Shaden K.
    Alhasaniah, Rana S.
    Aljameel, Sumayh S.
    Almuqhim, Abdullah A.
    CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 71 (02): 3463-3477
  • [42] Sentiment identification on Twitter using machine learning
    Morales-Castro, Wendy
    Careta, Eduardo Perez
    Rayas, Angelica Hernandez
    Mukhopadhyay, Tirtha Prasad
    Crespo, J. Armando Perez
    Cabrera, Rafael Guzman
    2022 EURO-ASIA CONFERENCE ON FRONTIERS OF COMPUTER SCIENCE AND INFORMATION TECHNOLOGY, FCSIT, 2022: 28-31
  • [43] Deep Learning for Identification of Adverse Effect Mentions in Twitter Data
    Barry, Paul
    Uzuner, Ozlem
    SOCIAL MEDIA MINING FOR HEALTH APPLICATIONS (#SMM4H) WORKSHOP & SHARED TASK, 2019: 99-101
  • [44] Code-mixing unveiled: Enhancing the hate speech detection in Arabic dialect tweets using machine learning models
    Alhazmi, Ali
    Mahmud, Rohana
    Idris, Norisma
    Abo, Mohamed Elhag Mohamed
    Eke, Christopher Ifeanyi
    PLOS ONE, 2024, 19 (07)
  • [45] Emotional Analysis of Arabic Saudi Dialect Tweets Using a Supervised Learning Approach
    AlFutamani, Abeer A.
    Al-Baity, Heyam H.
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2021, 29 (01): 89-109
  • [46] Effective Deep Learning Models for Automatic Diacritization of Arabic Text
    Madhfar, Mokthar Ali Hasan
    Qamar, Ali Mustafa
    IEEE ACCESS, 2021, 9: 273-288
  • [47] Analyzing Arabic Twitter-Based Patient Experience Sentiments Using Multi-Dialect Arabic Bidirectional Encoder Representations from Transformers
    Almuhaideb, Sarab
    Alnegheimish, Yasmeen
    Alomar, Taif
    Alsabti, Reem
    Alkathery, Maha
    Alolyyan, Ghala
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 76 (01): 195-220
  • [48] MORPHEME-BASED FEATURE-RICH LANGUAGE MODELS USING DEEP NEURAL NETWORKS FOR LVCSR OF EGYPTIAN ARABIC
    Mousa, Amr El-Desoky
    Kuo, Hong-Kwang Jeff
    Mangu, Lidia
    Soltau, Hagen
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 8435-8439
  • [49] Using Shallow and Deep Learning to Automatically Detect Hate Motivated by Gender and Sexual Orientation on Twitter in Spanish
    Arcila-Calderon, Carlos
    Amores, Javier J.
    Sanchez-Holgado, Patricia
    Blanco-Herrero, David
    MULTIMODAL TECHNOLOGIES AND INTERACTION, 2021, 5 (10)
  • [50] Detecting racism and xenophobia using deep learning models on Twitter data: CNN, LSTM and BERT
    Benítez-Andrades J.A.
    González-Jiménez Á.
    López-Brea Á.
    Aveleira-Mata J.
    Alija-Pérez J.-M.
    García-Ordás M.T.
    PeerJ Computer Science, 2022, 8