Gender identification for Egyptian Arabic dialect in twitter using deep learning models

Cited by: 9
Authors
ElSayed, Shereen [1]
Farouk, Mona [1]
Affiliations
[1] Cairo Univ, Fac Engn, Giza, Egypt
Keywords
Gender identification; Egyptian Arabic text classification; Deep learning; Natural language processing; Social media analysis and mining
DOI
10.1016/j.eij.2020.04.001
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Although the number of Arabic-language writers on social media is increasing, research targeting Author Profiling (AP) is still in its initial development phase. This paper investigates Gender Identification (GI) (male or female) of authors posting Egyptian dialect tweets using Neural Network (NN) models. Various NN architectures are explored with extensive parameter selection: a simple Artificial Neural Network (ANN), a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Convolutional Bidirectional Long Short-Term Memory (C-Bi-LSTM), and Convolutional Bidirectional Gated Recurrent Units (C-Bi-GRU), the last of which is tuned for the GI problem at hand. The best GI accuracy, obtained with the C-Bi-GRU multichannel model, is 91.37%. Notably, the presence of the bidirectional layer as well as the convolutional layer in the NN models significantly enhanced GI accuracy. © 2020 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Artificial Intelligence, Cairo University.
Pages: 159-167
Number of pages: 9