A Comparative Study of Pre-trained Word Embeddings for Arabic Sentiment Analysis

Cited by: 1
Authors
Zouidine, Mohamed [1]
Khalil, Mohammed [1]
Affiliations
[1] Hassan II Univ Casablanca, LMCSA, FSTM, Casablanca, Morocco
Keywords
NLP; Sentiment analysis; Deep learning; CNN; Word embeddings; Word2Vec; FastText; BERT; AraBERT
DOI
10.1109/COMPSAC54236.2022.00196
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In this paper, we conduct a series of experiments to systematically study both context-independent and context-dependent word embeddings for Arabic sentiment analysis. We use pre-trained word embeddings as fixed feature extractors that provide input features for a CNN model. Experimental results on two Arabic sentiment analysis datasets indicate that the pre-trained contextualized AraBERT model is the most suitable for such tasks. AraBERT reaches an accuracy of 91.4% on the Large Arabic Book Reviews dataset (LABR) and 95.49% on the Hotel Arabic-Reviews Dataset (HARD).
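The "fixed feature extractor" setup mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' model: the abstract gives no architecture details, so the vocabulary, embedding dimension, filter width, number of filters, and all function names below are hypothetical, and random vectors stand in for real pre-trained embeddings. The sketch only shows the general idea of a frozen lookup table feeding a convolution with max-over-time pooling.

```python
import random

random.seed(0)

# Hypothetical toy values; a real setup would load pre-trained vectors
# (e.g. Word2Vec, FastText, or AraBERT outputs) instead of random ones.
EMBED_DIM = 4
VOCAB = ["<unk>", "good", "bad", "hotel", "book"]
FROZEN_EMBEDDINGS = {
    word: [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for word in VOCAB
}


def embed(tokens):
    """Map tokens to fixed vectors; the table is never updated (frozen)."""
    unk = FROZEN_EMBEDDINGS["<unk>"]
    return [FROZEN_EMBEDDINGS.get(tok, unk) for tok in tokens]


def conv1d_relu(seq, kernel, bias=0.0):
    """Slide one filter of width len(kernel) along the token axis, then ReLU."""
    width = len(kernel)
    outputs = []
    for start in range(len(seq) - width + 1):
        total = bias
        for offset in range(width):
            vec, weights = seq[start + offset], kernel[offset]
            total += sum(x * w for x, w in zip(vec, weights))
        outputs.append(max(0.0, total))
    return outputs


def extract_features(tokens, kernels):
    """Max-over-time pooling: one scalar feature per convolution filter.

    Assumes len(tokens) >= filter width; shorter inputs would need padding.
    """
    seq = embed(tokens)
    return [max(conv1d_relu(seq, kernel)) for kernel in kernels]


# Three hypothetical filters, each spanning two tokens. In a trained model
# these weights would be learned while the embeddings stay fixed.
KERNELS = [
    [[random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(2)]
    for _ in range(3)
]

features = extract_features(["good", "hotel", "unseen-token"], KERNELS)
print(features)
```

In such a setup the pooled features would typically be passed to a small classifier layer, and only the convolution and classifier weights would be trained.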
Pages: 1243-1248
Number of pages: 6