Arabic Sentiment Analysis Based on Word Embeddings and Deep Learning

Cited by: 6
Authors
Elhassan, Nasrin [1]
Varone, Giuseppe [2]
Ahmed, Rami [1]
Gogate, Mandar [3]
Dashtipour, Kia [3]
Almoamari, Hani [4]
El-Affendi, Mohammed A. [5]
Al-Tamimi, Bassam Naji [6]
Albalwy, Faisal [7, 8]
Hussain, Amir [3]
Affiliations
[1] Sudan Univ Sci & Technol, Coll Comp Sci & Informat Technol, POB 407, Khartoum, Sudan
[2] Northeastern Univ, Dept Phys Therapy Movement & Rehabil Sci, Boston, MA 02115 USA
[3] Edinburgh Napier Univ, Sch Comp, Edinburgh EH10 5DT, Midlothian, Scotland
[4] Islamic Univ Madinah, Fac Comp & Informat Syst, Medina 42351, Saudi Arabia
[5] Prince Sultan Univ, Coll Comp & Informat Sci, Dept Comp Sci, Riyadh 12435, Saudi Arabia
[6] Birmingham City Univ, Sch Comp & Digital Technol, Birmingham B4 7XG, W Midlands, England
[7] Taibah Univ, Coll Comp Sci & Engn, Dept Comp Sci, Madinah 42353, Saudi Arabia
[8] Univ Manchester, Fac Biol Med & Hlth, Sch Hlth Sci, Div Informat Imaging & Data Sci, Stopford Bldg,Oxford Rd, Manchester M13 9PL, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Arabic Sentiment Analysis; Word2Vec; FastText; convolutional neural networks; long short-term memory; recurrent neural networks; HYBRID CNN-LSTM; MODEL;
DOI
10.3390/computers12060126
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Social media networks have grown exponentially over the last two decades, giving internet users the opportunity to communicate and exchange ideas on a variety of topics. As a result, opinion mining plays a crucial role in analyzing user opinions and applying them to guide choices, making it one of the most popular research areas in natural language processing. Although several languages, including English, have been studied extensively, comparatively little work has addressed Arabic. The morphological complexity and the many dialects of the language make semantic analysis particularly challenging. Moreover, the lack of accurate pre-processing tools and the limited resources available are constraining factors. This study was motivated by the success of deep learning algorithms and word embeddings in English sentiment analysis. Extensive supervised machine learning experiments were conducted in which word embeddings were used to determine the sentiment of Arabic reviews. Three deep learning models were used: convolutional neural networks (CNNs), long short-term memory (LSTM), and a hybrid CNN-LSTM. The models used features learned by word embeddings such as Word2Vec and fastText rather than hand-crafted features. The models were tested on two benchmark Arabic datasets with different setups: the Hotel Arabic Reviews Dataset (HARD) for hotel reviews and Large-Scale Arabic Book Reviews (LABR) for book reviews. Comparative experiments combined the three models with the two word embeddings and the different dataset setups. The main novelty of this study is its exploration of how effective various word embeddings are across different setups of the benchmark datasets: balanced versus imbalanced, and binary versus multi-class classification.
Findings showed that, in most cases, the best results were obtained with the fastText word embedding on the HARD 2-imbalance dataset for all three proposed models: CNN, LSTM, and CNN-LSTM. Further, the proposed CNN model outperformed the LSTM and CNN-LSTM models on the benchmark HARD dataset: with fastText, the three models achieved 94.69%, 94.63%, and 94.54% accuracy, respectively. Although the worst results were obtained for the LABR 3-imbalance dataset using both Word2Vec and fastText, they still exceeded the state-of-the-art results reported by other researchers on the same dataset.
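The dataset setups mentioned in the abstract (balanced or imbalanced, binary or multi-class) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the mapping from 1-5 star ratings to labels, the choice to drop 3-star reviews in the binary setup, balancing by random downsampling, and the hypothetical names `to_label` and `build_setup`.

```python
import random


def to_label(rating, scheme):
    """Map a 1-5 star rating to a sentiment label (assumed mapping)."""
    if scheme not in ("binary", "three_class"):
        raise ValueError("scheme must be 'binary' or 'three_class'")
    if rating == 3:
        # Assumption: 3 stars is neutral; the binary setup drops it.
        return "neutral" if scheme == "three_class" else None
    return "positive" if rating > 3 else "negative"


def build_setup(reviews, scheme, balanced, seed=0):
    """Return (text, label) pairs for one dataset setup.

    reviews  -- iterable of (text, rating) pairs
    balanced -- if True, downsample every class to the smallest class size
    """
    labeled = [(text, to_label(rating, scheme)) for text, rating in reviews]
    labeled = [(text, label) for text, label in labeled if label is not None]
    if not balanced or not labeled:
        return labeled

    # Group examples by class, then sample the same number from each class.
    by_class = {}
    for text, label in labeled:
        by_class.setdefault(label, []).append((text, label))
    smallest = min(len(items) for items in by_class.values())
    rng = random.Random(seed)
    result = []
    for items in by_class.values():
        result.extend(rng.sample(items, smallest))
    return result


# Toy data standing in for real reviews (hypothetical examples).
toy = [("r1", 5), ("r2", 5), ("r3", 4), ("r4", 3),
       ("r5", 1), ("r6", 2), ("r7", 5)]
two_imbalanced = build_setup(toy, "binary", balanced=False)
three_balanced = build_setup(toy, "three_class", balanced=True)
```

Under these assumptions, a "2-imbalance" setup keeps all positive and negative reviews at their natural proportions, while a balanced setup trades data volume for equal class sizes.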
Pages: 21