Sentiment lexicon for sentiment analysis of Saudi dialect tweets

Cited by: 22
Authors
Al-Thubaity, Abdulmohsen [1]
Alqahtani, Qubayl [2]
Aljandal, Abdulaziz [2]
Affiliations
[1] King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia
[2] King Saud University, AlMuzahmiyah Branch, Riyadh, Saudi Arabia
Keywords
Arabic sentiment analysis; Arabic sentiment lexicon; Arabic text mining; Arabic language resources; Saudi dialect; extraction; reputation
DOI
10.1016/j.procs.2018.10.494
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Twitter is one of the most widely used social media platforms in Saudi Arabia and is a rich source for mining the public's attitude towards political, social, and economic matters. Sentiment analysis is a technique for identifying the polarity (positive, negative, or neutral) of a given tweet, using either machine learning approaches or sentiment lexicons. This paper presents two resources. The first is the Saudi dialect sentiment lexicon (SauDiSenti), a lexicon for sentiment analysis of Saudi dialect tweets. SauDiSenti comprises 4431 words and phrases from Modern Standard Arabic (MSA) and Saudi dialects, manually extracted from a previously labelled dataset of tweets obtained from trending hashtags in Saudi Arabia. The second is a testing dataset comprising 1500 tweets evenly distributed over three classes: positive, negative, and neutral. To evaluate the performance of SauDiSenti, we used precision, recall, and F-measure and compared it to AraSenTi, a larger Arabic sentiment dictionary. The data suggest that AraSenTi outperforms SauDiSenti only when both positive and negative tweets are considered, whereas SauDiSenti outperforms AraSenTi when positive, negative, and neutral tweets are considered. Despite the small size of SauDiSenti, its use for sentiment analysis of Saudi dialect tweets shows promising results in comparison to the larger, automatically constructed dictionary AraSenTi. SauDiSenti and the testing dataset are available for download at http://corpus.kacstedu.sa/more_info.jsp. © 2018 The Authors. Published by Elsevier B.V.
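For readers unfamiliar with the lexicon-based approach and the evaluation measures named in the abstract, the following is a minimal sketch, not the authors' implementation. The toy English lexicon, the tie-breaking rule (ties and tweets with no lexicon hits are labelled neutral), and the function names `classify` and `precision_recall_f` are all assumptions made for illustration. SauDiSenti itself contains Arabic words and multi-word phrases, which this single-token sketch does not handle.

```python
from collections import Counter

# Hypothetical toy lexicon: entries and polarities are illustrative only
# and are NOT taken from SauDiSenti or AraSenTi.
LEXICON = {
    "great": "positive",
    "love": "positive",
    "bad": "negative",
    "terrible": "negative",
}


def classify(tweet, lexicon=LEXICON):
    """Label a tweet by counting lexicon hits (assumed rule: ties are neutral)."""
    hits = Counter(lexicon[tok] for tok in tweet.lower().split() if tok in lexicon)
    pos, neg = hits["positive"], hits["negative"]
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def precision_recall_f(gold, predicted, label):
    """Per-class precision, recall, and F-measure (harmonic mean of the two)."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Made-up example tweets with made-up gold labels.
tweets = [
    "I love this great service",
    "terrible and bad experience",
    "the meeting is at noon",
    "love it but bad timing",
]
gold = ["positive", "negative", "neutral", "positive"]
predicted = [classify(t) for t in tweets]

for label in ("positive", "negative", "neutral"):
    p, r, f = precision_recall_f(gold, predicted, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

The last example tweet shows one way a small lexicon can mislabel mixed-polarity text: one positive and one negative hit cancel out, so a tweet whose gold label is positive is predicted as neutral, which lowers recall for the positive class.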
Pages: 301-307
Page count: 7