AraSenTi-Tweet: A Corpus for Arabic Sentiment Analysis of Saudi Tweets

Cited by: 81
Authors
Al-Twairesh, Nora [1]
Al-Khalifa, Hend [1]
Al-Salman, AbdulMalik [1]
Al-Ohali, Yousef [1]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, Riyadh, Saudi Arabia
Keywords
Sentiment Analysis; Arabic NLP; Corpus Sentiment Annotation; Arabic tweets; Saudi Dialect; RESOURCES;
DOI
10.1016/j.procs.2017.10.094
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
Arabic Sentiment Analysis is an active research area these days. However, the Arabic language still lacks sufficient language resources to enable the tasks of sentiment analysis. In this paper, we present the details of collecting and constructing a large dataset of Arabic tweets. The techniques used in cleaning and pre-processing the collected dataset are explained. A corpus of Arabic tweets annotated for sentiment analysis was extracted from this dataset. The corpus consists mainly of tweets written in Modern Standard Arabic and the Saudi dialect. The corpus was manually annotated for sentiment. The annotation process is explained in detail and the challenges during the annotation are highlighted. The corpus contains 17,573 tweets labelled with four labels for sentiment: positive, negative, neutral and mixed. Baseline experiments were conducted to provide benchmark results for future work. © 2017 The Authors. Published by Elsevier B.V.
Pages: 63-72
Page count: 10
Related papers
50 in total
  • [1] AraCust: a Saudi Telecom Tweets corpus for sentiment analysis
    Almuqren, Latifah
    Cristea, Alexandra
    [J]. PeerJ Computer Science, 2021, 7 : 1 - 30
  • [2] AraCust: a Saudi Telecom Tweets corpus for sentiment analysis
    Almuqren, Latifah
    Cristea, Alexandra
    [J]. PEERJ COMPUTER SCIENCE, 2021,
  • [3] Sentiment Analysis in Arabic Tweets
    Duwairi, R. M.
    Marji, Raed
    Sha'ban, Narmeen
    Rushaidat, Sally
    [J]. 2014 5TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION SYSTEMS (ICICS), 2014,
  • [4] Sentiment Analysis of Arabic Tweets in Smart Cities: A Review of Saudi Dialect
    Alotaibi, Shoayee
    Mehmood, Rashid
    Katib, Iyad
    [J]. 2019 FOURTH INTERNATIONAL CONFERENCE ON FOG AND MOBILE EDGE COMPUTING (FMEC), 2019, : 330 - 335
  • [5] Clustering Arabic Tweets for Sentiment Analysis
    Abuaiadah, Diab
    Rajendran, Dileep
    Jarrar, Mustafa
    [J]. 2017 IEEE/ACS 14TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2017, : 449 - 456
  • [6] Annotation of a Corpus of Tweets for Sentiment Analysis
    dos Santos, Allisfrank
    Barros Junior, Jorge Daniel
    Camargo, Heloisa de Arruda
    [J]. COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROPOR 2018, 2018, 11122 : 294 - 302
  • [7] Cyberbullying Detection by Sentiment Analysis of Tweets' Contents Written in Arabic in Saudi Arabia Society
    Almutairi, Amjad Rasmi
    Al-Hagery, Muhammad Abdullah
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2021, 21 (03) : 112 - 119
  • [8] Sentiment lexicon for sentiment analysis of Saudi dialect tweets
    Al-Thubaity, Abdulmohsen
    Alqahtani, Qubayl
    Aljandal, Abdulaziz
    [J]. ARABIC COMPUTATIONAL LINGUISTICS, 2018, 142 : 301 - 307
  • [9] Arabic tweets sentiment analysis - a hybrid scheme
    Aldayel, Haifa K.
    Azmi, Aqil M.
    [J]. JOURNAL OF INFORMATION SCIENCE, 2016, 42 (06) : 782 - 797
  • [10] Sentiment Analysis of Arabic Jordanian Dialect Tweets
    Atoum, Jalal Omer
    Nouman, Mais
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2019, 10 (02) : 256 - 262