Clustering Arabic Tweets for Sentiment Analysis

Cited by: 12
Authors
Abuaiadah, Diab [1]
Rajendran, Dileep [1]
Jarrar, Mustafa [2]
Affiliations
[1] Waikato Inst Technol, Ctr Business Informat Technol & Enterprise, Hamilton, New Zealand
[2] Birzeit Univ, Dept Comp Sci, Birzeit, Palestine
DOI
10.1109/AICCSA.2017.162
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This study evaluates the impact of linguistic preprocessing and similarity functions on clustering Arabic tweets. The experiments apply an optimized version of the standard K-Means algorithm to assign tweets to positive and negative categories. The results show that root-based stemming has a significant advantage over light stemming in all settings. The Averaged Kullback-Leibler Divergence similarity function clearly outperforms the Cosine, Pearson Correlation, Jaccard Coefficient and Euclidean functions. The combination of the Averaged Kullback-Leibler Divergence and root-based stemming achieved the highest purity, 0.764, while the second-best purity was 0.719. These results are important because they run contrary to findings for normal-sized documents, where, in many information retrieval applications, light stemming performs better than root-based stemming and the Cosine function is commonly used.
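As a rough illustration of two quantities named in the abstract, the Python sketch below computes cluster purity and one weighted-average formulation of the Averaged Kullback-Leibler Divergence between two term distributions. The function names, the toy data, and the choice of formulation are assumptions made for illustration; this record does not give the authors' exact definitions or their optimized K-Means variant.

```python
import math
from collections import Counter


def avg_kl_divergence(p, q):
    """Averaged KL divergence between two equal-length term distributions.

    Assumed formulation (not confirmed by the abstract): for each term with
    weights a and b, use pi1 = a/(a+b), pi2 = b/(a+b), m = pi1*a + pi2*b,
    and sum pi1 * a*log(a/m) + pi2 * b*log(b/m) over all terms.
    """
    total = 0.0
    for a, b in zip(p, q):
        s = a + b
        if s == 0:
            continue  # term absent from both documents
        pi1, pi2 = a / s, b / s
        m = pi1 * a + pi2 * b
        if a > 0:
            total += pi1 * a * math.log(a / m)
        if b > 0:
            total += pi2 * b * math.log(b / m)
    return total


def purity(cluster_ids, true_labels):
    """Fraction of items that carry the majority label of their cluster."""
    by_cluster = {}
    for c, y in zip(cluster_ids, true_labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(true_labels)


# Toy data (hypothetical): five tweets assigned to two clusters.
clusters = [0, 0, 1, 1, 1]
labels = ["pos", "pos", "neg", "neg", "pos"]
toy_purity = purity(clusters, labels)

# Two hypothetical term distributions over a two-word vocabulary.
p = [0.5, 0.5]
q = [0.9, 0.1]
toy_divergence = avg_kl_divergence(p, q)
```

Under this formulation the measure is a dissimilarity: it is zero for identical distributions and symmetric in its arguments, so a K-Means variant would assign each tweet to the centroid with the smallest value. Purity, by contrast, is higher when clusters align better with the positive and negative labels.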
Pages: 449-456
Page count: 8
Related papers
50 in total
  • [1] Sentiment Analysis in Arabic Tweets
    Duwairi, R. M.
    Marji, Raed
    Sha'ban, Narmeen
    Rushaidat, Sally
    [J]. 2014 5TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION SYSTEMS (ICICS), 2014,
  • [2] Arabic tweets sentiment analysis - a hybrid scheme
    Aldayel, Haifa K.
    Azmi, Aqil M.
    [J]. JOURNAL OF INFORMATION SCIENCE, 2016, 42 (06) : 782 - 797
  • [3] Sentiment Analysis of Arabic Jordanian Dialect Tweets
    Atoum, Jalal Omer
    Nouman, Mais
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2019, 10 (02) : 256 - 262
  • [4] Sentiment Analysis of Modern Standard Arabic and Egyptian Dialectal Arabic Tweets
    El-Naggar, Nadine
    El-Sonbaty, Yasser
    Abou El-Nasr, Mohamad
    [J]. 2017 COMPUTING CONFERENCE, 2017, : 880 - 887
  • [5] Sentiment Analysis of Arabic Tweets: Opinion Target Extraction
    Salima, Behdenna
    Fatiha, Barigou
    Ghalem, Belalem
    [J]. MODELLING AND IMPLEMENTATION OF COMPLEX SYSTEMS, 2019, 64 : 158 - 167
  • [6] Sentiment Analysis of Arabic Tweets using Deep Learning
    Heikal, Maha
    Torki, Marwan
    El-Makky, Nagwa
    [J]. ARABIC COMPUTATIONAL LINGUISTICS, 2018, 142 : 114 - 122
  • [7] Sentiment Analysis on Arabic Tweets: Challenges to Dissecting the Language
    Abdullah, Malak
    Hadzikadic, Mirsad
    [J]. SOCIAL COMPUTING AND SOCIAL MEDIA: APPLICATIONS AND ANALYTICS, SCSM 2017, PT II, 2017, 10283 : 191 - 202
  • [8] Detecting Epidemic Diseases Using Sentiment Analysis of Arabic Tweets
    Baker, Qanita Bani
    Shatnawi, Farah
    Rawashdeh, Saif
    Al-Smadi, Mohammad
    Jararweh, Yaser
    [J]. JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2020, 26 (01) : 50 - 70
  • [9] Sentiment analysis of Arabic tweets using text mining techniques
    Al-Horaibi, Lamia
    Khan, Muhammad Badruddin
    [J]. FIRST INTERNATIONAL WORKSHOP ON PATTERN RECOGNITION, 2016, 0011
  • [10] Sentiment Analysis Model for Fake News Identification in Arabic Tweets
    Sawan, Aktham
    Thaher, Thaer
    Abu-el-rub, Noor
    [J]. 2021 IEEE 15TH INTERNATIONAL CONFERENCE ON APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES (AICT2021), 2021,