Clustering Arabic Tweets for Sentiment Analysis

Cited by: 12
Authors
Abuaiadah, Diab [1]
Rajendran, Dileep [1]
Jarrar, Mustafa [2]
Affiliations
[1] Waikato Inst Technol, Ctr Business Informat Technol & Enterprise, Hamilton, New Zealand
[2] Birzeit Univ, Dept Comp Sci, Birzeit, Palestine
DOI
10.1109/AICCSA.2017.162
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
This study evaluates the impact of linguistic preprocessing and similarity functions on clustering Arabic tweets. The experiments apply an optimized version of the standard K-Means algorithm to assign tweets to positive and negative categories. The results show that root-based stemming has a significant advantage over light stemming in all settings. The Averaged Kullback-Leibler Divergence similarity function clearly outperforms the Cosine, Pearson Correlation, Jaccard Coefficient and Euclidean functions. The combination of the Averaged Kullback-Leibler Divergence and root-based stemming achieved the highest purity, 0.764, while the second-best purity was 0.719. These results are important because they contrast with findings for normal-sized documents, where, in many information retrieval applications, light stemming performs better than root-based stemming and the Cosine function is commonly used.
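The two quantities named in the abstract, the Averaged Kullback-Leibler Divergence and cluster purity, can be illustrated with a minimal Python sketch. The abstract does not state the exact formulas used, so the averaged-KL definition below (a weighted, symmetric form often seen in the document-clustering literature) is an assumption, and the function names `averaged_kl` and `purity` are hypothetical. Purity follows its usual definition: the fraction of items that belong to the majority class of their cluster.

```python
import math
from collections import Counter


def averaged_kl(p, q):
    """Averaged KL divergence between two term-weight vectors.

    ASSUMPTION: this weighted, symmetric form is one common formulation;
    the source abstract does not confirm it is the one used in the paper.
    """
    total = 0.0
    for a, b in zip(p, q):
        if a + b == 0:
            continue  # term absent from both vectors contributes nothing
        pi1 = a / (a + b)
        pi2 = b / (a + b)
        m = pi1 * a + pi2 * b  # weighted average of the two term weights
        if a > 0:
            total += pi1 * a * math.log(a / m)
        if b > 0:
            total += pi2 * b * math.log(b / m)
    return total


def purity(cluster_ids, true_labels):
    """Fraction of items that fall in the majority class of their cluster."""
    by_cluster = {}
    for c, y in zip(cluster_ids, true_labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(true_labels)


# Toy data, invented for illustration only (not from the paper).
doc_a = [0.25, 0.75]
doc_b = [0.75, 0.25]
divergence = averaged_kl(doc_a, doc_b)

clusters = [0, 0, 0, 1, 1, 1]
labels = ["pos", "pos", "neg", "neg", "neg", "pos"]
score = purity(clusters, labels)
```

In the toy example, each cluster holds two items of its majority class and one of the other, so the purity is 4/6. A K-Means variant that uses a divergence like this would assign each tweet to the centroid with the smallest divergence rather than the smallest Euclidean distance.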
Pages: 449-456
Number of pages: 8