A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis

Cited by: 134
Authors
Rustam, Furqan [1]
Khalid, Madiha [1]
Aslam, Waqar [2]
Rupapara, Vaibhav [3]
Mehmood, Arif [2]
Choi, Gyu Sang [4]
Affiliations
[1] Khwaja Fareed Univ Engn & Informat Technol, Dept Comp Sci, Rahim Yar Khan, Pakistan
[2] Islamia Univ Bahawalpur, Dept Comp Sci & Informat Technol, Bahawalpur, Punjab, Pakistan
[3] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
[4] Yeungnam Univ, Dept Informat & Commun Engn, Gyongsan, Gyeongbuk, South Korea
Source
PLOS ONE | 2021 / Vol. 16 / Issue 02
Funding
National Research Foundation of Singapore;
Keywords
CLASSIFICATION;
DOI
10.1371/journal.pone.0245909
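Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes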
07 ; 0710 ; 09 ;
Abstract
The spread of Covid-19 has raised health concerns worldwide, and social media is increasingly used to share news and opinions about it. A realistic assessment of the situation is necessary to use resources optimally and appropriately. In this research, we perform sentiment analysis of Covid-19 tweets using a supervised machine learning approach. Identifying Covid-19 sentiments from tweets would allow informed decisions for better handling of the current pandemic situation. The dataset is extracted from Twitter using tweet IDs provided by the IEEE data port. Tweets are extracted by an in-house crawler that uses the Tweepy library. The dataset is cleaned using preprocessing techniques, and sentiments are extracted using the TextBlob library. The contribution of this work is the performance evaluation of various machine learning classifiers using our proposed feature set, which is formed by concatenating bag-of-words and term frequency-inverse document frequency features. Tweets are classified as positive, neutral, or negative. Classifier performance is evaluated on accuracy, precision, recall, and F1 score. For completeness, the dataset is further investigated using the Long Short-Term Memory (LSTM) deep learning architecture. The results show that the Extra Trees Classifier outperforms all other models, achieving an accuracy score of 0.93 with our proposed concatenated feature set. The LSTM achieves lower accuracy than the machine learning classifiers. To demonstrate the effectiveness of our proposed feature set, the results are compared with the Vader sentiment analysis technique based on the GloVe feature extraction approach.
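The concatenated feature set described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn's `CountVectorizer` and `TfidfVectorizer` as stand-ins for the bag-of-words and TF-IDF extractors and `scipy.sparse.hstack` for the concatenation; the abstract does not name these tools. The toy tweets, their labels, the hyperparameters, and the polarity thresholds in the hypothetical `polarity_to_label` helper are illustrative assumptions, not values from the paper.

```python
from scipy.sparse import hstack
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Toy tweets and labels, invented for illustration (not the paper's dataset).
tweets = [
    "vaccines bring hope and relief",
    "lockdown is awful and depressing",
    "new case numbers were reported today",
    "so grateful for the health workers",
    "terrible news about the outbreak",
    "the ministry released an update",
]
labels = ["positive", "negative", "neutral", "positive", "negative", "neutral"]


def polarity_to_label(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to a class.

    Hypothetical helper: the zero thresholds are an assumption; the abstract
    only states that TextBlob is used to extract sentiments.
    """
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


bow = CountVectorizer()      # bag-of-words term counts
tfidf = TfidfVectorizer()    # TF-IDF weights


def build_features(texts, fit=False):
    """Concatenate BoW and TF-IDF matrices column-wise into one feature set."""
    if fit:
        parts = [bow.fit_transform(texts), tfidf.fit_transform(texts)]
    else:
        parts = [bow.transform(texts), tfidf.transform(texts)]
    return hstack(parts).tocsr()


X = build_features(tweets, fit=True)

# Extra Trees is the best-performing classifier named in the abstract;
# n_estimators and random_state here are arbitrary choices.
clf = ExtraTreesClassifier(n_estimators=100, random_state=0)
clf.fit(X, labels)

pred = clf.predict(build_features(["awful outbreak news today"]))
print(X.shape, pred[0])
```

Because both vectorizers use the same default tokenizer here, the concatenated matrix has twice as many columns as there are vocabulary terms: one block of raw counts and one block of TF-IDF weights per term.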
Pages: 23
Related papers
50 in total
  • [31] A Deep Learning Approach for Sentiment Classification of COVID-19 Vaccination Tweets
    Said, Haidi
    Tawfik, BenBella S.
    Makhlouf, Mohamed A.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (04) : 530 - 538
  • [32] Multiclass sentiment analysis on COVID-19-related tweets using deep learning models
    Vernikou, Sotiria
    Lyras, Athanasios
    Kanavos, Andreas
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (22) : 19615 - 19627
  • [33] Sentiment analysis of epidemiological surveillance reports on COVID-19 in Greece using machine learning models
    Stefanis, Christos
    Giorgi, Elpida
    Kalentzis, Konstantinos
    Tselemponis, Athanasios
    Nena, Evangelia
    Tsigalou, Christina
    Kontogiorgis, Christos
    Kourkoutas, Yiannis
    Chatzaki, Ekaterini
    Dokas, Ioannis
    Constantinidis, Theodoros
    Bezirtzoglou, Eugenia
    FRONTIERS IN PUBLIC HEALTH, 2023, 11
  • [34] Sentiment Analysis of Tweets Using Supervised Learning Algorithms
    Mehta, Raj P.
    Sanghvi, Meet A.
    Shah, Darshin K.
    Singh, Artika
    FIRST INTERNATIONAL CONFERENCE ON SUSTAINABLE TECHNOLOGIES FOR COMPUTATIONAL INTELLIGENCE, 2020, 1045 : 323 - 338
  • [35] Performance Analysis of Supervised Machine Learning Techniques for Sentiment Analysis
    Samal, Biswa Ranjan
    Behera, Anil Kumar
    Panda, Mrutyunjaya
    2017 IEEE 3RD INTERNATIONAL CONFERENCE ON SENSING, SIGNAL PROCESSING AND SECURITY (ICSSS), 2017 : 128 - 133
  • [36] Sentiment Analysis of COVID-19 Tweets Using Deep Learning and Lexicon-Based Approaches
    Ainapure, Bharati Sanjay
    Pise, Reshma Nitin
    Reddy, Prathiba
    Appasani, Bhargav
    Srinivasulu, Avireni
    Khan, Mohammad S. S.
    Bizon, Nicu
    SUSTAINABILITY, 2023, 15 (03)
  • [37] Arabic Tweets Sentiment Analysis about Online Learning during COVID-19 in Saudi Arabia
    Althagafi, Asma
    Althobaiti, Ghofran
    Alhakami, Hosam
    Alsubait, Tahani
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (03) : 620 - 625
  • [38] A novel fusion-based deep learning model for sentiment analysis of COVID-19 tweets
    Basiri, Mohammad Ehsan
    Nemati, Shahla
    Abdar, Moloud
    Asadi, Somayeh
    Acharya, U. Rajendra
    KNOWLEDGE-BASED SYSTEMS, 2021, 228
  • [39] Sentiment analysis of tweets about COVID-19 disease during pandemic
    Matosevic, Goran
    Bevanda, Vanja
    2020 43RD INTERNATIONAL CONVENTION ON INFORMATION, COMMUNICATION AND ELECTRONIC TECHNOLOGY (MIPRO 2020), 2020 : 1290 - 1295
  • [40] Analysis and Prediction of User Sentiment on COVID-19 Pandemic Using Tweets
    Yeasmin, Nilufa
    Mahbub, Nosin Ibna
    Baowaly, Mrinal Kanti
    Singh, Bikash Chandra
    Alom, Zulfikar
    Aung, Zeyar
    Azim, Mohammad Abdul
    BIG DATA AND COGNITIVE COMPUTING, 2022, 6 (02)