A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis

Cited by: 134
Authors
Rustam, Furqan [1]
Khalid, Madiha [1]
Aslam, Waqar [2]
Rupapara, Vaibhav [3]
Mehmood, Arif [2]
Choi, Gyu Sang [4]
Affiliations
[1] Khwaja Fareed Univ Engn & Informat Technol, Dept Comp Sci, Rahim Yar Khan, Pakistan
[2] Islamia Univ Bahawalpur, Dept Comp Sci & Informat Technol, Bahawalpur, Punjab, Pakistan
[3] Florida Int Univ, Sch Comp & Informat Sci, Miami, FL 33199 USA
[4] Yeungnam Univ, Dept Informat & Commun Engn, Gyongsan, Gyeongbuk, South Korea
Source
PLOS ONE | 2021 / Vol. 16 / Issue 02
Funding
National Research Foundation of Singapore;
Keywords
CLASSIFICATION;
DOI
10.1371/journal.pone.0245909
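Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract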
The spread of Covid-19 has raised health concerns worldwide, and social media is increasingly used to share news and opinions about it. A realistic assessment of the situation is necessary to use resources optimally and appropriately. In this research, we perform sentiment analysis of Covid-19 tweets using a supervised machine learning approach. Identifying Covid-19 sentiments from tweets would allow informed decisions for better handling of the current pandemic situation. The dataset is extracted from Twitter using the tweet IDs provided by IEEE DataPort. Tweets are extracted by an in-house crawler that uses the Tweepy library. The dataset is cleaned using preprocessing techniques, and sentiments are extracted using the TextBlob library. The contribution of this work is the performance evaluation of various machine learning classifiers using our proposed feature set, which is formed by concatenating bag-of-words and term frequency-inverse document frequency (TF-IDF) features. Tweets are classified as positive, neutral, or negative. Classifier performance is evaluated in terms of accuracy, precision, recall, and F1 score. For completeness, the dataset is further investigated using the Long Short-Term Memory (LSTM) deep learning architecture. The results show that the Extra Trees Classifier outperforms all other models, achieving an accuracy score of 0.93 with our proposed concatenated feature set. The LSTM achieves lower accuracy than the machine learning classifiers. To demonstrate the effectiveness of our proposed feature set, the results are compared with the Vader sentiment analysis technique based on the GloVe feature extraction approach.
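The concatenated feature set described in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, all function names, and the three example tweets are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase/whitespace tokenizer (an assumption; the paper's
    # actual preprocessing steps are not detailed in the abstract).
    return text.lower().split()


def build_vocab(docs):
    # Sorted list of every distinct token in the corpus.
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def bow_vector(doc, vocab):
    # Bag-of-words: raw count of each vocabulary term in the document.
    counts = Counter(tokenize(doc))
    return [float(counts[term]) for term in vocab]


def idf_weights(docs, vocab):
    # Smoothed IDF, ln((1 + n) / (1 + df)) + 1. This variant is an
    # assumption; the abstract does not say which formula was used.
    n = len(docs)
    token_sets = [set(tokenize(doc)) for doc in docs]
    weights = []
    for term in vocab:
        df = sum(term in tokens for tokens in token_sets)
        weights.append(math.log((1 + n) / (1 + df)) + 1.0)
    return weights


def tfidf_vector(doc, vocab, idf):
    # TF-IDF: term count scaled by the term's IDF weight.
    return [tf * w for tf, w in zip(bow_vector(doc, vocab), idf)]


def concatenated_features(doc, vocab, idf):
    # Concatenation: the BoW vector followed by the TF-IDF vector,
    # giving 2 * len(vocab) features per document.
    return bow_vector(doc, vocab) + tfidf_vector(doc, vocab, idf)


# Hypothetical example tweets, not taken from the paper's dataset.
tweets = ["stay home stay safe", "vaccine news is good", "lockdown is hard"]
vocab = build_vocab(tweets)
idf = idf_weights(tweets, vocab)
features = [concatenated_features(t, vocab, idf) for t in tweets]
```

Each row of `features` is twice the vocabulary length: the first half holds raw counts and the second half holds the IDF-weighted counts. Rows like these could then be passed to any classifier, such as a tree ensemble.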
Pages: 23