A Machine Learning-Sentiment Analysis on Monkeypox Outbreak: An Extensive Dataset to Show the Polarity of Public Opinion From Twitter Tweets

Cited by: 22
Authors
Bengesi, Staphord [1]
Oladunni, Timothy [2]
Olusegun, Ruth [1]
Audu, Halima [1]
Affiliations
[1] Bowie State Univ, Dept Comp Sci, Bowie, MD 20715 USA
[2] Morgan State Univ, Dept Comp Sci, Baltimore, MD 21251 USA
Keywords
Social networking (online); Sentiment analysis; Blogs; Classification algorithms; Computational modeling; Machine learning; Count vectorizer; machine learning algorithm; monkeypox; sentiment analysis; twitter; TF-IDF; TextBlob; Vader;
DOI
10.1109/ACCESS.2023.3242290
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Sentiment analysis has proven useful in public health research, particularly in the study of infectious diseases. As the world recovers from the onslaught of the COVID-19 pandemic, concerns are rising that another pandemic, monkeypox, might hit the world. Monkeypox is an infectious disease that has been reported in over 73 countries across the globe. This sudden outbreak has become a major concern for many individuals and health authorities. Social media channels carry discussions, views, opinions, and emotions about the monkeypox outbreak. Social media sentiments often result in panic, misinformation, and stigmatization of some minority groups. Therefore, accurate information, guidelines, and health protocols related to this virus are critical. We aim to analyze public sentiment on the recent monkeypox outbreak, with the purpose of helping decision-makers gain a better understanding of public perceptions of the disease. We hope that government and health authorities will find the work useful in crafting health policies and mitigation strategies to control the spread of the disease, and in guarding against its misrepresentation. Our study was conducted in two stages. In the first stage, we collected over 500,000 multilingual tweets related to monkeypox posted on Twitter and performed sentiment analysis on them using VADER and TextBlob, annotating the extracted tweets as positive, negative, or neutral. The second stage involved the design, development, and evaluation of 56 classification models. Stemming and lemmatization were used for vocabulary normalization. Vectorization was based on CountVectorizer and TF-IDF. K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest, Logistic Regression, Multilayer Perceptron (MLP), Naive Bayes, and XGBoost were deployed as learning algorithms. Performance was evaluated using accuracy, F1 score, precision, and recall.
Our experimental results showed that the model developed using TextBlob annotation + Lemmatization + CountVectorizer + SVM yielded the highest accuracy of about 0.9348.
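The figure of 56 classification models is consistent with a full grid over the components named in the abstract: 2 annotation tools x 2 normalization techniques x 2 vectorizers x 7 learning algorithms = 56. The sketch below only enumerates that grid. It assumes the models were formed as a full cross-product, which the abstract implies but does not state outright. The list names and the build_grid helper are hypothetical and used for illustration only.

```python
from itertools import product

# Component names come from the abstract. Treating them as a full
# cross-product is an assumption, not something the abstract confirms.
ANNOTATORS = ["VADER", "TextBlob"]
NORMALIZERS = ["Stemming", "Lemmatization"]
VECTORIZERS = ["CountVectorizer", "TF-IDF"]
CLASSIFIERS = [
    "KNN",
    "SVM",
    "Random Forest",
    "Logistic Regression",
    "MLP",
    "Naive Bayes",
    "XGBoost",
]


def build_grid():
    """Return one label per (annotator, normalizer, vectorizer, classifier) combination."""
    combos = product(ANNOTATORS, NORMALIZERS, VECTORIZERS, CLASSIFIERS)
    return [" + ".join(combo) for combo in combos]


grid = build_grid()
print(len(grid))  # 2 * 2 * 2 * 7 combinations
```

Under this reading, the best-performing configuration reported above, TextBlob + Lemmatization + CountVectorizer + SVM, is one of the enumerated entries.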
Pages: 11811-11826
Number of pages: 16