Mining Social Media Data to Predict COVID-19 Case Counts

Authors
Kazijevs, Maksims [1]
Akyelken, Furkan A. [1]
Samad, Manar D. [1]
Affiliations
[1] Tennessee State Univ, Dept Comp Sci, Nashville, TN 37203 USA
Funding
National Institutes of Health (USA)
Keywords
pandemic prediction; social media; Twitter; LSTM; natural language processing
DOI
10.1109/ICHI54592.2022.00027
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The unpredictability and unknowns surrounding the ongoing coronavirus disease (COVID-19) pandemic have had unprecedented consequences, taking a heavy toll on the lives and economies of all countries. There have been efforts to predict COVID-19 case counts (CCC) using epidemiological data and numerical tokens online, which may allow early preventive measures to slow the spread of the disease. In this paper, we use state-of-the-art natural language processing (NLP) algorithms to numerically encode COVID-19-related tweets originating from eight cities in the United States and predict city-specific CCC up to eight days in the future. A city-embedding is proposed to obtain a time series representation of the daily tweets posted from a city, which is then used to predict case counts with a custom long short-term memory (LSTM) model. The universal sentence encoder yields the best normalized root mean squared error (NRMSE) of 0.090 (0.039), averaged across all cities, in predicting CCC six days in the future. The R² scores in predicting CCC are above 0.70 and often above 0.8, which suggests a strong correlation between the actual and model-predicted CCC values. Our analyses show that the NRMSE and R² scores are consistently robust across different cities and different numbers of time steps in the time series data. The results show that the LSTM model can learn the mapping between NLP-encoded tweet semantics and case counts, which implies that social media text can be mined directly to identify the future course of the pandemic.
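The data flow described in the abstract (daily tweets to one vector per city per day, a window of days to a future case count, then NRMSE and R² for evaluation) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the city-embedding is computed or how the RMSE is normalized, so averaging the tweet vectors and dividing by the range of the actual values are assumptions made here, and all function names are hypothetical. The sentence encoder and the LSTM themselves are left out.

```python
from statistics import fmean


def city_embedding(tweet_vectors):
    """Collapse one day's tweet vectors into a single vector.

    Assumption: a plain element-wise mean; the paper may use another scheme.
    """
    dims = len(tweet_vectors[0])
    return [fmean(v[d] for v in tweet_vectors) for d in range(dims)]


def make_windows(daily_vectors, case_counts, steps, horizon):
    """Pair each run of `steps` consecutive days with the case count
    observed `horizon` days after the last day of that run."""
    X, y = [], []
    for start in range(len(daily_vectors) - steps - horizon + 1):
        end = start + steps  # exclusive end of the input window
        X.append(daily_vectors[start:end])
        y.append(case_counts[end - 1 + horizon])
    return X, y


def nrmse(actual, predicted):
    """RMSE divided by the range of the actual values (assumed normalization)."""
    mse = fmean((a - p) ** 2 for a, p in zip(actual, predicted))
    return mse ** 0.5 / (max(actual) - min(actual))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

With `steps=3` and `horizon=6`, for example, each training sample would hold three consecutive daily city vectors and the case count six days after the third day; a sequence model such as an LSTM would then be fit on the resulting `X` and `y`.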
Pages: 104-111
Number of pages: 8