Predicting New Daily COVID-19 Cases and Deaths Using Search Engine Query Data in South Korea From 2020 to 2021: Infodemiology Study

Cited by: 10
Authors
Husnayain, Atina [1]
Shim, Eunha [2]
Fuad, Anis [3]
Su, Emily Chia-Yu [1,4]
Affiliations
[1] Taipei Med Univ, Coll Med Sci & Technol, Grad Inst Biomed Informat, 172-1 Keelung Rd,Sec 2, Taipei 106, Taiwan
[2] Soongsil Univ, Dept Math, Seoul, South Korea
[3] Univ Gadjah Mada, Fac Med Publ Hlth & Nursing, Dept Biostat Epidemiol & Populat Hlth, Yogyakarta, Indonesia
[4] Taipei Med Univ Hosp, Clin Big Data Res Ctr, Taipei, Taiwan
Funding
National Research Foundation, Singapore;
Keywords
prediction; internet search; COVID-19; South Korea; infodemiology; TRENDS; POPULATION; VOLUMES;
DOI
10.2196/34178
Chinese Library Classification number
R19 [Health care organizations and services (health services administration)];
Abstract
Background: Given the ongoing COVID-19 pandemic situation, accurate predictions could greatly help in the health resource management for future waves. However, as a new entity, COVID-19's disease dynamics seemed difficult to predict. External factors, such as internet search data, need to be included in the models to increase their accuracy. However, it remains unclear whether incorporating online search volumes into models leads to better predictive performances for long-term prediction.
Objective: The aim of this study was to analyze whether search engine query data are important variables that should be included in the models predicting new daily COVID-19 cases and deaths in short- and long-term periods.
Methods: We used country-level case-related data, NAVER search volumes, and mobility data obtained from Google and Apple for the period of January 20, 2020, to July 31, 2021, in South Korea. Data were aggregated into four subsets: 3, 6, 12, and 18 months after the first case was reported. The first 80% of the data in all subsets were used as the training set, and the remaining data served as the test set. Generalized linear models (GLMs) with normal, Poisson, and negative binomial distribution were developed, along with linear regression (LR) models with lasso, adaptive lasso, and elastic net regularization. Root mean square error values were defined as a loss function and were used to assess the performance of the models. All analyses and visualizations
Results: GLMs with different types of distribution functions may have been beneficial in predicting new daily COVID-19 cases and deaths in the early stages of the outbreak. Over longer periods, as the distribution of cases and deaths became more normally distributed, LR models with regularization may have outperformed the GLMs. This study also found that models performed better when predicting new daily deaths compared to new daily cases. In addition, an evaluation of feature effects in the models showed that NAVER search volumes were useful variables in predicting new daily COVID-19 cases, particularly in the first 6 months of the outbreak. Searches related to logistical needs, particularly for "thermometer" and "mask strap," showed higher feature effects in that period. For longer prediction periods, NAVER search volumes were still found to constitute an important variable, although with a lower feature effect. This finding suggests that search term use should be considered to maintain the predictive
Conclusions: NAVER search volumes were important variables in short- and long-term prediction, with higher feature effects for predicting new daily COVID-19 cases in the first 6 months of the outbreak. Similar results were also found for death predictions.
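The evaluation protocol described in the Methods (a chronological split in which the first 80% of each subset is the training set, scored by root mean square error) can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state which software was used, the function names `chronological_split` and `rmse` are hypothetical, the daily counts are invented for illustration, and the mean-of-training baseline merely stands in for the GLM and regularized LR models.

```python
import math


def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows: the first part trains, the remainder tests."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical daily case counts, invented for illustration only.
daily_cases = [1, 2, 4, 3, 5, 8, 13, 9, 7, 6]

# No shuffling: the test set is always the most recent 20% of days.
train, test = chronological_split(daily_cases)

# Placeholder model (an assumption): predict the training mean for every test day.
baseline = sum(train) / len(train)
print("baseline RMSE:", rmse(test, [baseline] * len(test)))
```

Keeping the split chronological rather than random means every model is scored only on days that come after its training window, which matches the forecasting setting the abstract describes.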
Pages: 13