An interpretable hybrid predictive model of COVID-19 cases using autoregressive model and LSTM

Cited by: 6
Authors
Zhang, Yangyi [1]
Tang, Sui [1]
Yu, Guo [2]
Affiliations
[1] Univ Calif Santa Barbara, Dept Math, Santa Barbara, CA 93106 USA
[2] Univ Calif Santa Barbara, Dept Stat & Appl Probabil, Santa Barbara, CA 93106 USA
Keywords
ARIMA; XGBoost
DOI
10.1038/s41598-023-33685-z
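Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]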
Discipline classification code
07; 0710; 09
Abstract
The Coronavirus Disease 2019 (COVID-19) has had a profound impact on global health and the global economy, making it crucial to build accurate and interpretable data-driven predictive models for COVID-19 cases to improve public policy making. The extremely large scale of the pandemic and its intrinsically changing transmission characteristics pose a great challenge for effectively predicting COVID-19 cases. To address this challenge, we propose a novel hybrid model that combines the interpretability of the autoregressive (AR) model with the predictive power of the long short-term memory (LSTM) neural network. The proposed hybrid model is formalized as a neural network with an architecture that connects the two component model blocks, whose relative contributions are determined adaptively from the data during training. We demonstrate the favorable performance of the hybrid model over its two component models, as well as over other popular predictive models, through comprehensive numerical studies on two data sources under multiple evaluation metrics. Specifically, on county-level data from 8 California counties, our hybrid model achieves an average MAPE of 4.173%, outperforming the standalone AR (5.629%) and LSTM (4.934%) models. On country-level datasets, our hybrid model outperforms widely used predictive models such as AR, LSTM, Support Vector Machines, Gradient Boosting, and Random Forest in predicting COVID-19 cases in Japan, Canada, Brazil, Argentina, Singapore, Italy, and the United Kingdom. In addition to the predictive performance, we illustrate the interpretability of our proposed hybrid model using the estimated AR component, a key feature that is not shared by most black-box predictive models for COVID-19 cases. Our study provides a new and promising direction for building effective and interpretable data-driven models for COVID-19 cases, which could have significant implications for public health policy making and for the control of the current COVID-19 pandemic and potential future pandemics.
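The abstract reports results as MAPE and describes a model that mixes an AR block with an LSTM block. The following is a minimal sketch of those two ideas only, not the authors' implementation. The function names `mape`, `ar_predict`, and `blend` are hypothetical, and the fixed `weight` in `blend` is an assumed stand-in: in the paper the relative contribution of the two blocks is learned during training, and the exact architecture is not given in this record.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (standard definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def ar_predict(history, coeffs, intercept=0.0):
    """One-step AR(p) forecast: intercept + sum_i coeffs[i] * y[t-1-i].

    coeffs[0] multiplies the most recent observation, which is what makes
    the AR block readable: each coefficient is the weight of one lag.
    """
    p = len(coeffs)
    if len(history) < p:
        raise ValueError("history is shorter than the AR order")
    lags = history[-1:-p - 1:-1]  # the last p observations, newest first
    return intercept + sum(c * y for c, y in zip(coeffs, lags))


def blend(ar_forecast, nn_forecast, weight):
    """Convex combination of two forecasts.

    ASSUMPTION: a fixed scalar weight in [0, 1] replaces the learned,
    data-adaptive mixing described in the abstract.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * ar_forecast + (1.0 - weight) * nn_forecast
```

For example, `blend(ar_predict(cases, coeffs), lstm_forecast, 0.5)` would average the two forecasts, where `cases`, `coeffs`, and `lstm_forecast` are placeholders for a case-count series, fitted AR coefficients, and the output of a separately trained LSTM.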
Pages: 12
Related papers
50 in total
  • [31] A predictive model for the severity of COVID-19 in elderly patients
    Zeng, Furong
    Deng, Guangtong
    Cui, Yanhui
    Zhang, Yan
    Dai, Minhui
    Chen, Lingli
    Han, Duoduo
    Li, Wen
    Guo, Kehua
    Chen, Xiang
    Shen, Minxue
    Pan, Pinhua
    AGING-US, 2020, 12 (21): 20982 - 20996
  • [32] Machine learning predictive model for severe COVID-19
    Kang, Jianhong
    Chen, Ting
    Luo, Honghe
    Luo, Yifeng
    Du, Guipeng
    Jiming-Yang, Mia
    INFECTION GENETICS AND EVOLUTION, 2021, 90
  • [33] Univariate and Multivariate Long Short Term Memory (LSTM) Model to Predict Covid-19 Cases in Malaysia Using Integrated Data
    Shen, Ng Wei
    Abu Bakar, Azuraliza
    Mohamad, Hazura
    MALAYSIAN JOURNAL OF FUNDAMENTAL AND APPLIED SCIENCES, 2023, 19 (04): 653 - 667
  • [34] Forecasting daily Covid-19 cases in the world with a hybrid ARIMA and neural network model
    Morais, Lucas Rabelo de Araujo
    Gomes, Gecynalda Soares da Silva
    APPLIED SOFT COMPUTING, 2022, 126
  • [35] LSTM-based Model for Forecasting of COVID-19 Vaccines in Pakistan
    Bashir, Saba
    Rohail, Kinza
    Qureshi, Rizwan
    PROCEEDINGS OF 2ND IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (ICAI 2022), 2022: 94 - 99
  • [36] Prediction of COVID-19 spread via LSTM and the deterministic SEIR model
    Yang, Yifan
    Yu, Wenwu
    Chen, Duxin
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020: 782 - 785
  • [38] Predicting COVID-19 cases using bidirectional LSTM on multivariate time series
    Said, Ahmed Ben
    Erradi, Abdelkarim
    Aly, Hussein Ahmed
    Mohamed, Abdelmonem
    ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH, 2021, 28 (40): 56043 - 56052
  • [40] Forecasting COVID-19 Total Daily Cases in Indonesia Using LSTM Networks
    Indriyani, Clarissa Angelita
    Wijaya, Claudia Rachel
    Qomariyah, Nunung Nurul
    5TH INTERNATIONAL CONFERENCE ON COMPUTING AND INFORMATICS (ICCI 2022), 2022: 385 - 391