PAN-LDA: A latent Dirichlet allocation based novel feature extraction model for COVID-19 data using machine learning

Cited by: 21
Authors
Gupta, Aakansha [1]
Katarya, Rahul [1]
Affiliations
[1] Delhi Technol Univ, Dept Comp Sci & Engn, Big Data Analyt & Web Intelligence Lab, New Delhi, India
Keywords
COVID-19; Latent Dirichlet allocation; Collapsed Gibbs sampling; Data mining; Feature extraction; Backpropagation
DOI
10.1016/j.compbiomed.2021.104920
Chinese Library Classification (CLC) number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
The recent outbreak of the novel coronavirus disease, COVID-19, has been declared a pandemic by the World Health Organization (WHO). Social media platforms play a vital role in providing and obtaining information about ongoing events. However, consuming a vast amount of online text to predict an event's trends can be difficult. To our knowledge, no study has analyzed online news articles together with disease data on COVID-19. We therefore propose an LDA-based topic model, PAN-LDA (Pandemic Latent Dirichlet Allocation), that incorporates COVID-19 case data and news articles into standard LDA to obtain a new set of features. The generated features are supplied as additional inputs to machine learning (ML) algorithms to improve time-series forecasting. We employ collapsed Gibbs sampling (CGS) as the underlying technique for parameter inference. Experimental results suggest that the features obtained from PAN-LDA yield more identifiable topics and empirically add value to the forecasting outcome.
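For orientation, collapsed Gibbs sampling for ordinary LDA can be sketched as below. This is a minimal illustration of the standard algorithm only, not the PAN-LDA model: the abstract does not specify how case data enter the model, so that part is omitted. The function name, the hyperparameter defaults, and the toy corpus are all assumptions made for the example. The returned per-document topic proportions are the kind of quantity that could be appended as extra features to a forecasting model.

```python
import random

def lda_collapsed_gibbs(docs, vocab_size, num_topics,
                        alpha=0.1, beta=0.01, iters=200, seed=0):
    """Standard LDA via collapsed Gibbs sampling (hypothetical helper).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (theta, n_kw): per-document topic proportions and topic-word counts.
    """
    rng = random.Random(seed)
    n_dk = [[0] * num_topics for _ in docs]               # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(num_topics)]  # word counts per topic
    n_k = [0] * num_topics                                # total tokens per topic
    z = []                                                # topic of every token

    # Random initial topic assignment for each token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Full conditional, up to a constant:
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled topic.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Posterior-mean estimate of each document's topic proportions.
    theta = [
        [(n_dk[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return theta, n_kw

# Toy corpus of word ids (made up for illustration).
toy_docs = [[0, 1, 2, 0], [3, 4, 3, 4], [0, 1, 4]]
toy_theta, toy_counts = lda_collapsed_gibbs(toy_docs, vocab_size=5, num_topics=2, iters=50)
```

Each row of `toy_theta` sums to one, so it can be concatenated with other numeric inputs without further normalization.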
Pages: 13