Assessment of SQL and NoSQL Systems to Store and Mine COVID-19 Data

Cited by: 4
Authors
Antas, Joao [1 ]
Rocha Silva, Rodrigo [2 ,3 ]
Bernardino, Jorge [1 ,2 ]
Affiliations
[1] Coimbra Inst Engn ISEC, Polytech Coimbra, P-3030199 Coimbra, Portugal
[2] Ctr Informat & Syst Univ Coimbra CISUC, P-3030290 Coimbra, Portugal
[3] Sao Paulo Technol Coll, FATEC Mogi Cruzes, BR-08773600 Mogi Das Cruzes, SP, Brazil
Keywords
big data; COVID-19; Data Mining; SQL and NoSQL databases;
DOI
10.3390/computers11020029
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
COVID-19 has had an enormous negative impact on human lives and the world economy. To help in the fight against this pandemic, this study evaluates different database systems and selects the most suitable one for storing, handling, and mining COVID-19 data. We evaluate SQL and NoSQL database systems using the following metrics: query runtime, memory used, CPU used, and storage size. The database systems assessed were Microsoft SQL Server, MongoDB, and Cassandra. We also evaluate Data Mining algorithms, including Decision Trees, Random Forest, Naive Bayes, and Logistic Regression, using data classification tests in the Orange Data Mining software. Classification tests were performed using cross-validation on a table with about 3 M records, including COVID-19 exams with patients' symptoms. The Random Forest algorithm obtained the best average accuracy, recall, precision, and F1 score in the COVID-19 predictive model built in the mining stage. In the performance evaluation, MongoDB presented the best results in almost all tests with a large data volume.
Pages: 23
Related papers
50 in total
  • [21] COVID 19 Sex and gender missing in COVID-19 data
    O'Grady, Cathleen
    SCIENCE, 2021, 373 (6551) : 145 - 145
  • [22] Revisit the First 60 Days of COVID-19: Assessment of the Global Healthcare Systems using Data Envelopment Analysis
    Ngo, Thanh
    Nguyen, Duc Khuong
    Vo, Dinh-Tri
    JOURNAL OF HEALTH MANAGEMENT, 2025,
  • [23] Smart Data Analytics on COVID-19 Data
    Leung, Carson K.
    Zhao, Chenru
    Zheng, Hao
    IEEE CONGRESS ON CYBERMATICS / 2021 IEEE INTERNATIONAL CONFERENCES ON INTERNET OF THINGS (ITHINGS) / IEEE GREEN COMPUTING AND COMMUNICATIONS (GREENCOM) / IEEE CYBER, PHYSICAL AND SOCIAL COMPUTING (CPSCOM) / IEEE SMART DATA (SMARTDATA), 2021, : 372 - 379
  • [24] Big Data Science on COVID-19 Data
    Leung, Carson K.
    Chen, Yubo
    Shang, Siyuan
    Deng, Deyu
    2020 IEEE 14TH INTERNATIONAL CONFERENCE ON BIG DATA SCIENCE AND ENGINEERING (BIGDATASE 2020), 2020, : 14 - 21
  • [25] Fiscal Policy for COVID-19: An Assessment
    Borland, Jeff
    AUSTRALIAN ECONOMIC REVIEW, 2023, 56 (01) : 61 - 69
  • [26] Frailty assessment in the COVID-19 pandemic
    Ng Cheong Chung, Kenneth Jordan
    Kunadian, Vijay
    JOURNAL OF INVESTIGATIVE MEDICINE, 2020, 68 (07) : 1300 - 1301
  • [27] COVID-19 Disaster Response Assessment
    Hickerson, William L.
    JOURNAL OF BURN CARE & RESEARCH, 2020, 41 (04) : 918 - 918
  • [28] Assessment of risk scores in Covid-19
    Garcia Clemente, Marta Maria
    Herrero Huertas, Julia
    Fernandez Fernandez, Alejandro
    De La Escosura Munoz, Covadonga
    Enriquez Rodriguez, Ana Isabel
    Perez Martinez, Liliana
    Gomez Manas, Santiago
    Iscar Urrutia, Marta
    Lopez Gonzalez, Francisco Julian
    Madrid Carbajal, Claudia Janeth
    Bedate Diaz, Pedro
    Arias Guillen, Miguel
    Bailon Cuadrado, Cristina
    Hermida Valverde, Tamara
    INTERNATIONAL JOURNAL OF CLINICAL PRACTICE, 2021, 75 (12)
  • [29] Early Detection and Assessment of Covid-19
    Hashmi, Hafiz Abdul Sattar
    Asif, Hafiz Muhammad
    FRONTIERS IN MEDICINE, 2020, 7
  • [30] Cardiovascular Risk Assessment in COVID-19
    Zdanyte, Monika
    Rath, Dominik
    HAMOSTASEOLOGIE, 2021, 41 (05) : 350 - 355