Data quality model for assessing public COVID-19 big datasets

Cited by: 1
Authors
Ngueilbaye, Alladoumbaye [1, 2]
Huang, Joshua Zhexue [1, 2]
Khan, Mehak [3]
Wang, Hongzhi [4]
Affiliations
[1] Shenzhen Univ, Big Data Inst, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[2] Shenzhen Univ, Natl Engn Lab Big Data Syst Comp Technol, Shenzhen 518060, Guangdong, Peoples R China
[3] Oslo Metropolitan Univ, Dept Comp Sci, AI Lab, Oslo, Norway
[4] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Heilongjiang, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2023 / Vol. 79 / Issue 17
Keywords
Data quality model; COVID-19 big dataset; 4A; Canonical data model; Benford's law; CEMAC region; GROWTH; IMPACT;
DOI
10.1007/s11227-023-05410-0
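Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract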
For decision-making support and evidence-based healthcare, high-quality data are crucial, particularly if the emphasized knowledge is lacking. For public health practitioners and researchers, the reporting of COVID-19 data needs to be accurate and easily available. Each nation has a system in place for reporting COVID-19 data, although these systems' efficacy has not been thoroughly evaluated, and the current COVID-19 pandemic has shown widespread flaws in data quality. We propose a data quality model (canonical data model, four adequacy levels, and Benford's law) to assess the quality issues of COVID-19 data reporting carried out by the World Health Organization (WHO) in the six countries of the Central African Economic and Monetary Community (CEMAC) region between March 6, 2020, and June 22, 2022, and suggest potential solutions. These levels of data quality sufficiency can be interpreted as dependability indicators and as measures of the sufficiency of big dataset inspection. This model effectively identified the quality of the entry data for big dataset analytics. The future development of this model requires scholars and institutions from all sectors to deepen their understanding of its core concepts, improve integration with other data processing technologies, and broaden the scope of its applications.
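The abstract names Benford's law as one component of the model but does not describe how it is applied. As a rough illustration only, and not the authors' procedure, the sketch below compares the first digits of a series of reported counts with the Benford distribution, P(d) = log10(1 + 1/d), using a Pearson chi-square statistic. The function names and the choice of chi-square as the distance measure are assumptions made for this example; the input is assumed to be a list of positive integer counts.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit proportions under Benford's law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(int(n))[0])


def benford_chi_square(counts):
    """Pearson chi-square statistic of observed first digits vs. Benford's law.

    Hypothetical helper: values below 1 are skipped because they have no
    leading digit in the range 1-9.
    """
    values = [int(c) for c in counts if int(c) >= 1]
    if not values:
        raise ValueError("need at least one count >= 1")
    observed = Counter(first_digit(v) for v in values)
    total = len(values)
    return sum(
        (observed.get(d, 0) - total * p) ** 2 / (total * p)
        for d, p in benford_expected().items()
    )
```

A larger statistic means the first digits deviate more from Benford's law. With eight degrees of freedom, the conventional 5% critical value is about 15.51, though whether a deviation indicates a reporting problem depends on the data and is a judgment this sketch does not make.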
Pages: 19574 - 19606
Number of pages: 33