Data quality model for assessing public COVID-19 big datasets

Cited by: 1
Authors
Ngueilbaye, Alladoumbaye [1, 2]
Huang, Joshua Zhexue [1, 2]
Khan, Mehak [3]
Wang, Hongzhi [4]
Affiliations
[1] Shenzhen Univ, Big Data Inst, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[2] Shenzhen Univ, Natl Engn Lab Big Data Syst Comp Technol, Shenzhen 518060, Guangdong, Peoples R China
[3] Oslo Metropolitan Univ, Dept Comp Sci, AI Lab, Oslo, Norway
[4] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Heilongjiang, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2023, Vol. 79, No. 17
Keywords
Data quality model; COVID-19 big dataset; 4A; Canonical data model; Benford's law; CEMAC region; GROWTH; IMPACT
DOI
10.1007/s11227-023-05410-0
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
High-quality data are crucial for decision-making support and evidence-based healthcare, particularly when the relevant knowledge is lacking. Public health practitioners and researchers need COVID-19 data reporting to be accurate and easily available. Each nation has a system in place for reporting COVID-19 data, although the efficacy of these systems has not been thoroughly evaluated, and the COVID-19 pandemic has exposed widespread flaws in data quality. We propose a data quality model (canonical data model, four adequacy levels, and Benford's law) to assess the quality of the COVID-19 data reported by the World Health Organization (WHO) for the six countries of the Central African Economic and Monetary Community (CEMAC) region between March 6, 2020, and June 22, 2022, and suggest potential solutions. The adequacy levels can be interpreted as indicators of dependability and of the sufficiency of big dataset inspection. The model effectively identified the quality of the input data for big dataset analytics. Further development of this model requires scholars and institutions from all sectors to deepen their understanding of its core concepts, improve its integration with other data processing technologies, and broaden the scope of its applications.
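The abstract names Benford's law as one component of the model but does not describe how it is applied. As a rough illustration only, the sketch below compares the leading-digit distribution of a series of reported counts with the Benford expectation P(d) = log10(1 + 1/d) and summarizes the gap as a mean absolute deviation. The function names (`benford_expected`, `first_digit`, `benford_mad`) and the choice of mean absolute deviation as the summary statistic are assumptions made for this example, not the authors' procedure.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading decimal digit of a non-zero integer (sign is ignored)."""
    n = abs(int(n))
    while n >= 10:
        n //= 10
    return n


def benford_mad(counts):
    """Mean absolute deviation between observed and Benford first-digit
    frequencies. Zero counts are skipped because they have no leading
    digit. Hypothetical helper for illustration; lower values mean the
    series is closer to the Benford distribution."""
    digits = [first_digit(c) for c in counts if int(c) != 0]
    if not digits:
        raise ValueError("no non-zero counts to analyse")
    observed = Counter(digits)
    total = len(digits)
    expected = benford_expected()
    return sum(
        abs(observed.get(d, 0) / total - expected[d]) for d in range(1, 10)
    ) / 9
```

For instance, `benford_mad([2 ** k for k in range(1, 101)])` yields a small value, since powers of two follow Benford's law closely, whereas a series whose leading digits are uniformly distributed yields a noticeably larger one. Any threshold for flagging a reporting series as suspicious would have to come from the paper itself or from the analyst.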
Pages: 19574-19606
Page count: 33