BIG DATA, BIG DATA QUALITY PROBLEM

Cited by: 0
Authors
Becker, David [1]
McMullen, Bill [1]
King, Trish Dunn [1]
Affiliation
[1] Mitre Corp, Dayton, OH 45431 USA
Keywords
Big Data; Data Quality; Returns to Scale
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
A USAF-sponsored MITRE research team undertook four separate, domain-specific case studies about Big Data applications. Those case studies were initial investigations into the question of whether data quality issues encountered in Big Data collections are substantially different in cause, manifestation, or detection from those encountered in more traditionally sized data collections. The study addresses several factors affecting Big Data quality at multiple levels, including collection, processing, and storage. Though not unexpected, the key findings of this study reinforce that the primary factors affecting Big Data reside in the limitations and complexities involved with handling Big Data while maintaining its integrity. These concerns are of a higher magnitude than the provenance of the data, the processing, and the tools used to prepare, manipulate, and store the data. Data quality (DQ) is extremely important for all data analytics problems. From the study's findings, the "truth about Big Data" is that there are no fundamentally new DQ issues in Big Data analytics projects. Some DQ issues, though, exhibit returns-to-scale effects and become more or less pronounced in Big Data analytics. Big Data quality varies from one type of Big Data to another and from one Big Data technology to another.
Pages: 2644-2653
Page count: 10