BIG DATA, BIG DATA QUALITY PROBLEM

Authors
Becker, David [1]
McMullen, Bill [1]
King, Trish Dunn [1]
Affiliation
[1] Mitre Corp, Dayton, OH 45431 USA
Keywords
Big Data; Data Quality; Returns to Scale
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A USAF-sponsored MITRE research team undertook four separate, domain-specific case studies about Big Data applications. Those case studies were initial investigations into whether the data quality issues encountered in Big Data collections differ substantially in cause, manifestation, or detection from those encountered in more traditionally sized data collections. The study addresses several factors affecting Big Data quality at multiple levels, including collection, processing, and storage. Though not unexpected, the key findings of this study reinforce that the primary factors affecting Big Data reside in the limitations and complexities involved with handling Big Data while maintaining its integrity. These concerns are of a higher magnitude than the provenance of the data, the processing, and the tools used to prepare, manipulate, and store the data. Data quality is extremely important for all data analytics problems. From the study's findings, the "truth about Big Data" is that there are no fundamentally new data quality (DQ) issues in Big Data analytics projects. Some DQ issues do exhibit returns-to-scale effects, however, and become more or less pronounced in Big Data analytics. Big Data quality varies from one type of Big Data to another and from one Big Data technology to another.
Pages: 2644 - 2653
Number of pages: 10