From Data Quality to Big Data Quality

Cited by: 67
Authors
Batini, Carlo [1]
Rula, Anisa [1]
Scannapieco, Monica [2]
Viscusi, Gianluigi [3]
Affiliations
[1] Univ Milano Bicocca, Dept Informat Syst & Commun DISCo, Milan, Italy
[2] Italian Natl Inst Stat Istat, Rome, Italy
[3] Ecole Polytech Fed Lausanne, CDM MTEI CSI, Lausanne, Switzerland
Keywords
Data Quality; Big Data; Linked Open Data; Maps; Semi-Structured Texts; Sensor
DOI
10.4018/JDM.2015010103
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This article investigates the evolution of data quality issues from traditional structured data managed in relational databases to Big Data. In particular, it examines the relationship between Data Quality and several research coordinates that are relevant in Big Data, such as the variety of data types, data sources, and application domains, focusing on maps, semi-structured texts, linked open data, sensors and sensor networks, and official statistics. Consequently, a set of structural characteristics is identified, and a systematization of the a posteriori correlation between them and quality dimensions is provided. Finally, Big Data quality issues are considered in a conceptual framework suited to mapping the evolution of the quality paradigm along three core coordinates that are significant in the context of the Big Data phenomenon: the data type considered, the source of data, and the application domain. Through an integrative and theoretical literature review, the framework thus makes it possible to ascertain the relevant changes in data quality emerging with the Big Data phenomenon.
Pages: 60-82
Page count: 23