Ensuring High-Quality Private Data for Responsible Data Science: Vision and Challenges

Cited by: 16
Authors
Srivastava, Divesh [1]
Scannapieco, Monica [2]
Redman, Thomas C. [3]
Affiliations
[1] AT&T Labs Res, Room 4C202B,1 AT&T Way, Bedminster, NJ 07921 USA
[2] Italian Natl Inst Stat, Via C Balbo 16, I-00184 Rome, Italy
[3] Data Qual Solut, 12 Monmouth Ave, Rumson, NJ 07760 USA
Keywords
Responsible data science; data trust; private data; quality of private data
DOI
10.1145/3287168
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
High-quality data is critical for effective data science. As the use of data science has grown, so too have concerns that individuals' rights to privacy will be violated. This has led to the development of data protection regulations around the globe and the use of sophisticated anonymization techniques to protect privacy. Such measures make it more challenging for the data scientist to understand the data, exacerbating issues of data quality. Responsible data science aims to develop useful insights from the data while fully embracing these considerations. We pose the high-level problem in this article, "How can a data scientist develop the needed trust that private data has high quality?" We then identify a series of challenges for various data-centric communities and outline research questions for data quality and privacy researchers, which would need to be addressed to effectively answer the problem posed in this article.
Pages: 1 / 9
Number of pages: 9