Approximate Data Deletion from Machine Learning Models

Cited by: 0
Authors
Izzo, Zachary [1]
Smart, Mary Anne [2]
Chaudhuri, Kamalika [2]
Zou, James [3]
Affiliations
[1] Stanford Univ, Dept Math, Stanford, CA 94305 USA
[2] Univ Calif San Diego, Dept CS&E, San Diego, CA USA
[3] Stanford Univ, Dept BDS, Stanford, CA 94305 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deleting data from a trained machine learning (ML) model is a critical task in many applications. For example, we may want to remove the influence of training points that are out of date or are outliers. Regulations such as the EU's General Data Protection Regulation also stipulate that individuals can request to have their data deleted. The naive approach to data deletion is to retrain the ML model on the remaining data, but this is too time-consuming. In this work, we propose a new approximate deletion method for linear and logistic models whose computational cost is linear in the feature dimension d and independent of the number of training points n. This is a significant gain over all existing methods, which have superlinear time dependence on the dimension. We also develop a new feature-injection test to evaluate the thoroughness of data deletion from ML models.
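To make the contrast between retraining and an incremental deletion concrete, here is a minimal sketch for ordinary least squares. It is not the method proposed in the paper: it uses the textbook Sherman-Morrison rank-one downdate, which removes one point exactly at O(d^2) cost, so it illustrates the kind of superlinear-in-d update the abstract compares against. All function names and the toy data are hypothetical and chosen only for illustration.

```python
# Sketch only: exact single-point deletion for ordinary least squares using a
# Sherman-Morrison (leave-one-out) update. This is a standard textbook
# technique, NOT the paper's approximate method. Names and data are made up.

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gram(X):
    """Return X^T X as a d-by-d list of lists."""
    d = len(X[0])
    return [[sum(x[i] * x[j] for x in X) for j in range(d)] for i in range(d)]

def xty(X, y):
    """Return X^T y as a length-d list."""
    d = len(X[0])
    return [sum(x[i] * t for x, t in zip(X, y)) for i in range(d)]

def fit(X, y):
    """Naive baseline: solve the normal equations on all the given data."""
    return solve(gram(X), xty(X, y))

def inverse(A):
    """Invert A by solving against each unit vector (fine for small d)."""
    n = len(A)
    cols = [solve(A, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def delete_point(A_inv, theta, x, y):
    """Remove (x, y) from the fit without touching the remaining data.

    theta_new = theta - A_inv x * (y - x.theta) / (1 - x^T A_inv x),
    where A_inv = (X^T X)^{-1} from the full fit. Cost is O(d^2).
    """
    d = len(x)
    u = [sum(A_inv[i][j] * x[j] for j in range(d)) for i in range(d)]
    leverage = sum(x[i] * u[i] for i in range(d))
    residual = y - sum(x[i] * theta[i] for i in range(d))
    return [theta[i] - u[i] * residual / (1.0 - leverage) for i in range(d)]

# Toy data: intercept plus one feature; the last point is an outlier.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0.1, 0.9, 2.2, 10.0]

theta_full = fit(X, y)
A_inv = inverse(gram(X))

theta_fast = delete_point(A_inv, theta_full, X[-1], y[-1])  # O(d^2) update
theta_retrain = fit(X[:-1], y[:-1])                         # full retraining

print("updated:  ", theta_fast)
print("retrained:", theta_retrain)
```

For least squares the two results agree up to floating-point error, because the downdate is exact. The approximate method described in the abstract trades that exactness for a cost that is linear in d.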
Pages: 10