Approximate Data Deletion from Machine Learning Models

Citations: 0

Authors:

Izzo, Zachary [1]

Smart, Mary Anne [2]

Chaudhuri, Kamalika [2]

Zou, James [3]

Affiliations:

[1] Stanford Univ, Dept Math, Stanford, CA 94305 USA

[2] Univ Calif San Diego, Dept CS&E, San Diego, CA USA

[3] Stanford Univ, Dept BDS, Stanford, CA 94305 USA

Source:

24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS) | 2021 / Vol. 130

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deleting data from a trained machine learning (ML) model is a critical task in many applications. For example, we may want to remove the influence of training points that might be out of date or outliers. Regulations such as the EU's General Data Protection Regulation also stipulate that individuals can request to have their data deleted. The naive approach to data deletion is to retrain the ML model on the remaining data, but this is too time-consuming. In this work, we propose a new approximate deletion method for linear and logistic models whose computational cost is linear in the feature dimension d and independent of the number of training data n. This is a significant gain over all existing methods, which all have superlinear time dependence on the dimension. We also develop a new feature-injection test to evaluate the thoroughness of data deletion from ML models.


Pages: 10
