Approximate Data Deletion from Machine Learning Models

Cited by: 0

Authors:

Izzo, Zachary [1]

Smart, Mary Anne [2]

Chaudhuri, Kamalika [2]

Zou, James [3]

Affiliations:

[1] Stanford Univ, Dept Math, Stanford, CA 94305 USA

[2] Univ Calif San Diego, Dept CS&E, San Diego, CA USA

[3] Stanford Univ, Dept BDS, Stanford, CA 94305 USA

Source:

24th International Conference on Artificial Intelligence and Statistics (AISTATS) | 2021 / Vol. 130

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deleting data from a trained machine learning (ML) model is a critical task in many applications. For example, we may want to remove the influence of training points that might be out of date or outliers. Regulations such as the EU's General Data Protection Regulation also stipulate that individuals can request to have their data deleted. The naive approach to data deletion is to retrain the ML model on the remaining data, but this is too time-consuming. In this work, we propose a new approximate deletion method for linear and logistic models whose computational cost is linear in the feature dimension d and independent of the number of training data n. This is a significant gain over all existing methods, all of which have superlinear time dependence on the dimension. We also develop a new feature-injection test to evaluate the thoroughness of data deletion from ML models.
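To make the cost comparison in the abstract concrete, the following is a minimal sketch of deleting one training point from an ordinary least-squares model, assuming NumPy and synthetic data. It is not the method proposed in the paper: it uses the classical Sherman-Morrison rank-one downdate, which is exact for least squares and costs O(d^2) per deletion, so it serves as an example of a baseline whose cost is superlinear in d. The function names `fit_least_squares` and `delete_point` are hypothetical.

```python
import numpy as np


def fit_least_squares(X, y):
    """Fit least squares; return weights and the inverse Gram matrix (X^T X)^{-1}."""
    gram_inv = np.linalg.inv(X.T @ X)
    return gram_inv @ (X.T @ y), gram_inv


def delete_point(gram_inv, xty, x, y_val):
    """Remove one training point (x, y_val) without touching the other n - 1 points.

    Uses the Sherman-Morrison identity
        (A - x x^T)^{-1} = A^{-1} + (A^{-1} x)(A^{-1} x)^T / (1 - x^T A^{-1} x),
    which costs O(d^2) and is independent of n.
    """
    u = gram_inv @ x
    denom = 1.0 - x @ u  # 1 - leverage of the deleted point; assumed > 0
    new_gram_inv = gram_inv + np.outer(u, u) / denom
    new_xty = xty - y_val * x
    return new_gram_inv @ new_xty, new_gram_inv, new_xty


# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Train once on all n points.
w_full, gram_inv = fit_least_squares(X, y)

# Delete the first training point via the rank-one downdate.
w_deleted, _, _ = delete_point(gram_inv, X.T @ y, X[0], y[0])

# Naive baseline from the abstract: retrain on the remaining n - 1 points.
w_retrained, _ = fit_least_squares(X[1:], y[1:])

print(np.allclose(w_deleted, w_retrained))
```

The downdate reproduces the retrained weights while avoiding a pass over the remaining data, but its O(d^2) cost is the kind of dimension dependence the abstract says the proposed method improves on.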


Pages: 10
