A new approach to data differential privacy based on regression models under heteroscedasticity with applications to machine learning repository data

Cited by: 6
Authors
Manchini, Carlos [1]
Ospina, Raydonal [1, 2]
Leiva, Victor [3]
Martin-Barreiro, Carlos [4, 5]
Affiliations
[1] Univ Fed Pernambuco, Dept Stat, CASTLab, Recife, Brazil
[2] Univ Fed Bahia, Dept Estat, IME, Salvador, Brazil
[3] Pontificia Univ Catolica Valparaiso, Sch Ind Engn, Valparaiso, Chile
[4] Escuela Super Politecn Litoral ESPOL, Fac Nat Sci & Math, Guayaquil, Ecuador
[5] Univ Espiritu Santo, Fac Engn, Samborondon, Ecuador
Keywords
Anonymity; Confidentiality; Data breach and fitting; Linear and logistic regressions; Monte Carlo simulation; Perturbations of data; Statistical consistency and modeling; Heteroskedasticity; Estimator; Inference
DOI
10.1016/j.ins.2022.10.076
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Generation of massive data in the digital age leads to possible violations of individual privacy. The search for personal data becomes an increasingly recurrent exposure today. The present work corresponds to the area of differential privacy, which guarantees data confidentiality and robustness against invasive identification attacks. This area stands out in the literature for its rigorous mathematical basis capable of quantifying the loss of privacy. A differentially private method based on regression models was developed to prevent inversion attacks while retaining model efficacy. In this paper, we propose a novel approach to improve the data privacy based on regression models under heteroscedasticity, a common aspect, but not studied, in practical situations of differential privacy. The influence of privacy restriction on the statistical performance of the estimators of model parameters is evaluated using Monte Carlo simulations, including a study of performance associated with test rejection rates for the proposed approach. The results of the numerical evaluation show high inferential distortion for stricter privacy restrictions. Empirical illustrations with real-world data are presented to show potential applications. (c) 2022 Elsevier Inc. All rights reserved.
Pages: 280-300
Number of pages: 21