Privacy Preserving Attribute-Focused Anonymization Scheme for Healthcare Data Publishing

Cited by: 11
Authors
Onesimu, J. Andrew [1]
Karthikeyan, J. [2]
Eunice, Jennifer [3]
Pomplun, Marc [4]
Hien Dang [4, 5]
Affiliations
[1] Manipal Acad Higher Educ, Manipal Inst Technol, Dept Comp Sci & Engn, Manipal 576104, Karnataka, India
[2] Vellore Inst Technol, Sch Informat Technol & Engn, Vellore 632014, Tamil Nadu, India
[3] Karunya Inst Technol & Sci, Dept Elect & Commun Engn, Coimbatore 641114, Tamil Nadu, India
[4] Univ Massachusetts, Dept Comp Sci, Boston, MA 02125 USA
[5] Thuyloi Univ, Fac Comp Sci & Engn, Hanoi 100000, Vietnam
Keywords
Anonymization; data privacy; data publishing; healthcare data; privacy-preserving; ANONYMITY; CLASSIFICATION; MODEL;
DOI
10.1109/ACCESS.2022.3199433
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Advancements in Industry 4.0 have brought tremendous improvements to the healthcare sector, such as better quality of treatment, enhanced communication, remote monitoring, and reduced cost. Sharing healthcare data with healthcare providers is crucial for harnessing the benefits of these improvements. In general, healthcare data holds sensitive information about individuals, so sharing such data is challenging because of various security and privacy issues. Privacy regulations and ethical requirements make it essential to preserve the privacy of patients before sharing data for medical research. State-of-the-art privacy-preserving studies either use cryptographic approaches to protect privacy or apply anonymization techniques regardless of the type of attribute, which results in poor protection and poor data utility. In this paper, we propose an attribute-focused privacy-preserving data publishing scheme. The proposed scheme is two-fold, comprising a fixed-interval approach to protect numerical attributes and an improved l-diverse slicing approach to protect the categorical and sensitive attributes. In the fixed-interval approach, the original values of the healthcare data are replaced with an equivalent computed value. The improved l-diverse slicing approach partitions the data both horizontally and vertically to avoid privacy leaks. Extensive experiments with real-world datasets are conducted to evaluate the performance of the proposed scheme. Classification models built on the anonymized dataset yield approximately 13% better accuracy than the benchmarked algorithms. Experimental analyses show that the average information loss, measured by the normalized certainty penalty (NCP), is reduced by 12% compared to similar approaches. The attribute-focused scheme not only preserves data utility but also protects the data against membership disclosure, attribute disclosure, and identity disclosure.
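The ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the "equivalent computed value" is derived or how the slicing is improved, so the sketch assumes the interval midpoint as the replacement value, uses the standard NCP definition for numerical attributes (generalized interval width divided by the attribute's domain width), and checks only distinct l-diversity, the simplest variant. All function names and the sample values are hypothetical.

```python
import math
from typing import Hashable, Sequence, Tuple


def fixed_interval(value: float, width: float, origin: float = 0.0) -> Tuple[float, float]:
    """Return the fixed-width interval [low, high) that contains value."""
    if width <= 0:
        raise ValueError("width must be positive")
    index = math.floor((value - origin) / width)
    low = origin + index * width
    return low, low + width


def replacement_value(value: float, width: float, origin: float = 0.0) -> float:
    """Assumption: replace a value with the midpoint of its interval."""
    low, high = fixed_interval(value, width, origin)
    return (low + high) / 2


def average_ncp(values: Sequence[float], width: float, origin: float = 0.0) -> float:
    """Average normalized certainty penalty for one numerical attribute.

    Each record's penalty is the width of its generalized interval,
    clipped to the observed domain, divided by the domain width.
    """
    lo, hi = min(values), max(values)
    domain = hi - lo
    if domain == 0:
        return 0.0  # a constant attribute loses no information
    total = 0.0
    for v in values:
        low, high = fixed_interval(v, width, origin)
        total += (min(high, hi) - max(low, lo)) / domain
    return total / len(values)


def is_distinct_l_diverse(sensitive_values: Sequence[Hashable], l: int) -> bool:
    """True if a bucket holds at least l distinct sensitive values."""
    return len(set(sensitive_values)) >= l


# Hypothetical sample data, not taken from the paper.
ages = [23, 27, 35, 41, 58]
anonymized_ages = [replacement_value(a, width=10) for a in ages]
loss = average_ncp(ages, width=10)
bucket_ok = is_distinct_l_diverse(["flu", "asthma", "flu", "diabetes"], l=3)
```

A wider interval hides each value among more candidates but raises the NCP, which is the privacy and utility trade-off that the reported 12% reduction in information loss refers to.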
Pages: 86979-86997
Page count: 19
Related papers
50 in total
  • [1] Privacy Preserving Data Publishing and Data Anonymization Approaches: A Review
    Goswami, Puneet
    Madan, Suman
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND AUTOMATION (ICCCA), 2017, : 139 - 142
  • [2] Anonymization Techniques for Privacy Preserving Data Publishing: A Comprehensive Survey
    Majeed, Abdul
    Lee, Sungchang
    [J]. IEEE ACCESS, 2021, 9 : 8512 - 8545
  • [3] Selective Feature Anonymization for Privacy-Preserving Image Data Publishing
    Kim, Taehoon
    Yang, Jihoon
    [J]. ELECTRONICS, 2020, 9 (05):
  • [4] Toward Scalable Anonymization for Privacy-Preserving Big Data Publishing
    Mehta, Brijesh B.
    Rao, Udai Pratap
    [J]. RECENT FINDINGS IN INTELLIGENT COMPUTING TECHNIQUES, VOL 2, 2018, 708 : 297 - 304
  • [5] Anonymization-Based Attacks in Privacy-Preserving Data Publishing
    Wong, Raymond Chi-Wing
    Fu, Ada Wai-Chee
    Wang, Ke
    Pei, Jian
    [J]. ACM TRANSACTIONS ON DATABASE SYSTEMS, 2009, 34 (02):
  • [6] Attribute-centric anonymization scheme for improving user privacy and utility of publishing e-health data
    Majeed, Abdul
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2019, 31 (04) : 426 - 435
  • [7] EDAMS: Efficient Data Anonymization Model Selector for Privacy-Preserving Data Publishing
    Qamar, Tehreem
    Bawany, Narmeen Zakaria
    Khan, Najeed Ahmed
    [J]. ENGINEERING TECHNOLOGY & APPLIED SCIENCE RESEARCH, 2020, 10 (02) : 5423 - 5427
  • [8] A new utility-aware anonymization model for privacy preserving data publishing
    Canbay, Yavuz
    Sagiroglu, Seref
    Vural, Yilmaz
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (10):
  • [9] Stipulation-Based Anonymization with Sensitivity Flags for Privacy Preserving Data Publishing
    Ashoka, K.
    Poornima, B.
    [J]. RECENT FINDINGS IN INTELLIGENT COMPUTING TECHNIQUES, VOL 1, 2019, 707 : 445 - 454
  • [10] Privacy-Preserving Trajectory Data Publishing by Dynamic Anonymization with Bounded Distortion
    Li, Songyuan
    Tian, Hui
    Shen, Hong
    Sang, Yingpeng
    [J]. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2021, 10 (02)