Balancing Privacy and Utility in Cross-Company Defect Prediction

Cited by: 100
Authors
Peters, Fayola [1]
Menzies, Tim [1]
Gong, Liang [2]
Zhang, Hongyu [2]
Affiliations
[1] W Virginia Univ, Lane Dept Comp Sci & Elect Engn, Morgantown, WV 26506 USA
[2] Tsinghua Univ, Sch Software, Beijing 100084, Peoples R China
Funding
National Science Foundation (USA)
Keywords
Privacy; classification; defect prediction; static code attributes; k-anonymity; model
DOI
10.1109/TSE.2013.6
Chinese Library Classification
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Background: Cross-company defect prediction (CCDP) is a field of study where an organization lacking enough local data can use data from other organizations for building defect predictors. To support CCDP, data must be shared. Such shared data must be privatized, but that privatization could severely damage the utility of the data. Aim: To enable effective defect prediction from shared data while preserving privacy. Method: We explore privatization algorithms that maintain class boundaries in a dataset. CLIFF is an instance pruner that deletes irrelevant examples. MORPH is a data mutator that moves the data a random distance, taking care not to cross class boundaries. CLIFF+MORPH are tested in a CCDP study among 10 defect datasets from the PROMISE data repository. Results: We find: 1) The CLIFFed+MORPHed algorithms provide more privacy than the state-of-the-art privacy algorithms; 2) in terms of utility measured by defect prediction, we find that CLIFF+MORPH performs significantly better. Conclusions: For the OO defect data studied here, data can be privatized and shared without a significant degradation in utility. To the best of our knowledge, this is the first published result where privatization does not compromise defect prediction.
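The abstract describes MORPH only as a mutator that moves data a random distance without crossing class boundaries. The following is a minimal sketch of one way such a mutator could work, not the authors' implementation. It assumes that each row is shifted by a random fraction of its offset from its nearest unlike neighbor (the closest row with a different class label), and that keeping the fraction below 0.5 is enough to keep the row nearer to its original position than to that neighbor. The function names and the bounds `low=0.15` and `high=0.35` are assumptions of this sketch.

```python
import math
import random


def nearest_unlike_neighbor(row, label, rows, labels):
    """Return the closest row (Euclidean distance) whose label differs."""
    best, best_dist = None, math.inf
    for other, other_label in zip(rows, labels):
        if other_label == label:
            continue
        dist = math.dist(row, other)
        if dist < best_dist:
            best, best_dist = other, dist
    return best


def morph_like(rows, labels, low=0.15, high=0.35, seed=0):
    """Shift each row by a random fraction of its offset from its
    nearest unlike neighbor.

    low/high are assumed bounds; with high below 0.5 a row moves less
    than half the distance to its nearest unlike neighbor.
    """
    rng = random.Random(seed)
    mutated = []
    for row, label in zip(rows, labels):
        nun = nearest_unlike_neighbor(row, label, rows, labels)
        if nun is None:  # only one class present: nothing to move against
            mutated.append(list(row))
            continue
        r = rng.uniform(low, high)        # random fraction of the offset
        sign = rng.choice((-1.0, 1.0))    # move toward or away from the neighbor
        mutated.append([x + sign * (x - z) * r for x, z in zip(row, nun)])
    return mutated


if __name__ == "__main__":
    rows = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]]
    labels = [0, 0, 1, 1]
    for original, private in zip(rows, morph_like(rows, labels)):
        print(original, "->", private)
```

The class labels are left unchanged, so a defect predictor can still be trained on the mutated rows; only the attribute values differ from the originals.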
Pages: 1054 - 1068
Page count: 15
Related papers
50 in total
  • [1] Transfer learning for cross-company software defect prediction
    Ma, Ying
    Luo, Guangchun
    Zeng, Xue
    Chen, Aiguo
    INFORMATION AND SOFTWARE TECHNOLOGY, 2012, 54 (03) : 248 - 256
  • [2] Improving Cross-Company Defect Prediction with Data Filtering
    Yu, Xiao
    Liu, Jin
    Peng, Weiqiang
    Peng, Xingyu
    INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2017, 27 (9-10) : 1427 - 1438
  • [3] On the relative value of cross-company and within-company data for defect prediction
    Burak Turhan
    Tim Menzies
    Ayşe B. Bener
    Justin Di Stefano
    Empirical Software Engineering, 2009, 14 : 540 - 578
  • [4] On the relative value of cross-company and within-company data for defect prediction
    Turhan, Burak
    Menzies, Tim
    Bener, Ayse B.
    Di Stefano, Justin
    EMPIRICAL SOFTWARE ENGINEERING, 2009, 14 (05) : 540 - 578
  • [5] A feature matching and transfer approach for cross-company defect prediction
    Yu, Qiao
    Jiang, Shujuan
    Zhang, Yanmei
    JOURNAL OF SYSTEMS AND SOFTWARE, 2017, 132 : 366 - 378
  • [6] The utility challenge of privacy-preserving data-sharing in cross-company defect prediction: An empirical study of the CLIFF&MORPH algorithm
    Fan, Yi
    Lv, Chenxi
    Zhang, Xu
    Zhou, Guoqiang
    Zhou, Yuming
    2017 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME), 2017 : 80 - 90
  • [7] Applying Cross Project Defect Prediction Approaches to Cross-Company Effort Estimation
    Amasaki, Sousuke
    Yokogawa, Tomoyuki
    Aman, Hirohisa
    15TH INTERNATIONAL CONFERENCE ON PREDICTIVE MODELS AND DATA ANALYTICS IN SOFTWARE ENGINEERING (PROMISE'19), 2019 : 76 - 79
  • [8] Research on Cross-Company Defect Prediction Method to Improve Software Security
    Shao, Yanli
    Zhao, Jingru
    Wang, Xingqi
    Wu, Weiwei
    Fang, Jinglong
    SECURITY AND COMMUNICATION NETWORKS, 2021, 2021
  • [9] Defect prediction model of static code features for cross-company and cross-project software
    Singh S.
    Singla R.
    International Journal of Information Technology, 2021, 13 (2) : 667 - 675
  • [10] Approach to cross-company spacecraft software defect prediction based on transfer learning
    Ha Q.-H.
    Liu D.-Y.
    Chen Y.
    Liu L.
    Guangxue Jingmi Gongcheng/Optics and Precision Engineering, 2019, 27 (02) : 469 - 478