Empirical Evaluation of Mimic Software Project Data Sets for Software Effort Estimation

Authors
Gan, Maohua [1]
Yucel, Zeynep [2]
Monden, Akito [3]
Sasaki, Kentaro [2]
Affiliations
[1] Okayama Univ, Div Elect & Informat Syst Engn, Grad Sch Nat Sci & Technol, Okayama, Japan
[2] Okayama Univ, Okayama, Japan
[3] Okayama Univ, Grad Sch Nat Sci & Technol, Okayama, Japan
Keywords
empirical software engineering; data confidentiality; data mining; adaptation techniques
DOI
10.1587/transinf.2019EDP7150
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To conduct empirical research on industrial software development, real project data from industry are needed. However, only a few such industry data sets are publicly available, and most of them are very old. In addition, most software companies today cannot release their data, because software development involves many stakeholders and data confidentiality must be strictly preserved. This study therefore proposes a method for artificially generating a "mimic" software project data set whose characteristics (such as average, standard deviation, and correlation coefficients) are very similar to those of a given confidential data set. Researchers are expected to use the mimic data set in place of the original (confidential) one and obtain results similar to those the original would produce. The proposed method uses the Box-Muller transform to generate normally distributed random numbers, and exponential transformation and number reordering for data mimicry. To evaluate the method, effort estimation is considered as a potential application domain for mimic data. Estimation models are built from 8 reference data sets and their corresponding mimic data sets. The experiments confirmed that models built from mimic data sets show effort estimation performance similar to that of models built from the original data sets, which indicates that the proposed method can generate representative samples.
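The three ingredients named in the abstract (Box-Muller sampling, exponential transformation, and number reordering) can be illustrated with a minimal Python sketch. This is not the authors' procedure, which the abstract does not spell out. It assumes each column is positive and roughly log-normal, and it simply copies the rank order of each original column, a simplification that preserves rank correlations between columns. The function names and the sample values are hypothetical.

```python
import math
import random


def box_muller(n, rng):
    """Return n standard-normal samples via the Box-Muller transform."""
    samples = []
    while len(samples) < n:
        u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is defined
        u2 = rng.random()
        radius = math.sqrt(-2.0 * math.log(u1))
        samples.append(radius * math.cos(2.0 * math.pi * u2))
        samples.append(radius * math.sin(2.0 * math.pi * u2))
    return samples[:n]


def mimic_column(values, rng):
    """Draw synthetic positive values (assumption: log-normal column)."""
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    # Exponential transformation: normal draws become log-normal values.
    return [math.exp(mu + sigma * z) for z in box_muller(len(values), rng)]


def reorder_like(synthetic, reference):
    """Reorder synthetic values so their ranks match the reference column."""
    order = sorted(range(len(reference)), key=lambda i: reference[i])
    result = [0.0] * len(reference)
    for value, idx in zip(sorted(synthetic), order):
        result[idx] = value
    return result


rng = random.Random(0)
size = [120.0, 45.0, 300.0, 80.0, 150.0]        # made-up project sizes
effort = [900.0, 400.0, 2500.0, 700.0, 1300.0]  # made-up effort values

mimic_size = reorder_like(mimic_column(size, rng), size)
mimic_effort = reorder_like(mimic_column(effort, rng), effort)
```

Because each synthetic column follows the rank order of its original, the rank correlation between `mimic_size` and `mimic_effort` equals that of the original pair, while the individual values are freshly sampled.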
Pages: 2094-2103
Number of pages: 10