Evaluating ensemble imputation in software effort estimation

Cited by: 3
Authors
Abnane, Ibtissam [1]
Idri, Ali [1, 2]
Chlioui, Imane [1]
Abran, Alain [3]
Affiliations
[1] Mohammed V Univ, Software Project Management Res Team, ENSIAS, Rabat, Morocco
[2] Mohammed VI Polytech Univ, MSDA, Ben Guerir, Morocco
[3] Univ Quebec, Dept Software Engn & Informat Technol, ETS, Montreal, PQ, Canada
Keywords
Missing data; Imputation; Ensemble; Software development effort estimation; MISSING DATA TECHNIQUES; INCOMPLETE DATA; COST ESTIMATION; FUZZY ANALOGY; PREDICTION; REGRESSION; ALGORITHM; VALUES;
DOI
10.1007/s10664-022-10260-0
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Choosing the appropriate missing data (MD) imputation technique for a given software development effort estimation (SDEE) technique is not a trivial task. In fact, the impact of MD imputation on the estimation output depends on the dataset and the SDEE technique used, and there is no best imputation technique in all contexts. Thus, an attractive solution is to use more than one imputation technique and combine their results to obtain a final imputation outcome. This concept is called ensemble imputation and can significantly improve the effort estimation accuracy. This study proposes and constructs 11 heterogeneous ensemble imputation techniques, whose members are two, three, or four of the following single imputation techniques: K-nearest neighbors, expectation maximization, support vector regression (SVR) and decision trees (DTs). The effects of single/ensemble imputation techniques on SDEE performance were evaluated over six SDEE datasets: COCOMO81, ISBSG, Desharnais, China, Kemerer, and Miyazaki. Five SDEE performance measures were used: standardized accuracy (SA), predictor at 25% (Pred (0.25)), mean balanced relative error (MBRE), mean inverted balanced relative error (MIBRE), and logarithmic standard deviation (LSD). Moreover, we used: (1) the Scott-Knott (SK) statistical test to cluster and compare the results, and (2) the Borda count method to rank the SDEE techniques belonging to the best SK cluster. The results showed that ensemble imputers significantly improved the performance of SDEE techniques compared to single imputation techniques. We also found that adding one or more imputers to the ensemble imputers generally led to a significant improvement in the SDEE performance. When the performance improvement is not significant, it is better to use the ensemble imputer with the minimum number of members because it is less complex. For ensemble imputers, the results suggest that no particular ensemble imputer gave the best results in all contexts.
Overall, SVR imputation was the best imputation technique used to construct ensemble imputers for the SDEE. For the SDEE techniques, the best results were obtained by the DTs and SVR variants using ensemble imputation.
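The idea of combining several single imputers into one ensemble imputer can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the combination rule, so plain averaging is assumed here, and two simple stand-in imputers (column mean and a small K-nearest-neighbors imputer) are used instead of the expectation maximization, SVR and DT imputers studied in the paper. All function names and the toy data are hypothetical.

```python
from statistics import mean

def mean_impute(rows, r, c):
    """Stand-in single imputer: mean of column c over rows where it is observed."""
    return mean(row[c] for row in rows if row[c] is not None)

def knn_impute(rows, r, c, k=2):
    """Stand-in single imputer: mean of column c over the k nearest donor rows."""
    target = rows[r]

    def dist(other):
        # Mean squared difference over features observed in both rows (column c excluded).
        shared = [(a - b) ** 2
                  for j, (a, b) in enumerate(zip(target, other))
                  if j != c and a is not None and b is not None]
        return mean(shared) if shared else float("inf")

    donors = [row for i, row in enumerate(rows) if i != r and row[c] is not None]
    donors.sort(key=dist)
    return mean(row[c] for row in donors[:k])

def ensemble_impute(rows, imputers):
    """Fill each missing cell with the average of the single imputers' proposals.

    Averaging is an assumed combination rule, chosen only for illustration.
    """
    filled = [list(row) for row in rows]
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value is None:
                filled[r][c] = mean(imp(rows, r, c) for imp in imputers)
    return filled

# Toy project data (hypothetical): [size, effort]; None marks a missing value.
projects = [
    [10.0, 100.0],
    [12.0, 120.0],
    [50.0, 500.0],
    [11.0, None],
]
completed = ensemble_impute(projects, [mean_impute, knn_impute])
print(completed[3])
```

In this toy case the column-mean imputer is pulled toward the large project while the nearest-neighbor imputer follows the two similar small projects, and the ensemble value falls between the two proposals. Adding a member only requires passing another function with the same `(rows, r, c)` signature.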
Pages: 37
Related papers
50 items in total
  • [1] Evaluating ensemble imputation in software effort estimation
    Ibtissam Abnane
    Ali Idri
    Imane Chlioui
    Alain Abran
    [J]. Empirical Software Engineering, 2023, 28
  • [2] Heterogeneous Ensemble Imputation for Software Development Effort Estimation
    Abnane, Ibtissam
    Idri, Ali
    Hosni, Mohamed
    Abran, Alain
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PREDICTIVE MODELS AND DATA ANALYTICS IN SOFTWARE ENGINEERING (PROMISE '21), 2021, : 1 - 10
  • [3] Analogy Software Effort Estimation Using Ensemble KNN Imputation
    Abnane, Ibtissam
    Hosni, Mohamed
    Idri, Ali
    Abran, Alain
    [J]. 2019 45TH EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS (SEAA 2019), 2019, : 228 - 235
  • [4] SOFTWARE EFFORT ESTIMATION USING A NEURAL NETWORK ENSEMBLE
    Pai, Dinesh R.
    McFall, Kevin S.
    Subramanian, Girish H.
    [J]. JOURNAL OF COMPUTER INFORMATION SYSTEMS, 2013, 53 (04) : 49 - 58
  • [5] A Stacking Ensemble-based Approach for Software Effort Estimation
    Shukla, Suyash
    Kumar, Sandeep
    [J]. ENASE: PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON EVALUATION OF NOVEL APPROACHES TO SOFTWARE ENGINEERING, 2021, : 205 - 212
  • [6] Heterogeneous Ensemble Model to Optimize Software Effort Estimation Accuracy
    Ali, Syed Sarmad
    Ren, Jian
    Zhang, Kui
    Wu, Ji
    Liu, Chao
    [J]. IEEE ACCESS, 2023, 11 : 27759 - 27792
  • [7] A pragmatic ensemble learning approach for effective software effort estimation
    Suresh Kumar, P.
    Behera, H. S.
    Nayak, Janmenjoy
    Naik, Bighnaraj
    [J]. INNOVATIONS IN SYSTEMS AND SOFTWARE ENGINEERING, 2022, 18 (02) : 283 - 299
  • [8] Heterogeneous Ensemble Dynamic Selection for Software Development Effort Estimation
    Cabral, Jose Thiago H. de A.
    Araujo, Ricardo de A.
    Nobrega, Jarley P.
    de Oliveira, Adriano L. I.
    [J]. 2017 IEEE 29TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2017), 2017, : 210 - 217
  • [10] An evolutionary ensemble analogy-based software effort estimation
    Shahpar, Zahra
    Bardsiri, Vahid Khatibi
    Bardsiri, Amid Khatibi
    [J]. SOFTWARE-PRACTICE & EXPERIENCE, 2022, 52 (04) : 929 - 946