Mining Software Repository for Cleaning Bugs Using Data Mining Technique

Cited by: 5
Authors
Mahmood, Nasir [1 ]
Hafeez, Yaser [1 ]
Iqbal, Khalid [2 ]
Hussain, Shariq [3 ]
Aqib, Muhammad [1 ]
Jamal, Muhammad [4 ]
Song, Oh-Young [5 ]
Affiliations
[1] Pir Mehr Ali Shah Arid Agr Univ, Univ Inst Informat Technol, Rawalpindi 46000, Pakistan
[2] COMSATS Univ Islamabad, Dept Comp Sci, Attock Campus, Attock 43600, Pakistan
[3] Fdn Univ Islamabad, Dept Software Engn, Islamabad 44000, Pakistan
[4] Pir Mehr Ali Shah Arid Agr Univ, Dept Math & Stat, Rawalpindi 46000, Pakistan
[5] Sejong Univ, Dept Software, Seoul 05006, South Korea
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2021 / Vol. 69 / Issue 01
Keywords
Fault prediction; association rule; data mining; frequent pattern mining; RULES;
DOI
10.32604/cmc.2021.016614
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Despite technological advances, software repository maintenance still depends on reusing stored data to reduce effort and complexity. However, extracting similar data during software development introduces growing ambiguity, irrelevance, and bugs, and itself adds a large volume of data to what already resides in the repositories. Thus, there is a need for a repository mining technique that predicts relevant and bug-free data. This paper proposes a fault prediction approach that uses a data-mining technique to find good predictors for high-quality software. To predict errors in the mined data, the Apriori algorithm was used to discover association rules, with confidence fixed above 40% and support at 30% or more. A pruning strategy based on evaluation measures was adopted. Rules were then extracted from three projects in different domains and combined to obtain the most popular rules according to their evaluation measure values. To evaluate the proposed approach, we conducted an experimental study comparing the proposed rules with existing ones on four different industrial projects. The evaluation showed that the results of our proposal are promising. Practitioners and developers can use these rules for defect prediction early in software development.
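The rule-mining step described in the abstract can be illustrated with a minimal Apriori sketch. This is not the authors' implementation: the transactions, the item names (`high_complexity`, `many_changes`, `low_complexity`, `defective`), and the function names are hypothetical, and the paper's pruning strategy based on evaluation measures is not reproduced. Only the two thresholds, support of at least 30% and confidence above 40%, come from the abstract.

```python
from itertools import combinations


def apriori(transactions, min_support=0.30):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def association_rules(frequent, min_confidence=0.40):
    """Return (antecedent, consequent, support, confidence) with confidence > min_confidence."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = sup / frequent[antecedent]
                if confidence > min_confidence:
                    rules.append((antecedent, itemset - antecedent, sup, confidence))
    return rules


# Hypothetical data: one transaction per software module (assumption, not from the paper).
transactions = [
    {"high_complexity", "many_changes", "defective"},
    {"high_complexity", "defective"},
    {"high_complexity", "many_changes", "defective"},
    {"many_changes"},
    {"low_complexity"},
]

frequent = apriori(transactions, min_support=0.30)
rules = association_rules(frequent, min_confidence=0.40)

for antecedent, consequent, sup, conf in rules:
    if consequent == frozenset({"defective"}):
        print(sorted(antecedent), "->", sorted(consequent),
              f"support={sup:.2f} confidence={conf:.2f}")
```

In this toy data, `low_complexity` appears in only one of five modules (support 0.20), so it is dropped before any rule is formed. Rules whose consequent is `defective` are the ones that would serve as fault predictors in the sense the abstract describes.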
Pages: 873 - 893
Page count: 21
Related papers
50 in total
  • [41] Data Mining for Healthy Tomorrow with the Implementation of Software Project Management Technique
    Rao, K. Venkata
    Balakrishna, R.
    Pai, H. Aditya
    Pareek, Piyush Kumar
    ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY COMPUTATIONS IN ENGINEERING SYSTEMS, ICAIECES 2015, 2016, 394 : 345 - 355
  • [42] Preliminary Cleaning and Transformation of Data in Data Mining Using PHP Pthreads Library
    Shichkina, Yulia
    Koblov, Alexander
    Lysov, Kirill
    Iakushkin, Oleg
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2017, PT V, 2017, 10408 : 463 - 472
  • [43] Using text mining and link analysis for software mining
    Grcar, Miha
    Grobelnik, Marko
    Mladenic, Dunja
    MINING COMPLEX DATA, 2008, 4944 : 1 - 12
  • [44] An efficient technique for Bayesian modeling of family data using the BUGS software
    Bae, Harold T.
    Perls, Thomas T.
    Sebastiani, Paola
    FRONTIERS IN GENETICS, 2014, 5
  • [45] An important issue in data mining-data cleaning
    Yang, Qi Xiao
    Yuan, Sung Sam
    Chan, Lu
    Rajasekera, Jay
    PACLIC 16: LANGUAGE, INFORMATION, AND COMPUTATION, PROCEEDINGS, 2002, : 455 - 464
  • [46] KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework
    Alcala-Fdez, J.
    Fernandez, A.
    Luengo, J.
    Derrac, J.
    Garcia, S.
    Sanchez, L.
    Herrera, F.
    JOURNAL OF MULTIPLE-VALUED LOGIC AND SOFT COMPUTING, 2011, 17 (2-3) : 255 - 287
  • [47] Heart Disease Diagnosis Using Data Mining Technique
    Babu, Sarath
    Vivek, E. M.
    Famina, K. P.
    Fida, K.
    Aswathi, P.
    Shanid, M.
    Hena, M.
    2017 INTERNATIONAL CONFERENCE OF ELECTRONICS, COMMUNICATION AND AEROSPACE TECHNOLOGY (ICECA), VOL 1, 2017, : 750 - 753
  • [48] A Development of Customer Segmentation by Using Data Mining Technique
    Jin, Seohoon
    KOREAN JOURNAL OF APPLIED STATISTICS, 2005, 18 (03) : 555 - 565
  • [49] Test Case Reduction Using Data Mining Technique
    Saifan, Ahmad A.
    Alsukhni, Emad
    Alawneh, Hanadi
    Al Sbaih, Ayat
    INTERNATIONAL JOURNAL OF SOFTWARE INNOVATION, 2016, 4 (04) : 56 - 70
  • [50] A Modification to MC Algorithm Using Data Mining Technique
    Li, Xuemei
    Zhang, Caiming
    Zhang, Caiqing
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 4, PROCEEDINGS, 2008, : 295 - +