Mining Software Repository for Cleaning Bugs Using Data Mining Technique

Cited by: 5
Authors
Mahmood, Nasir [1]
Hafeez, Yaser [1]
Iqbal, Khalid [2]
Hussain, Shariq [3]
Aqib, Muhammad [1]
Jamal, Muhammad [4]
Song, Oh-Young [5]
Affiliations
[1] Pir Mehr Ali Shah Arid Agr Univ, Univ Inst Informat Technol, Rawalpindi 46000, Pakistan
[2] COMSATS Univ Islamabad, Dept Comp Sci, Attock Campus, Attock 43600, Pakistan
[3] Fdn Univ Islamabad, Dept Software Engn, Islamabad 44000, Pakistan
[4] Pir Mehr Ali Shah Arid Agr Univ, Dept Math & Stat, Rawalpindi 46000, Pakistan
[5] Sejong Univ, Dept Software, Seoul 05006, South Korea
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2021 / Vol. 69 / Issue 01
Keywords
Fault prediction; association rule; data mining; frequent pattern mining; RULES;
DOI
10.32604/cmc.2021.016614
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Despite advances in technology, software repository maintenance still depends on reusing data to reduce effort and complexity. However, extracting similar data during software development introduces growing ambiguity, irrelevance, and bugs, and the data that reside in repositories give rise to a large amount of further data. Thus, there is a need for a repository mining technique that predicts relevant and bug-free data. This paper proposes a fault prediction approach that uses a data-mining technique to find good predictors for high-quality software. To predict errors in the mined data, the Apriori algorithm was used to discover association rules, with confidence fixed at more than 40% and support at no less than 30%. A pruning strategy based on evaluation measures was adopted. Next, rules were extracted from three projects of different domains; the extracted rules were then combined to obtain the most popular rules based on the evaluation measure values. To evaluate the proposed approach, we conducted an experimental study comparing the proposed rules with existing ones on four different industrial projects. The evaluation showed that the results of our proposal are promising. Practitioners and developers can use these rules for defect prediction during early software development.
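The rule-mining step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python Apriori written only to show how the two thresholds from the abstract (support of at least 30%, confidence above 40%) filter candidate rules. The function names `frequent_itemsets` and `mine_rules`, the attribute labels, and the five toy module records are all hypothetical and are not taken from the paper or its datasets.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Apriori: return {itemset: count} for itemsets with support >= min_support."""
    rows = [frozenset(t) for t in transactions]
    min_count = min_support * len(rows)
    # Level 1 candidates: every distinct single item.
    candidates = {frozenset([item]) for row in rows for item in row}
    frequent = {}
    k = 1
    while candidates:
        # Count how many rows contain each candidate; keep the frequent ones.
        level = {}
        for cand in candidates:
            count = sum(1 for row in rows if cand <= row)
            if count >= min_count:
                level[cand] = count
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        candidates = {
            cand for cand in joined
            if all(frozenset(s) in level for s in combinations(cand, k - 1))
        }
    return frequent


def mine_rules(frequent, n_rows, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples
    for rules whose confidence is strictly greater than min_confidence."""
    rules = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                # Every subset of a frequent itemset is itself frequent,
                # so the antecedent count is always available.
                confidence = count / frequent[antecedent]
                if confidence > min_confidence:
                    rules.append((antecedent, itemset - antecedent,
                                  count / n_rows, confidence))
    return rules


# Hypothetical toy data: one set of attribute labels per software module.
modules = [
    {"high_complexity", "many_changes", "defective"},
    {"high_complexity", "many_changes", "defective"},
    {"high_complexity", "defective"},
    {"many_changes"},
    {"high_complexity", "many_changes"},
]

# Thresholds as stated in the abstract: support >= 30%, confidence > 40%.
frequent = frequent_itemsets(modules, min_support=0.30)
rules = mine_rules(frequent, len(modules), min_confidence=0.40)

for antecedent, consequent, support, confidence in sorted(
        rules, key=lambda r: (-r[3], sorted(r[0]), sorted(r[1]))):
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={support:.2f} confidence={confidence:.2f}")
```

Counts are kept as integers and converted to ratios only when a rule is emitted, which avoids floating-point noise in the confidence values. The pruning here is the standard Apriori subset check; the abstract's additional pruning "based on evaluation measures" is not specified in enough detail to reproduce and is left out.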
Pages: 873-893
Page count: 21
Related papers
50 in total
  • [1] Using ontology repository to support data mining
    Pan, Ding
    Pan, Yan
    WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 5947 - +
  • [2] Enhancing Software Project Monitoring with Multidimensional Data Repository Mining
    Reszka, Lukasz
    Sosnowski, Janusz
    Dobrzynski, Bartosz
    ELECTRONICS, 2023, 12 (18)
  • [3] The SmartSHARK Ecosystem for Software Repository Mining
    Trautsch, Alexander
    Trautsch, Fabian
    Herbold, Steffen
    Ledel, Benjamin
    Grabowski, Jens
    2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS (ICSE-COMPANION 2020), 2020, : 25 - 28
  • [4] Evaluating the Lifespan of Code Smells using Software Repository Mining
    Peters, Ralph
    Zaidman, Andy
    2012 16TH EUROPEAN CONFERENCE ON SOFTWARE MAINTENANCE AND REENGINEERING (CSMR), 2012, : 411 - 416
  • [5] Mining Software Repositories with a Collaborative Heuristic Repository
    Babii, Hlib
    Prenner, Julian Aron
    Stricker, Laurin
    Karmakar, Anjan
    Janes, Andrea
    Robbes, Romain
    2021 ACM/IEEE 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: NEW IDEAS AND EMERGING RESULTS (ICSE-NIER 2021), 2021, : 106 - 110
  • [6] Polyglot and Distributed Software Repository Mining with Crossflow
    Barmpis, Konstantinos
    Neubauer, Patrick
    Co, Jonathan
    Kolovos, Dimitris
    Matragkas, Nicholas
    Paige, Richard F.
    2020 IEEE/ACM 17TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR, 2020, : 374 - 384
  • [7] Text Mining Studies of Software Repository Contents
    Dobrzynski, Bartosz
    Sosnowski, Janusz
    PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON EVALUATION OF NOVEL APPROACHES TO SOFTWARE ENGINEERING, ENASE 2023, 2023, : 562 - 569
  • [8] Data Mining Technique of Analysis Software for Patent Map
    Miao, Chunyu
    Chen, Lina
    PROCEEDINGS OF ANNUAL CONFERENCE OF CHINA INSTITUTE OF COMMUNICATIONS, 2010, : 215 - 218
  • [9] Is the UCI repository useful for data mining?
    Soares, Carlos
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2003, 2902 : 209 - 223
  • [10] Is the UCI repository useful for data mining?
    Soares, C
    PROGRESS IN ARTIFICIAL INTELLIGENCE-B, 2003, 2902 : 209 - 223