Automated change-prone class prediction on unlabeled dataset using unsupervised method

Cited by: 18
Authors
Yan, Meng [2]
Zhang, Xiaohong [1, 2]
Liu, Chao [2]
Xu, Ling [2]
Yang, Mengning [2]
Yang, Dan [2]
Affiliations
[1] Minist Educ, Key Lab Dependable Serv Comp Cyber Phys Soc, Chongqing 400044, Peoples R China
[2] Chongqing Univ, Sch Software Engn, Chongqing 401331, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Software maintenance; Change-prone prediction; Unlabeled dataset; Unsupervised prediction; OBJECT-ORIENTED SOFTWARE; OPEN-SOURCE PRODUCTS; MAINTAINABILITY; METRICS; CODE; PROJECT; MODELS; SUITE;
DOI
10.1016/j.infsof.2017.07.003
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: Change-prone class prediction can support decision making during software maintenance (e.g., resource allocation). Researchers have proposed many change-prone class prediction approaches, and most are effective on labeled datasets (projects with historical labeled data). These approaches usually build a supervised model by learning from historical labeled data. However, a major challenge is that this typical prediction setting cannot be used for unlabeled datasets (e.g., new projects or projects with limited historical data). Although cross-project prediction is one solution for unlabeled datasets, it requires prior labeled data from other projects, and selecting an appropriate training project is difficult. Objective: We aim to build a change-prone class prediction model on unlabeled datasets without the need for prior labeled data. Method: We propose to tackle this task by adopting a state-of-the-art unsupervised method, namely CLAMI. In addition, we propose a novel unsupervised approach, CLAMI+, by extending CLAMI. The key idea is to enable change-prone class prediction on an unlabeled dataset by learning from the dataset itself. Results: Experiments on 14 open source projects show that, on average, the unsupervised methods achieve results comparable to the typical supervised within-project and cross-project prediction baselines, and that the proposed CLAMI+ slightly improves on CLAMI on average. Conclusion: Our results show that unsupervised methods are effective for building change-prone class prediction models. Such a method is convenient for practical use in industry, since it does not need prior labeled data. (C) 2017 Elsevier B.V. All rights reserved.
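The abstract does not spell out how an unlabeled dataset can "learn from itself". As a rough illustration only, the sketch below shows one way a clustering-and-labeling step in the spirit of CLAMI might look: each class is scored by how many of its metrics exceed the per-metric median, and classes with higher scores are labeled change-prone. The function name `cla_label`, the input layout (one list of metric values per class), and the cutoff rule (median of the distinct scores) are assumptions made for this example and are not taken from the paper; CLAMI's later metric-selection and instance-selection steps and the CLAMI+ extension are not shown.

```python
from statistics import median


def cla_label(rows):
    """Return one boolean per row: True means 'labeled change-prone'.

    rows: list of equal-length lists of numeric metric values,
          one inner list per class (hypothetical input layout).
    """
    n_metrics = len(rows[0])
    # The per-metric median serves as the threshold for a "high" value.
    medians = [median(row[j] for row in rows) for j in range(n_metrics)]
    # K = number of metrics on which a class exceeds the median.
    k_scores = [
        sum(1 for j in range(n_metrics) if row[j] > medians[j])
        for row in rows
    ]
    # Classes sharing the same K form one cluster. Clusters whose K lies
    # above the median of the distinct K values are labeled change-prone
    # (assumed cutoff rule for this sketch).
    cutoff = median(set(k_scores))
    return [k > cutoff for k in k_scores]


# Toy example with three metrics per class (made-up values).
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [10, 10, 10], [20, 20, 20]]
labels = cla_label(toy)
```

In a full pipeline, labels produced this way would typically serve as pseudo-labels for training an ordinary supervised classifier on the same project, which is what removes the need for prior labeled data.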
Pages: 1 / 16
Number of pages: 16
Related papers
50 in total
  • [1] Time-series Approaches to Change-prone Class Prediction Problem
    Melo, Cristiano Sousa
    Lima da Cruz, Matheus Mayron
    Forte Martins, Antonio Diogo
    da Silva Monteiro Filho, Jose Maria
    Machado, Javam de Castro
    PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS (ICEIS), VOL 2, 2020, : 122 - 132
  • [2] Machine Learning for Change-Prone Class Prediction: A History-Based Approach
    Silva, Rogerio C.
    Farah, Paulo Roberto
    Vergilio, Silvia Regina
    36TH BRAZILIAN SYMPOSIUM ON SOFTWARE ENGINEERING, SBES 2022, 2022, : 289 - 298
  • [3] Change-Prone Java Method Prediction by Focusing on Individual Differences in Comment Density
    Burhandenny, Aji Ery
    Aman, Hirohisa
    Kawahara, Minoru
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D (05): : 1128 - 1131
  • [4] Automated identification of change-prone classes in open source software projects
    Zhu, X. (xyxyzh@gmail.com), 1600, Academy Publisher (08):
  • [5] Defect Prediction on Unlabeled Datasets by Using Unsupervised Clustering
    Yang, Jun
    Qian, Hongbing
    PROCEEDINGS OF 2016 IEEE 18TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS; IEEE 14TH INTERNATIONAL CONFERENCE ON SMART CITY; IEEE 2ND INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2016, : 465 - 472
  • [6] Deriving change-prone thresholds from software evolution using ROC curves
    Shatnawi, Raed
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (16): : 23565 - 23591
  • [7] A Doc2Vec-Based Assessment of Comments and Its Application to Change-Prone Method Analysis
    Aman, Hirohisa
    Amasaki, Sousuke
    Yokogawa, Tomoyuki
    Kawahara, Minoru
    2018 25TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2018), 2018, : 643 - 647
  • [8] Abductive Network Ensembles for Improved Prediction of Future Change-Prone Classes in Object-Oriented Software
    Al-Khiaty, Mojeeb
    Abdel-Aal, Radwan
    Elish, Mahmoud
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2017, 14 (06) : 803 - 811
  • [9] Using unlabeled data to improve the automated prediction of stellar atmospheric parameters
    Solorio, T
    Fuentes, O
    ASTRONOMICAL DATA ANALYSIS SOFTWARE AND SYSTEMS XI, 2002, 281 : 405 - 408
  • [10] Using source code metrics to predict change-prone web services: A case-study on ebay services
    Kumar, Lov
    Rath, Santanu Kumar
    Sureka, Ashish
    MaLTeSQuE 2017 - IEEE International Workshop on Machine Learning Techniques for Software Quality Evaluation, co-located with SANER 2017, 2017, : 1 - 7