Class Imbalance in Software Fault Prediction Data Set

Cited by: 6
Authors
Arun, C. [1 ]
Lakshmi, C. [1 ]
Affiliations
[1] SRM Inst Sci & Technol, Sch Comp, Kattankulathur, India
Keywords
Classification; Class imbalance; Machine learning; Majority; Minority; Sampling; Training;
DOI
10.1007/978-981-15-0199-9_64
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Classification is a prominent technique in the machine learning domain. Because of its forecasting and predictive capabilities, it is widely used in domains such as health care, networking, social networks, and software engineering, supported by enhancements to different algorithms. The performance of a classifier depends largely on the quality and amount of data in the training sample. In real-world scenarios, the majority of training samples suffer from the class imbalance problem: most of the data samples belong to one particular category, the majority class, while very few represent the minority class. In this case, classification techniques tend to be overwhelmed by the majority class and ignore the minority class. To solve the class imbalance problem, people rely on different kinds of sampling techniques, either generating synthetic data or concentrating on minority class samples, but those approaches have introduced adverse effects on learnability. In this paper, we attempt to study different techniques proposed to solve the class imbalance problem.
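As a rough illustration of the sampling idea mentioned in the abstract, the sketch below shows random oversampling, one simple way to concentrate on minority class samples. It is not the method studied in the paper; the function name `random_oversample` and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class has as many samples as the largest class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class

    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original samples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw with replacement until the class reaches the target size.
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Assumed toy data: 8 fault-free modules (label 0), 2 faulty modules (label 1).
X = [[i, i * 2] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y), "->", Counter(y_bal))
```

Because the added rows are exact copies of existing minority samples, they carry no new information, which is one plausible source of the adverse effect on learnability that the abstract mentions.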
Pages: 745-757
Page count: 13