A novel under sampling strategy for efficient software defect analysis of skewed distributed data

Cited by: 0

Authors:

K. Nitalaksheswara Rao

Ch. Satyananda Reddy

Affiliation:

[1] Andhra University, Department of Computer Science and Systems Engineering

Source:

Evolving Systems | 2020 / Vol. 11

Keywords:

Software defects analysis; Classification; Decision tree; Class imbalance learning; Under sampling

DOI:

Not available

CLC number:

Subject classification number:

Abstract:

The software quality development process is a continuous process that starts with identifying a reliable fault detection technique. How effective a fault detection technique is depends on the properties of the dataset, such as domain information, the characteristics of the input data, and its complexity. Detecting defective modules early gives developers more time to allocate resources effectively and deliver quality software on time. The class-imbalanced nature of software defect datasets means that existing techniques fail to identify all the defective modules. Misclassifying defective modules in software engineering datasets exposes software developers to unexpected losses. To classify class-imbalanced software datasets efficiently, we propose a novel under-sampling strategy. The proposed approach removes the less prominent instances from the majority subset. The experimental results confirm that the proposed approach predicts error-prone modules more accurately, using fewer and simpler rules.
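The abstract does not say how "less prominent" majority instances are identified, so the following is only a rough illustration of under sampling in general and not the authors' method. It uses plain random under sampling as a stand-in: every minority (defective) row is kept and majority (clean) rows are dropped at random until the classes are balanced. The function name `random_undersample`, the `target_ratio` parameter, and the toy data are all assumptions made for this sketch.

```python
import random
from collections import Counter


def random_undersample(rows, labels, majority_label, target_ratio=1.0, seed=0):
    """Keep every minority row and a random subset of the majority rows.

    target_ratio is the desired majority:minority count ratio after sampling.
    This is generic random under sampling, not the paper's selection rule.
    """
    rng = random.Random(seed)
    majority_idx = [i for i, y in enumerate(labels) if y == majority_label]
    minority_idx = [i for i, y in enumerate(labels) if y != majority_label]

    # Never ask for more majority rows than actually exist.
    keep_n = min(len(majority_idx), int(round(target_ratio * len(minority_idx))))
    kept_majority = rng.sample(majority_idx, keep_n)

    # Preserve the original row order in the reduced dataset.
    keep = sorted(minority_idx + kept_majority)
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy dataset: 10 defective modules among 100 (a 1:9 imbalance).
rows = [[i, i % 7] for i in range(100)]
labels = ["defective" if i % 10 == 0 else "clean" for i in range(100)]

X_bal, y_bal = random_undersample(rows, labels, majority_label="clean")
print(Counter(y_bal))  # each class now has 10 rows
```

A classifier such as a decision tree would then be trained on `X_bal` and `y_bal` instead of the original skewed data. A method like the one in the abstract would replace the random choice with some measure of how prominent each majority instance is.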

Pages: 119-131

Page count: 12
