PF-SMOTE: A novel parameter-free SMOTE for imbalanced datasets

Cited by: 26
Authors
Chen, Qiong [1 ]
Zhang, Zhong-Liang [1 ,2 ,3 ]
Huang, Wen-Po [1 ]
Wu, Jian [1 ,3 ]
Luo, Xing-Gang [1 ]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Management, Hangzhou 310018, Peoples R China
[2] Shanghai Jiao Tong Univ, Antai Coll Econ & Management, Shanghai 200030, Peoples R China
[3] Hangzhou Dianzi Univ, Res Ctr Youth Publ Opin Zhejiang, Hangzhou 310018, Peoples R China
Funding
US National Science Foundation;
Keywords
Imbalanced datasets; Data preprocessing; SMOTE; Gaussian process; Oversampling; OVERSAMPLING TECHNIQUE; SAMPLING APPROACH; DATA-SETS; CLASSIFICATION; NOISY; TREES;
DOI
10.1016/j.neucom.2022.05.017
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Class imbalance learning is one of the most important topics in the fields of machine learning and data mining, and the Synthetic Minority Oversampling Technique (SMOTE) is a common method for handling this issue. The main shortcoming of the classic SMOTE and its variants is the interpolation of potential noise and unrepresentative examples. This paper proposes a novel parameter-free SMOTE mechanism that produces sufficient representative synthetic examples while avoiding the interpolation of noisy examples. Specifically, two types of minority class examples are defined, namely boundary and safe minority examples. The synthetic example generation procedure fully reflects the characteristics of the minority class examples by filling the region dominated by the minority class and expanding the margin of the minority class. To verify the effectiveness and robustness of the proposed method, a thorough experimental study on forty datasets selected from real-world applications is carried out. The experimental results indicate that our proposed method is competitive with the classic SMOTE and its state-of-the-art variants. (c) 2022 Elsevier B.V. All rights reserved.
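For context on the interpolation behaviour the abstract criticizes, the following is a minimal sketch of classic SMOTE-style oversampling: each synthetic point is drawn on the line segment between a minority example and one of its minority-class nearest neighbours. This is not the authors' PF-SMOTE; the function name smote_oversample and its parameters (n_synthetic, k) are illustrative assumptions only.

    # Illustrative sketch of classic SMOTE-style interpolation (NOT the paper's PF-SMOTE).
    # All names here (smote_oversample, n_synthetic, k) are hypothetical, for illustration.
    import numpy as np

    def smote_oversample(X_min, n_synthetic, k=5, rng=None):
        """Generate n_synthetic points by interpolating minority examples in X_min."""
        rng = np.random.default_rng(rng)
        n = len(X_min)
        k = min(k, n - 1)  # cannot use more neighbours than remaining minority points
        # Pairwise distances among minority examples (brute force, for clarity).
        d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)           # exclude self-matches
        nn = np.argsort(d, axis=1)[:, :k]     # k nearest minority-class neighbours
        synthetic = np.empty((n_synthetic, X_min.shape[1]))
        for i in range(n_synthetic):
            base = rng.integers(n)            # pick a minority example at random
            neigh = nn[base, rng.integers(k)] # pick one of its minority neighbours
            gap = rng.random()                # interpolation coefficient in [0, 1)
            synthetic[i] = X_min[base] + gap * (X_min[neigh] - X_min[base])
        return synthetic

    # Example: generate four synthetic points from a toy 2-D minority class.
    X_min = np.array([[0.0, 0.0], [1.0, 0.2], [0.8, 1.1], [0.2, 0.9]])
    X_syn = smote_oversample(X_min, n_synthetic=4, k=2, rng=0)
    print(X_syn.shape)  # (4, 2)

Because the interpolation targets are chosen only by proximity, a noisy or unrepresentative minority example can spawn synthetic points around it; avoiding this, without extra user-set parameters, is the gap PF-SMOTE is stated to address.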
Pages: 75-88
Page count: 14