Cluster-Based Instance Selection for the Imbalanced Data Classification

Cited by: 5

Authors:

Czarnowski, Ireneusz [1]

Jedrzejowicz, Piotr [1]

Affiliations:

[1] Gdynia Maritime Univ, Dept Informat Syst, Morska 83, PL-81225 Gdynia, Poland

Source:

COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2018, PT II | 2018 / Vol. 11056

Keywords:

Instance selection; Clustering; Imbalanced data; Team of agents; INTEGRATION; REDUCTION;

DOI:

10.1007/978-3-319-98446-9_18

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Instance selection, often referred to as data reduction, aims at deciding which instances from the training set should be retained for further use during the learning process. Instance selection is an important preprocessing step for many machine learning tools, especially when huge data sets are considered. Class imbalance arises when the number of examples belonging to one class is much greater than the number of examples belonging to another. The paper proposes a cluster-based instance selection approach for imbalanced data classification. The proposed approach is based on the similarity coefficient between training data instances, calculated for each considered data class independently. Similar instances are grouped into clusters. Next, instance selection is carried out. The process of instance selection is controlled and carried out by a team of agents. The proposed approach is validated experimentally. Advantages and main features of the approach are discussed in light of the results of the computational experiment.
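
The pipeline outlined in the abstract (a per-class similarity coefficient, clustering of similar instances, then selection) can be illustrated with a minimal sketch. This is not the authors' implementation. The coefficient formula, the rounding-based grouping, the `decimals` parameter, and the function names are all assumptions made for illustration. The agent-based selection step is replaced here by a plain nearest-to-centroid rule.

```python
from collections import defaultdict


def similarity_coefficient(instance, column_sums):
    # Hypothetical coefficient (an assumption, not taken from the paper):
    # attribute values weighted by the per-class column sums.
    return sum(x * s for x, s in zip(instance, column_sums))


def select_instances(X, y, decimals=1):
    """Return the sorted indices of the instances retained from X."""
    # Split instance indices by class so each class is processed independently.
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)

    selected = []
    for indices in by_class.values():
        rows = [X[i] for i in indices]
        column_sums = [sum(col) for col in zip(*rows)]

        # Group instances whose rounded coefficients coincide into one cluster.
        clusters = defaultdict(list)
        for i in indices:
            key = round(similarity_coefficient(X[i], column_sums), decimals)
            clusters[key].append(i)

        # Stand-in for the agent-based selection: keep the single instance
        # closest (squared Euclidean distance) to each cluster centroid.
        for members in clusters.values():
            points = [X[i] for i in members]
            centroid = [sum(col) / len(members) for col in zip(*points)]
            best = min(
                members,
                key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], centroid)),
            )
            selected.append(best)
    return sorted(selected)
```

Calling `select_instances(X, y)` on attribute vectors scaled to [0, 1] returns the indices to keep. Because every class is clustered separately, a minority class always retains at least one representative, which is the property that matters for imbalanced data.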

Pages: 191-200

Page count: 10
