Observation points classifier ensemble for high-dimensional imbalanced classification

Cited by: 2
Authors
He, Yulin [1, 2]
Li, Xu [1]
Fournier-Viger, Philippe [1]
Huang, Joshua Zhexue [1, 2]
Li, Mianjie [3]
Salloum, Salman [4]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Nanhai Ave 3688, Shenzhen 518060, Peoples R China
[2] Shenzhen Univ, Guangdong Lab Artificial Intelligence & Digital E, Shenzhen, Peoples R China
[3] Macau Univ Sci & Technol, Fac Informat Technol, Taipa, Macao, Peoples R China
[4] Natl Univ Singapore, Sch Comp, Singapore, Singapore
Funding
National Natural Science Foundation of China;
Keywords
FEATURE-SELECTION; SMOTE;
DOI
10.1049/cit2.12100
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, an Observation Points Classifier Ensemble (OPCE) algorithm is proposed to deal with High-Dimensional Imbalanced Classification (HDIC) problems based on data processed using the Multi-Dimensional Scaling (MDS) feature extraction technique. First, the dimensionality of the original imbalanced data is reduced using MDS so that the distances between any two different samples are preserved as well as possible. Second, a novel OPCE algorithm is applied to classify imbalanced samples by placing optimised observation points in a low-dimensional data space. Third, optimisation of the observation point mappings is carried out to obtain a reliable assessment of the unknown samples. Exhaustive experiments have been conducted to evaluate the feasibility, rationality, and effectiveness of the proposed OPCE algorithm using seven benchmark HDIC data sets. Experimental results show that (1) the OPCE algorithm can be trained faster on low-dimensional imbalanced data than on high-dimensional data; (2) the OPCE algorithm can correctly identify samples as the number of optimised observation points is increased; and (3) statistical analysis reveals that OPCE yields better HDIC performance on the selected data sets in comparison with eight other HDIC algorithms. This demonstrates that OPCE is a viable algorithm to deal with HDIC problems.
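The abstract's first step asks that pairwise distances be "preserved as well as possible" after dimensionality reduction. A minimal sketch of how that preservation could be measured is given below. It is not the authors' implementation: the function names, the toy data, and the choice of a normalised stress (squared distance errors divided by the squared original distances) are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Return the Euclidean distance for every unordered pair of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def normalised_stress(high_dim, low_dim):
    """Normalised stress between original and embedded distances.

    0.0 means every pairwise distance is preserved exactly; larger
    values mean the embedding distorts the distances more.
    """
    d_high = pairwise_distances(high_dim)
    d_low = pairwise_distances(low_dim)
    numerator = sum((a - b) ** 2 for a, b in zip(d_high, d_low))
    denominator = sum(a ** 2 for a in d_high)
    return math.sqrt(numerator / denominator)


# Hypothetical toy data: four 3-D samples that all lie in the plane z = 0.
high = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]

# Two candidate embeddings (assumed, for comparison only).
good_embedding = [(x, y) for x, y, _ in high]  # keeps both informative axes
poor_embedding = [(x,) for x, _, _ in high]    # collapses the y axis

stress_good = normalised_stress(high, good_embedding)
stress_poor = normalised_stress(high, poor_embedding)
print(stress_good, stress_poor)
```

An MDS procedure searches for the low-dimensional configuration that makes a stress of this kind as small as possible; here the two-dimensional embedding loses nothing, while the one-dimensional one distorts several distances.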
Pages: 500 - 517
Number of pages: 18
Related papers
50 in total
  • [31] Classification in High-Dimensional Feature Spaces: Random Subsample Ensemble
    Serpen, Gursel
    Pathical, Santhosh
    EIGHTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2009, : 740 - 745
  • [32] The classification method based on evolutionary algorithm for high-dimensional imbalanced missing data
    Liu, Yi
    Li, Gengsong
    Li, Xiang
    Qin, Wei
    Zheng, Qibin
    Ren, Xiaoguang
    ELECTRONICS LETTERS, 2023, 59 (12)
  • [33] Study on source of classification in imbalanced datasets based on new ensemble classifier
    Zhai Y.
    Yang B.-R.
    Qu W.
    Sui H.-F.
    Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics, 2011, 33 (01) : 196 - 201
  • [34] An Empirical Study on Preprocessing High-dimensional Class-imbalanced Data for Classification
    Yin, Hua
    Gai, Keke
    2015 IEEE 17TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2015 IEEE 7TH INTERNATIONAL SYMPOSIUM ON CYBERSPACE SAFETY AND SECURITY, AND 2015 IEEE 12TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (ICESS), 2015, : 1314 - 1319
  • [35] Dynamic classifier ensemble model for customer classification with imbalanced class distribution
    Xiao, Jin
    Xie, Ling
    He, Changzheng
    Jiang, Xiaoyi
    EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (03) : 3668 - 3675
  • [36] Distributed Ensemble Feature Selection Framework for High-Dimensional and High-Skewed Imbalanced Big Dataset
    Soheili, Majid
    Haeri, Maryam Amir Amir
    2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021), 2021,
  • [37] A Concise TSK Fuzzy Ensemble Classifier Integrating Dropout and Bagging for High-Dimensional Problems
    Guo, Fei
    Liu, Jiahuan
    Li, Maoyuan
    Huang, Tianlun
    Zhang, Yun
    Li, Dequn
    Zhou, Huamin
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2022, 30 (08) : 3176 - 3190
  • [38] Ensemble methods for classification of patients for personalized medicine with high-dimensional data
    Moon, Hojin
    Ahn, Hongshik
    Kodell, Ralph L.
    Baek, Songjoon
    Lin, Chien-Ju
    Chen, James J.
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2007, 41 (03) : 197 - 207
  • [39] Hybrid One-Class Ensemble for High-Dimensional Data Classification
    Krawczyk, Bartosz
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2016, PT II, 2016, 9622 : 136 - 144
  • [40] FCM Classifier for High-dimensional Data
    Ichihashi, Hidetomo
    Honda, Katsuhiro
    Notsu, Akira
    Miyamoto, Eri
    2008 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-5, 2008, : 200 - 206