Observation points classifier ensemble for high-dimensional imbalanced classification

Cited by: 2
Authors
He, Yulin [1, 2]
Li, Xu [1]
Fournier-Viger, Philippe [1]
Huang, Joshua Zhexue [1, 2]
Li, Mianjie [3]
Salloum, Salman [4]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Nanhai Ave 3688, Shenzhen 518060, Peoples R China
[2] Shenzhen Univ, Guangdong Lab Artificial Intelligence & Digital E, Shenzhen, Peoples R China
[3] Macau Univ Sci & Technol, Fac Informat Technol, Taipa, Macao, Peoples R China
[4] Natl Univ Singapore, Sch Comp, Singapore, Singapore
Funding
National Natural Science Foundation of China
Keywords
FEATURE-SELECTION; SMOTE;
DOI
10.1049/cit2.12100
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, an Observation Points Classifier Ensemble (OPCE) algorithm is proposed to deal with High-Dimensional Imbalanced Classification (HDIC) problems based on data processed using the Multi-Dimensional Scaling (MDS) feature extraction technique. First, the dimensionality of the original imbalanced data is reduced using MDS so that the distances between any two different samples are preserved as well as possible. Second, a novel OPCE algorithm is applied to classify imbalanced samples by placing optimised observation points in a low-dimensional data space. Third, optimisation of the observation point mappings is carried out to obtain a reliable assessment of the unknown samples. Extensive experiments have been conducted to evaluate the feasibility, rationality, and effectiveness of the proposed OPCE algorithm using seven benchmark HDIC data sets. Experimental results show that (1) the OPCE algorithm can be trained faster on low-dimensional imbalanced data than on high-dimensional data; (2) the OPCE algorithm can correctly identify samples as the number of optimised observation points increases; and (3) statistical analysis reveals that OPCE yields better HDIC performance on the selected data sets than eight other HDIC algorithms. This demonstrates that OPCE is a viable algorithm for HDIC problems.
Pages: 500-517
Number of pages: 18
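The first step described in the abstract, reducing dimensionality with MDS so that pairwise distances are preserved as well as possible, can be illustrated with a minimal sketch. This is classical (Torgerson) MDS written with NumPy; the abstract does not say which MDS variant the authors use, so this is an assumption, and the function names and toy data below are hypothetical. The OPCE classifier itself is not sketched here because the abstract does not specify it in enough detail.

```python
import numpy as np


def classical_mds(X, n_components=2):
    """Embed the rows of X into n_components dimensions with classical MDS.

    Assumption: Euclidean distances and the classical (Torgerson) variant;
    the paper may use a different MDS formulation.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Squared Euclidean distances between all pairs of samples.
    sq_norms = np.sum(X ** 2, axis=1)
    D2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(D2, 0.0, out=D2)  # clip tiny negatives caused by round-off
    # Double centering turns squared distances into a Gram matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J
    # Keep the eigenvectors belonging to the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


def pairwise_distances(Z):
    """Dense matrix of Euclidean distances between the rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))


# Hypothetical toy data: 30 samples in 50 dimensions that lie in a 3-D subspace.
rng = np.random.default_rng(0)
latent = rng.normal(size=(30, 3))
X = latent @ rng.normal(size=(3, 50))

Z = classical_mds(X, n_components=3)
max_err = np.max(np.abs(pairwise_distances(X) - pairwise_distances(Z)))
```

Because the toy data has intrinsic dimension 3, a 3-component embedding should reproduce the original pairwise distances up to floating-point error. On real high-dimensional data the distances are only approximated, which matches the "as well as possible" wording of the abstract.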