New Fitness Functions in Genetic Programming for Classification with High-dimensional Unbalanced Data

Cited by: 0
Authors
Pei, Wenbin [1 ]
Xue, Bing [1 ]
Shang, Lin [2 ]
Zhang, Mengjie [1 ]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
[2] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210093, Jiangsu, Peoples R China
Keywords
Classification; Genetic Programming; Fitness Functions; High-dimensionality; Class Imbalance; Feature Selection
DOI
10.1109/cec.2019.8789974
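Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract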
High dimensionality and class imbalance are two main challenges in classification. A growing number of datasets now exhibit both characteristics at once. Genetic programming (GP) has been successfully applied to high-dimensional classification tasks. However, most existing GP methods may suffer from a performance bias when the class distribution is unbalanced. Using fitness functions for cost adjustment is one of the most important ways to address class imbalance in GP. This paper develops new fitness functions in GP for classification with high-dimensional unbalanced data. Two fitness functions are proposed to improve on the performance of traditional accuracy measures, and one fitness function is proposed to approximate the area under the curve (AUC) with the goal of reducing training time. Experiments on six high-dimensional unbalanced datasets show that the proposed fitness functions perform better than existing fitness functions.
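The abstract does not give the formulas of the proposed fitness functions, so the sketch below does not reproduce them. It only illustrates, with well-known textbook measures, why plain accuracy is biased on unbalanced data and why an exact AUC can be costly to use as a fitness measure. The function names, the toy labels, and the toy scores are all assumptions made for illustration.

```python
# Illustrative only: standard measures, NOT the fitness functions proposed in the paper.

def accuracy(y_true, y_pred):
    """Overall accuracy: fraction of instances classified correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred, positive=1):
    """Average of the per-class accuracies (true positive and true negative rates)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    n_pos = sum(1 for t in y_true if t == positive)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def auc_pairwise(y_true, scores, positive=1):
    """Exact AUC as the Wilcoxon-Mann-Whitney statistic.

    Compares every (positive, negative) pair of scores, so the cost is
    O(n_pos * n_neg) for each evaluated program.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 minority-class and 8 majority-class instances.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A degenerate classifier that always predicts the majority class.
y_pred_majority = [0] * 10

# Hypothetical program outputs used as ranking scores.
scores = [0.9, 0.2, 0.1, 0.3, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]

acc = accuracy(y_true, y_pred_majority)
bal = balanced_accuracy(y_true, y_pred_majority)
auc = auc_pairwise(y_true, scores)
print(acc, bal, auc)
```

On this toy data the always-majority classifier reaches a high overall accuracy while its balanced accuracy stays at chance level, which is the kind of performance bias the abstract refers to. The pairwise AUC loop grows with the product of the two class sizes, which may be one reason an approximation is attractive when fitness is evaluated for every individual in every generation.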
Pages: 2779-2786
Page count: 8
Related papers
50 in total
  • [31] Multi-Objective Genetic Programming for Classification with Unbalanced Data
    Bhowan, Urvesh
    Zhang, Mengjie
    Johnston, Mark
    AI 2009: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2009, 5866 : 370 - 380
  • [32] Reusing Genetic Programming for Ensemble Selection in Classification of Unbalanced Data
    Bhowan, Urvesh
    Johnston, Mark
    Zhang, Mengjie
    Yao, Xin
    IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2014, 18 (06) : 893 - 908
  • [33] A classification algorithm for high-dimensional data
    Roy, Asim
    INNS CONFERENCE ON BIG DATA 2015 PROGRAM, 2015, 53 : 345 - 355
  • [34] Classification with High-Dimensional Genetic Data: Assigning Patients and Genetic Features to Known Classes
    Schwender, Holger
    Ickstadt, Katja
    Rahnenfuehrer, Joerg
    BIOMETRICAL JOURNAL, 2008, 50 (06) : 911 - 926
  • [35] A novel fitness function in genetic programming to handle unbalanced emotion recognition data
    Acharya, Divya
    Goel, Shivani
    Asthana, Rishi
    Bhardwaj, Arpit
    PATTERN RECOGNITION LETTERS, 2020, 133 : 272 - 279
  • [36] A novel fitness function in genetic programming for medical data classification
    Kumar, Arvind
    Sinha, Nishant
    Bhardwaj, Arpit
    JOURNAL OF BIOMEDICAL INFORMATICS, 2020, 112
  • [37] Evolving Diverse Ensembles Using Genetic Programming for Classification With Unbalanced Data
    Bhowan, Urvesh
    Johnston, Mark
    Zhang, Mengjie
    Yao, Xin
    IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2013, 17 (03) : 368 - 386
  • [38] Online Nonlinear Classification for High-Dimensional Data
    Vanli, N. Denizcan
    Ozkan, Huseyin
    Delibalta, Ibrahim
    Kozat, Suleyman S.
    2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015, 2015, : 685 - 688
  • [39] Enhanced algorithm for high-dimensional data classification
    Wang, Xiaoming
    Wang, Shitong
    APPLIED SOFT COMPUTING, 2016, 40 : 1 - 9
  • [40] A Compressive Classification Framework for High-Dimensional Data
    Tabassum, Muhammad Naveed
    Ollila, Esa
    IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2020, 1 : 177 - 186