Confusion-Matrix-Based Kernel Logistic Regression for Imbalanced Data Classification

Cited by: 112
Authors
Ohsaki, Miho [1]
Wang, Peng [1]
Matsuda, Kenji [1]
Katagiri, Shigeru [1]
Watanabe, Hideyuki [2]
Ralescu, Anca [3]
Affiliations
[1] Doshisha Univ, Grad Sch Sci & Engn, 1-3 Tataramiyakodani, Kyotanabe, Kyoto 6100321, Japan
[2] Natl Inst Informat & Commun Technol, 3-5 Hikaridai, Seika, Kyoto 6190289, Japan
[3] Univ Cincinnati, Coll Engn & Appl Sci, Dept Elect Engn & Comp Syst, 812 Rhodes Hall, Cincinnati, OH 45221 USA
Keywords
Imbalanced data; confusion matrix; kernel logistic regression; minimum classification error and generalized probabilistic descent; optimization; selection; model
DOI
10.1109/TKDE.2017.2682249
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
There have been many attempts to classify imbalanced data, since this classification is critical in a wide variety of applications related to the detection of anomalies, failures, and risks. Many conventional methods, which can be categorized as sampling, cost-sensitive, or ensemble approaches, include heuristic and task-dependent processes. To achieve better classification performance through a formulation free of heuristics and task dependence, we propose confusion-matrix-based kernel logistic regression (CM-KLOGR). Its objective function is the harmonic mean of various evaluation criteria derived from a confusion matrix, such as sensitivity, positive predictive value, and their counterparts for negatives. This objective function and its optimization are consistently formulated within the framework of KLOGR, based on minimum classification error and generalized probabilistic descent (MCE/GPD) learning. Owing to the merits of the harmonic mean, KLOGR, and MCE/GPD, CM-KLOGR improves the multifaceted performances in a well-balanced way. This paper presents the formulation of CM-KLOGR and demonstrates its effectiveness through comparative experiments on benchmark imbalanced datasets.
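The harmonic-mean criterion described in the abstract can be illustrated with a small sketch. It assumes the criteria are sensitivity, positive predictive value, specificity, and negative predictive value (the last two being one reading of the "for negatives" criteria), and it works on hard counts. The function names are hypothetical, and the smoothed, differentiable MCE/GPD formulation used to train CM-KLOGR is not reproduced here.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def confusion_criteria(tp, fn, fp, tn):
    """Four evaluation criteria derived from a binary confusion matrix."""
    return {
        "sensitivity": safe_ratio(tp, tp + fn),  # recall on positives
        "ppv": safe_ratio(tp, tp + fp),          # precision on positives
        "specificity": safe_ratio(tn, tn + fp),  # recall on negatives
        "npv": safe_ratio(tn, tn + fn),          # precision on negatives
    }


def harmonic_mean(values):
    """Harmonic mean of non-negative values; 0.0 if any value is zero."""
    values = list(values)
    if any(v == 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)


def cm_score(tp, fn, fp, tn):
    """Harmonic mean of the four criteria (illustrative, not the paper's loss)."""
    return harmonic_mean(confusion_criteria(tp, fn, fp, tn).values())


# A classifier that labels everything negative on a 1%-positive dataset
# reaches 99% accuracy, yet its harmonic-mean score collapses to zero.
all_negative = cm_score(tp=0, fn=10, fp=0, tn=990)
balanced = cm_score(tp=50, fn=50, fp=50, tn=50)
```

Because the harmonic mean is dominated by its smallest term, a score of this kind stays low unless every criterion is reasonably high, which is consistent with the well-balanced improvement the abstract claims.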
Pages: 1806-1819
Number of pages: 14