Confusion-Matrix-Based Kernel Logistic Regression for Imbalanced Data Classification

Times cited: 112
Authors
Ohsaki, Miho [1]
Wang, Peng [1]
Matsuda, Kenji [1]
Katagiri, Shigeru [1]
Watanabe, Hideyuki [2]
Ralescu, Anca [3]
Affiliations
[1] Doshisha Univ, Grad Sch Sci & Engn, 1-3 Tataramiyakodani, Kyotanabe, Kyoto 6100321, Japan
[2] Natl Inst Informat & Commun Technol, 3-5 Hikaridai, Seika, Kyoto 6190289, Japan
[3] Univ Cincinnati, Coll Engn & Appl Sci, Dept Elect Engn & Comp Syst, 812 Rhodes Hall, Cincinnati, OH 45221 USA
Keywords
Imbalanced data; confusion matrix; kernel logistic regression; minimum classification error and generalized probabilistic descent; OPTIMIZATION; SELECTION; MODEL
DOI
10.1109/TKDE.2017.2682249
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
There have been many attempts to classify imbalanced data, since this classification is critical in a wide variety of applications related to the detection of anomalies, failures, and risks. Many conventional methods, which can be categorized into sampling, cost-sensitive, or ensemble, include heuristic and task-dependent processes. In order to achieve a better classification performance by formulation without heuristics and task dependence, we propose confusion-matrix-based kernel logistic regression (CM-KLOGR). Its objective function is the harmonic mean of various evaluation criteria derived from a confusion matrix, such as sensitivity, positive predictive value, and others for negatives. This objective function and its optimization are consistently formulated on the framework of KLOGR, based on minimum classification error and generalized probabilistic descent (MCE/GPD) learning. Due to the merits of the harmonic mean, KLOGR, and MCE/GPD, CM-KLOGR improves the multifaceted performances in a well-balanced way. This paper presents the formulation of CM-KLOGR and its effectiveness through experiments that comparatively evaluated CM-KLOGR using benchmark imbalanced datasets.
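The harmonic-mean idea in the abstract can be illustrated with a small sketch. This is not the authors' CM-KLOGR implementation: it uses hard confusion-matrix counts rather than the smoothed, differentiable formulation that MCE/GPD learning would require, and it assumes that the "others for negatives" are specificity and negative predictive value. The function names, the 0/1 label encoding, and the convention that an undefined ratio counts as 0.0 are all assumptions made for illustration.

```python
def safe_ratio(num, den):
    """Return num / den, or 0.0 when den is zero (a convention assumed here)."""
    return num / den if den else 0.0


def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels encoded as 1 (positive) and 0 (negative)."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def criteria(tp, fp, fn, tn):
    """Four criteria derived from the confusion matrix (the choice of four is an assumption)."""
    return {
        "sensitivity": safe_ratio(tp, tp + fn),
        "ppv": safe_ratio(tp, tp + fp),
        "specificity": safe_ratio(tn, tn + fp),
        "npv": safe_ratio(tn, tn + fn),
    }


def harmonic_mean(values):
    """Harmonic mean; defined as 0.0 if any value is zero."""
    values = list(values)
    if any(v == 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)


def harmonic_score(y_true, y_pred):
    """Harmonic mean of the four criteria for one set of predictions."""
    return harmonic_mean(criteria(*confusion_counts(y_true, y_pred)).values())


# Imbalanced toy data: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
all_negative = [0] * 10                       # ignores the minority class
mixed = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]        # one TP, one FN, one FP, seven TN

accuracy_all_negative = sum(t == p for t, p in zip(y_true, all_negative)) / len(y_true)
score_all_negative = harmonic_score(y_true, all_negative)
score_mixed = harmonic_score(y_true, mixed)
```

On this toy data the all-negative classifier reaches an accuracy of 0.8 but a harmonic score of 0.0, because its sensitivity is zero, while the mixed classifier scores about 0.636. A single poor criterion pulls the harmonic mean down sharply, which is consistent with the abstract's claim that the objective favors well-balanced performance.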
Pages: 1806-1819
Page count: 14
Related papers
50 in total
  • [1] Formulation of the Kernel Logistic Regression based on the Confusion Matrix
    Ohsaki, Miho
    Matsuda, Kenji
    Wang, Peng
    Katagiri, Shigeru
    Watanabe, Hideyuki
    2015 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2015, : 2327 - 2334
  • [2] A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant
    Shi, Baofeng
    Wang, Jing
    Qi, Junyan
    Cheng, Yanqiu
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2015, 2015
  • [3] Robust weighted kernel logistic regression in imbalanced and rare events data
    Maalouf, Maher
    Trafalis, Theodore B.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2011, 55 (01) : 168 - 183
  • [4] Kernel Logistic Regression Algorithm for Large-Scale Data Classification
    Elbashir, Murtada
    Wang, Jianxin
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2015, 12 (05) : 465 - 472
  • [5] Logistic Regression and Random Forest for Effective Imbalanced Classification
    Luo, Hanwu
    Pan, Xiubao
    Wang, Qingshun
    Ye, Shasha
    Qian, Ying
    2019 IEEE 43RD ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), VOL 1, 2019, : 916 - 917
  • [6] Texture classification using kernel logistic regression
    Tambo, Asongu L.
    Mistry, Rajan B.
    Campbell, Jonathan M.
    Chan, Sherwin R.
    Hang, Xiyi
    INT CONF ON CYBERNETICS AND INFORMATION TECHNOLOGIES, SYSTEMS AND APPLICATIONS/INT CONF ON COMPUTING, COMMUNICATIONS AND CONTROL TECHNOLOGIES, VOL 1, 2007, : 259 - 262
  • [7] LOGISTIC REGRESSION BASED ON STATISTICAL LEARNING MODEL WITH LINEARIZED KERNEL FOR CLASSIFICATION
    Guan, Xiaochun
    Zhang, Jianhua
    Chen, Shengyong
    COMPUTING AND INFORMATICS, 2021, 40 (02) : 298 - 317
  • [8] Kernel Logistic Regression: A Robust Weighting for Imbalanced Classes with Noisy Labels
    Byrnes, Paul G.
    DiazDelaO, Francisco A.
    2018 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND DATA ENGINEERING (ICMLDE 2018), 2018, : 30 - 34
  • [9] Logistic regression for imbalanced learning based on clustering
    Guo, Huaping
    Wei, Tao
    INTERNATIONAL JOURNAL OF COMPUTATIONAL SCIENCE AND ENGINEERING, 2019, 18 (01) : 54 - 64
  • [10] Entropy-Based Fuzzy Weighted Logistic Regression for Classifying Imbalanced Data
    Harumeka, Ajiwasesa
    Purnami, Santi Wulan
    Rahayu, Santi Puteri
    SOFT COMPUTING IN DATA SCIENCE, SCDS 2021, 2021, 1489 : 312 - 327