Kernel-based linear classification on categorical data

Cited by: 2
Authors
Chen, Lifei [1, 2]
Ye, Yanfang [3]
Guo, Gongde [1, 2]
Zhu, Jianping [4]
Affiliations
[1] Fujian Normal Univ, Sch Math & Comp Sci, Fuzhou 350117, Fujian, Peoples R China
[2] Fujian Normal Univ, Fujian Prov Key Lab Network Secur & Cryptol, Fuzhou 350117, Fujian, Peoples R China
[3] West Virginia Univ, Lane Dept Comp Sci & Elect Engn, Morgantown, WV 26506 USA
[4] Xiamen Univ, Sch Management, Data Min Res Ctr, Xiamen 361005, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data classification; Categorical attributes; Kernel density estimation; Naive Bayes; Nearest neighbor; Prototype-based classification; Nearest-neighbor classification
DOI
10.1007/s00500-015-1926-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Kernel-based methods have been widely investigated in the soft-computing community. However, they focus mainly on numeric data. In this paper, we propose a novel method for kernel learning on categorical data and show how it can be used to derive effective classifiers for linear classification. Based on kernel density estimation for categorical attributes, three popular classification methods, namely Naive Bayes, nearest neighbor, and prototype-based classification, are extended to classify categorical data. We also propose two data-driven approaches to the bandwidth selection problem: one aims to minimize the mean squared error of the kernel estimate, and the other aims to optimize attribute weights. Theoretical analysis indicates that, as in the numeric case, kernel learning on categorical attributes can make the classes more separable, resulting in outstanding performance of the new classifiers on various real-world data sets.
Pages: 2981-2993
Number of pages: 13
Related papers
  • [1] Kernel-based linear classification on categorical data
    Lifei Chen
    Yanfang Ye
    Gongde Guo
    Jianping Zhu
    [J]. Soft Computing, 2016, 20 : 2981 - 2993
  • [2] Kernel-Based k-Representatives Algorithm for Fuzzy Clustering of Categorical Data
    Mau, Toan Nguyen
    Huynh, Van-Nam
    [J]. IEEE CIS INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS 2021 (FUZZ-IEEE), 2021,
  • [3] Kernel-based audio classification
    Li, Xiao-Li
    Du, Zhen-Long
    Zhang, Ya-Fen
    [J]. PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2006, : 3313 - +
  • [4] A COMPARATIVE-STUDY OF KERNEL-BASED DENSITY ESTIMATES FOR CATEGORICAL-DATA
    TITTERINGTON, DM
    [J]. TECHNOMETRICS, 1980, 22 (02) : 259 - 268
  • [5] Kernel-based data transformation model for nonlinear classification of symbolic data
    Xuanhui Yan
    Lifei Chen
    Gongde Guo
    [J]. Soft Computing, 2022, 26 : 1249 - 1259
  • [6] Kernel-based data transformation model for nonlinear classification of symbolic data
    Yan, Xuanhui
    Chen, Lifei
    Guo, Gongde
    [J]. SOFT COMPUTING, 2022, 26 (03) : 1249 - 1259
  • [7] Kernel-based distance metric learning for microarray data classification
    Xiong, Huilin
    Chen, Xue-wen
    [J]. BMC BIOINFORMATICS, 2006, 7 (1)
  • [8] Kernel-based distance metric learning for microarray data classification
    Huilin Xiong
    Xue-wen Chen
    [J]. BMC Bioinformatics, 7
  • [9] Fetal Risk Classification Based on Cardiotocography Data: A Kernel-Based Approach
    Keddachi, Khaoula
    Theljani, Foued
    [J]. PROCEEDINGS OF THE SECOND INTERNATIONAL AFRO-EUROPEAN CONFERENCE FOR INDUSTRIAL ADVANCEMENT (AECIA 2015), 2016, 427 : 327 - 337
  • [10] Improving kernel-based nonparametric regression for circular–linear data
    Yasuhito Tsuruta
    Masahiko Sagae
    [J]. Japanese Journal of Statistics and Data Science, 2022, 5 : 111 - 131