A clustering-based discretization for supervised learning

Cited by: 39
Authors
Gupta, Ankit [2]
Mehrotra, Kishan G. [1]
Mohan, Chilukuri [1]
Affiliations
[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Ctr Sci & Technol 4 106, Syracuse, NY 13244 USA
[2] Indian Inst Technol, Dept Elect Engn, Kanpur 208016, Uttar Pradesh, India
Keywords
Discretization; Clustering; Binning; Supervised learning
DOI
10.1016/j.spl.2010.01.015
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
We address the problem of discretization of continuous variables for machine learning classification algorithms. Existing procedures do not use interdependence between the variables towards this goal. Our proposed method uses clustering to exploit such interdependence. Numerical results show that this improves the classification performance in almost all cases. Even if an existing algorithm can successfully operate with continuous variables, better performance is obtained if the variables are first discretized. An additional advantage of discretization is that it reduces the overall computation time. (C) 2010 Elsevier B.V. All rights reserved.
Pages: 816-824
Number of pages: 9
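
The abstract says the method uses clustering to exploit interdependence between variables, but this record does not describe the procedure itself. The sketch below is therefore only an illustration of the general idea and is not the authors' algorithm. It assumes one possible scheme: cluster all variables jointly with plain k-means, then place each variable's cut points midway between adjacent cluster-center coordinates. The function names (`kmeans`, `cut_points`, `discretize`) and the toy data are hypothetical.

```python
from bisect import bisect_right

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; deterministic init from evenly spaced rows.
    Assumes k <= len(points)."""
    step = max(1, len(points) // k)
    centers = [list(points[i * step]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared Euclidean distance).
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            groups[j].append(p)
        # Recompute each center as the mean of its group; keep it if the group is empty.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def cut_points(centers, dim):
    """Midpoints between adjacent distinct center coordinates along one variable
    (an assumed rule, chosen for illustration)."""
    vals = sorted({c[dim] for c in centers})
    return [(a + b) / 2 for a, b in zip(vals, vals[1:])]

def discretize(value, cuts):
    """Return the index of the bin that `value` falls into."""
    return bisect_right(cuts, value)

# Toy data: two continuous variables, three joint clusters.
data = [(1.0, 10.0), (1.5, 11.0), (0.5, 9.0),
        (5.0, 10.5), (5.5, 9.5), (4.5, 10.0),
        (5.25, 20.0), (4.75, 21.0), (5.0, 19.0)]

centers = kmeans(data, k=3)
cuts_x = cut_points(centers, 0)   # cut points for the first variable
cuts_y = cut_points(centers, 1)   # cut points for the second variable
binned = [(discretize(x, cuts_x), discretize(y, cuts_y)) for x, y in data]
```

Because the clusters are found in the joint space, the cut points for each variable depend on how that variable groups together with the other one. A per-variable method such as equal-width binning would pick the cuts for each variable in isolation.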