A clustering-based discretization for supervised learning

Cited by: 39
Authors
Gupta, Ankit [2]
Mehrotra, Kishan G. [1]
Mohan, Chilukuri [1]
Affiliations
[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Ctr Sci & Technol 4 106, Syracuse, NY 13244 USA
[2] Indian Inst Technol, Dept Elect Engn, Kanpur 208016, Uttar Pradesh, India
Keywords
Discretization; Clustering; Binning; Supervised learning
DOI
10.1016/j.spl.2010.01.015
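Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714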
Abstract
We address the problem of discretization of continuous variables for machine learning classification algorithms. Existing procedures do not use interdependence between the variables towards this goal. Our proposed method uses clustering to exploit such interdependence. Numerical results show that this improves the classification performance in almost all cases. Even if an existing algorithm can successfully operate with continuous variables, better performance is obtained if the variables are first discretized. An additional advantage of discretization is that it reduces the overall computation time. © 2010 Elsevier B.V. All rights reserved.
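The abstract does not spell out the procedure, so the following is only an illustrative sketch of the general idea of letting a joint clustering of all variables drive per-variable bin boundaries. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the use of plain k-means as the clustering step, the rule of placing a cut point midway between neighbouring values that fall in different clusters, and the hypothetical names `kmeans_labels`, `cut_points`, and `discretize`.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


def kmeans_labels(points: Sequence[Point], k: int, max_iter: int = 100) -> List[int]:
    """Plain Lloyd's k-means with deterministic, evenly spaced initial centers."""
    n = len(points)
    centers = [list(points[(i * n) // k]) for i in range(k)]
    labels: Optional[List[int]] = None
    for _ in range(max_iter):
        # Assign every point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Move each center to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels or []


def cut_points(values: Sequence[float], labels: Sequence[int]) -> List[float]:
    """Assumed rule: cut midway between adjacent distinct values in different clusters."""
    pairs = sorted(zip(values, labels))
    cuts = []
    for (v1, l1), (v2, l2) in zip(pairs, pairs[1:]):
        if v1 != v2 and l1 != l2:
            cuts.append((v1 + v2) / 2.0)
    return cuts


def discretize(points: Sequence[Point], k: int) -> Tuple[List[List[int]], List[List[float]]]:
    """Cluster all variables jointly, then bin each variable using its own cut points."""
    labels = kmeans_labels(points, k)
    columns = list(zip(*points))
    all_cuts = [cut_points(col, labels) for col in columns]
    binned = [[bisect_right(cuts, v) for v, cuts in zip(p, all_cuts)] for p in points]
    return binned, all_cuts


# Toy data (made up for this sketch): two well-separated groups in two variables.
data = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.9), (4.9, 5.0)]
binned, cuts = discretize(data, k=2)
# The first three rows land in bin 0 on both variables, the last three in bin 1.
```

Because the clusters are formed on all variables at once, the boundaries chosen for one variable depend on the values of the others, which is the kind of interdependence the abstract refers to. A real implementation would also have to choose the number of clusters and decide how to use the class labels, neither of which this sketch addresses.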
Pages: 816-824
Page count: 9