DECISION TREE WITH BETTER CLASS PROBABILITY ESTIMATION

Cited by: 12
Authors
Jiang, Liangxiao [1]
Li, Chaoqun [2]
Cai, Zhihua [3]
Affiliations
[1] China Univ Geosci, Fac Comp Sci, Wuhan 430074, Hubei, Peoples R China
[2] China Univ Geosci, Fac Math, Wuhan 430074, Hubei, Peoples R China
[3] China Univ Geosci, Fac Comp Sci, Wuhan 430074, Hubei, Peoples R China
Keywords
C4.4; locally weighted C4.4; class probability estimation; locally weighted learning; conditional log likelihood; AUC; classification; ROC CURVE; AREA
DOI
10.1142/S0218001409007296
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Traditionally, the performance of a classifier is measured by its classification accuracy or error rate. In fact, probability-based classifiers also produce a class probability estimate (the probability that a test instance belongs to the predicted class). This information is often ignored in classification, as long as the class with the highest estimated probability is identical to the actual class. In many data mining applications, however, classification accuracy and error rate are not enough. For example, in direct marketing, we often need to deploy different promotion strategies to customers with different likelihoods (class probabilities) of buying some products. Thus, accurate class probability estimates are often required to make optimal decisions. In this paper, we first review some state-of-the-art probability-based classifiers and empirically investigate their class probability estimation performance. From our experimental results, we draw one conclusion: C4.4 is an attractive algorithm for class probability estimation. Then, we present a locally weighted version of C4.4 that improves its class probability estimation performance by combining locally weighted learning with C4.4. We call our improved algorithm locally weighted C4.4, or simply LWC4.4. We experimentally test LWC4.4 on all 36 UCI data sets selected by Weka. The experimental results show that LWC4.4 significantly outperforms C4.4 in terms of class probability estimation.
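The abstract names conditional log likelihood as a measure of class probability estimation. As a rough illustration only, the Python sketch below smooths the class counts at a single tree leaf with the Laplace correction (C4.4 is commonly described as an unpruned C4.5 tree with Laplace-corrected leaf estimates, which is an assumption here rather than something this record states) and then scores the estimates by conditional log likelihood. The function names, the leaf counts, and the two test labels are hypothetical and are not taken from the paper.

```python
import math


def laplace_estimate(class_counts):
    """Laplace-smoothed class probabilities from raw class counts at a leaf.

    Assumption: p(c) = (n_c + 1) / (n + number_of_classes).
    """
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    return {c: (n + 1) / (total + num_classes) for c, n in class_counts.items()}


def conditional_log_likelihood(prob_rows, true_labels):
    """Sum of log P(true class | instance) over a set of test instances."""
    return sum(math.log(probs[label]) for probs, label in zip(prob_rows, true_labels))


# Hypothetical leaf holding three training instances, all of class "buy".
leaf_counts = {"buy": 3, "no_buy": 0}

# Raw frequencies would give P(no_buy) = 0, so a single "no_buy" test
# instance reaching this leaf would make the log likelihood undefined.
smoothed = laplace_estimate(leaf_counts)

# Two hypothetical test instances that reach this leaf.
cll = conditional_log_likelihood([smoothed, smoothed], ["buy", "no_buy"])
print(smoothed, cll)
```

A higher (less negative) conditional log likelihood indicates better-calibrated probability estimates; the smoothing keeps every class probability strictly positive so the score stays finite.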
Pages: 745-763
Page count: 19