DECISION TREE WITH BETTER CLASS PROBABILITY ESTIMATION

Cited by: 12
Authors
Jiang, Liangxiao [1]
Li, Chaoqun [2]
Cai, Zhihua [3]
Affiliations
[1] China Univ Geosci, Fac Comp Sci, Wuhan 430074, Hubei, Peoples R China
[2] China Univ Geosci, Fac Math, Wuhan 430074, Hubei, Peoples R China
[3] China Univ Geosci, Fac Comp Sci, Wuhan 430074, Hubei, Peoples R China
Keywords
C4.4; locally weighted C4.4; class probability estimation; locally weighted learning; conditional log likelihood; AUC; classification; ROC CURVE; AREA
DOI
10.1142/S0218001409007296
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Traditionally, the performance of a classifier is measured by its classification accuracy or error rate. Probability-based classifiers, however, also produce a class probability estimate (the estimated probability that a test instance belongs to the predicted class). This information is often ignored in classification, as long as the class with the highest estimated probability is identical to the actual class. In many data mining applications, classification accuracy and error rate are not enough. For example, in direct marketing, we often need to deploy different promotion strategies to customers with different likelihoods (class probabilities) of buying a product. Thus, accurate class probability estimates are often required to make optimal decisions. In this paper, we first review some state-of-the-art probability-based classifiers and empirically investigate their class probability estimation performance. Our experimental results lead to one conclusion: C4.4 is an attractive algorithm for class probability estimation. We then present a locally weighted version of C4.4 that improves its class probability estimation performance by combining locally weighted learning with C4.4. We call the improved algorithm locally weighted C4.4, or simply LWC4.4. We experimentally test LWC4.4 on all 36 UCI data sets selected by Weka. The experimental results show that LWC4.4 significantly outperforms C4.4 in terms of class probability estimation.
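The point that accuracy alone can hide differences in probability quality can be illustrated with a small sketch. The code below is not taken from the paper: the function names, the toy data, and the two hypothetical classifiers A and B are assumptions made for illustration only. It computes accuracy, conditional log likelihood (CLL), and two-class AUC, the last two being measures named in the keywords above.

```python
import math

def accuracy(probs, labels):
    """Fraction of instances whose highest-probability class is the true class."""
    hits = sum(max(range(len(p)), key=p.__getitem__) == y
               for p, y in zip(probs, labels))
    return hits / len(labels)

def conditional_log_likelihood(probs, labels):
    """Sum of log P(true class | instance); higher (closer to 0) is better."""
    eps = 1e-12  # guard against log(0); an assumption of this sketch
    return sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))

def auc(scores, labels):
    """Two-class AUC: probability that a random positive outranks a random
    negative, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): true classes and [P(class 0), P(class 1)] estimates
labels = [1, 0, 1, 0]
probs_a = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.6], [0.7, 0.3]]          # classifier A
probs_b = [[0.45, 0.55], [0.55, 0.45], [0.45, 0.55], [0.55, 0.45]]  # classifier B

acc_a, acc_b = accuracy(probs_a, labels), accuracy(probs_b, labels)
cll_a = conditional_log_likelihood(probs_a, labels)
cll_b = conditional_log_likelihood(probs_b, labels)
auc_a = auc([p[1] for p in probs_a], labels)
auc_b = auc([p[1] for p in probs_b], labels)

print("accuracy:", acc_a, acc_b)
print("CLL:", cll_a, cll_b)
print("AUC:", auc_a, auc_b)
```

In this toy case both classifiers predict every label correctly, so accuracy cannot separate them, while CLL favours the classifier whose estimates are closer to the true classes. How the paper itself computes these measures is not specified in this record.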
Pages: 745-763
Page count: 19
Related papers
50 in total
  • [21] Learning naive Bayes Tree for conditional probability estimation
    Liang, Han
    Yan, Yuhong
    ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4013 : 455 - 466
  • [22] The lattice class of decision functions on a bounded probability space
    Karkishchenko, AN
    AUTOMATION AND REMOTE CONTROL, 1996, 57 (04) : 529 - 536
  • [23] Customer's class transformation for profit maximization in multi-class setting of Telecom industry using probability estimation decision trees
    Muneiah, Janapati Naga
    Rao, Ch D. V. Subba
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 37 (06) : 8167 - 8197
  • [24] ESTIMATION OF TOXIC HAZARD - DECISION TREE APPROACH
    CRAMER, GM
    FORD, RA
    HALL, RL
    FOOD AND COSMETICS TOXICOLOGY, 1978, 16 (03) : 255 - 276
  • [25] Stochastic decision tree acceptability analysis with uncertain state probability
    Song, Shiling
    Xia, Qiong
    Yang, Feng
    Zhang, Xiaoqi
    JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY, 2023, 74 (03) : 944 - 955
  • [26] One-Class Classification by Combining Density and Class Probability Estimation
    Hempstalk, Kathryn
    Frank, Eibe
    Witten, Ian H.
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PART I, PROCEEDINGS, 2008, 5211 : 505 - 519
  • [27] A Better Decision Tree: The Max-Cut Decision Tree with Modified PCA Improves Accuracy and Running Time
    Bodine J.
    Hochbaum D.S.
    SN Computer Science, 3 (4)
  • [28] The fault probability estimation and decision reliability improvement in WSNs
    Hsu, Ming-Tsung
    Lin, Frank Yeong-Sung
    Chang, Yue-Shan
    Juang, Tong-Ying
    TENCON 2007 - 2007 IEEE REGION 10 CONFERENCE, VOLS 1-3, 2007, : 696 - +
  • [29] Estimation of Failure Probability with Dependence in Fault Tree Analysis Based on Interval Probability Theory
    Zhang Xinfeng
    Zhao Yan
    Wang Shengchang
    2011 3RD WORLD CONGRESS IN APPLIED COMPUTING, COMPUTER SCIENCE, AND COMPUTER ENGINEERING (ACC 2011), VOL 1, 2011, 1 : 602 - +
  • [30] An Ensemble of Optimal Trees for Class Membership Probability Estimation
    Khan, Zardad
    Gul, Asma
    Mahmoud, Osama
    Miftahuddin, Miftahuddin
    Perperoglou, Aris
    Adler, Werner
    Lausen, Berthold
    ANALYSIS OF LARGE AND COMPLEX DATA, 2016, : 395 - 409