DECISION TREE WITH BETTER CLASS PROBABILITY ESTIMATION

Cited by: 12
Authors
Jiang, Liangxiao [1]
Li, Chaoqun [2]
Cai, Zhihua [3]
Affiliations
[1] China Univ Geosci, Fac Comp Sci, Wuhan 430074, Hubei, Peoples R China
[2] China Univ Geosci, Fac Math, Wuhan 430074, Hubei, Peoples R China
[3] China Univ Geosci, Fac Comp Sci, Wuhan 430074, Hubei, Peoples R China
Keywords
C4.4; locally weighted C4.4; class probability estimation; locally weighted learning; conditional log likelihood; AUC; classification; ROC CURVE; AREA
DOI
10.1142/S0218001409007296
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Traditionally, the performance of a classifier is measured by its classification accuracy or error rate. In fact, probability-based classifiers also produce a class probability estimate (the probability that a test instance belongs to the predicted class). This information is often ignored in classification, as long as the class with the highest estimated probability is identical to the actual class. In many data mining applications, however, classification accuracy and error rate are not enough. For example, in direct marketing, we often need to deploy different promotion strategies to customers with different likelihoods (class probabilities) of buying some products. Thus, accurate class probability estimates are often required to make optimal decisions. In this paper, we first review some state-of-the-art probability-based classifiers and empirically investigate their class probability estimation performance. Our experimental results lead to one conclusion: C4.4 is an attractive algorithm for class probability estimation. We then present a locally weighted version of C4.4 that improves its class probability estimation performance by combining locally weighted learning with C4.4. We call the improved algorithm locally weighted C4.4, or simply LWC4.4. We test LWC4.4 experimentally on all 36 UCI data sets selected by Weka. The experimental results show that LWC4.4 significantly outperforms C4.4 in terms of class probability estimation.
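The abstract does not spell out how a tree leaf turns class counts into a probability, or how conditional log likelihood is computed. The following is a minimal Python sketch under stated assumptions: it uses the Laplace correction that C4.4 is commonly described as applying at its leaves, and sums the log of the probability assigned to each true class. The function names `laplace_probabilities` and `conditional_log_likelihood` and the toy counts are hypothetical illustrations, not taken from the paper.

```python
import math
from typing import Dict, List, Sequence


def laplace_probabilities(
    leaf_counts: Dict[str, float], classes: Sequence[str]
) -> Dict[str, float]:
    """Laplace-corrected class probabilities for one leaf (assumed form).

    Counts may be fractional, so instance weights (for example from a
    locally weighted scheme) could be passed instead of raw counts.
    """
    total = sum(leaf_counts.get(c, 0.0) for c in classes)
    k = len(classes)
    # Add one pseudo-count per class so that no class gets probability 0.
    return {c: (leaf_counts.get(c, 0.0) + 1.0) / (total + k) for c in classes}


def conditional_log_likelihood(
    predictions: List[Dict[str, float]], true_labels: Sequence[str]
) -> float:
    """Sum of log P(true class | instance); higher (closer to 0) is better."""
    return sum(math.log(p[y]) for p, y in zip(predictions, true_labels))


# Toy leaf: three training instances, all of class "buy" (made-up data).
classes = ["buy", "no_buy"]
probs = laplace_probabilities({"buy": 3}, classes)

# Score the leaf's estimate against two test instances that fall into it.
cll = conditional_log_likelihood([probs, probs], ["buy", "no_buy"])
```

With raw frequencies the pure leaf above would assign probability 0 to `no_buy`, which makes the log likelihood undefined for the second test instance; the smoothed estimate keeps it finite.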
Pages: 745-763
Number of pages: 19
Related papers
50 in total
  • [31] Boosted classification trees and class probability/quantile estimation
    Mease, David
    Wyner, Abraham J.
    Buja, Andreas
    JOURNAL OF MACHINE LEARNING RESEARCH, 2007, 8 : 409 - 439
  • [32] A Modular Decision-Tree Architecture for Better Problem Understanding
    Khare, Vineet R.
    Subramania, Halasya Siva
    SIMULATED EVOLUTION AND LEARNING, 2010, 6457 : 647 - 656
  • [33] A Better Prediction for Higher Education Performance using the Decision Tree
    Hilal, Anwer Mustafa Mohamedsalih
    Zamani, Abu Sarwar
    Farid, Muhammad Shahid Ghulam
    Rizwanullah, Mohammed
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2021, 21 (04): : 209 - 213
  • [34] Attribute selection for decision tree learning with class constraint
    Sun, Huaining
    Hu, Xuegang
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2017, 163 : 16 - 23
  • [35] NSVM Decision Tree for Multi-class Classification
    Yao Yong
    Zhao Hui
    Liu Zhijing
    CHINESE JOURNAL OF ELECTRONICS, 2008, 17 (04): : 627 - 629
  • [36] Class-oriented reduction of decision tree complexity
    Polo, Jose-Luis
    Berzal, Fernando
    Cubero, Juan-Carlos
    FOUNDATIONS OF INTELLIGENT SYSTEMS, PROCEEDINGS, 2008, 4994 : 48 - 57
  • [37] Improving tree probability estimation with stochastic optimization and variance reduction
    Xie, Tianyu
    Yuan, Musu
    Deng, Minghua
    Zhang, Cheng
    STATISTICS AND COMPUTING, 2024, 34 (06)
  • [38] Keep the Decision Tree and Estimate the Class Probabilities Using its Decision Boundary
    Alvarez, Isabelle
    Bernard, Stephan
    Deffuant, Guillaume
    20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 654 - 659
  • [39] Predicting the probability of mortality of gastric cancer patients using decision tree
    Mohammadzadeh, F.
    Noorkojuri, H.
    Pourhoseingholi, M. A.
    Saadat, S.
    Baghestani, A. R.
    IRISH JOURNAL OF MEDICAL SCIENCE, 2015, 184 (02) : 277 - 284
  • [40] Predicting the probability of mortality of gastric cancer patients using decision tree
    F. Mohammadzadeh
    H. Noorkojuri
    M. A. Pourhoseingholi
    S. Saadat
    A. R. Baghestani
    Irish Journal of Medical Science (1971 -), 2015, 184 : 277 - 284