Partially Bayesian variable selection in classification trees

Cited by: 0
Authors
Noe, Douglas A. [1]
He, Xuming [2]
Affiliations
[1] Miami Univ, Dept Math & Stat, Oxford, OH 45056 USA
[2] Univ Illinois, Dept Stat, Champaign, IL 61820 USA
Funding
National Science Foundation (USA);
Keywords
Feature selection; Expert opinion; Supervised learning;
DOI
Not available
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07; 0710; 09;
Abstract
Tree-structured models for classification may be split into two broad categories: those that are completely data-driven and those that allow some direct user interaction during model construction. Classifiers such as CART [3] and QUEST [11] are members of the first category. In those data-driven algorithms, all predictor variables compete equally for a particular classification task. However, in many cases a subject-area expert is likely to have some qualitative notion about the predictors' relative importance. Interactive algorithms such as RTREE [17] address this issue by allowing users to select variables at various stages of tree construction. In this paper, we introduce a more formal partially Bayesian procedure for dynamically incorporating qualitative expert opinions in the construction of classification trees. An algorithm that dynamically incorporates expert opinion in this way has two potential advantages, each improving with the quality of the expert. First, by de-emphasizing certain subsets of variables during the estimation process, machine-based computational activity can be reduced. Second, by giving an expert's preferred variables priority, we reduce the chance that a spurious variable will appear in the model. Hence, our resulting models are potentially more interpretable and more stable than those generated by purely data-driven algorithms.
Pages: 155-167
Number of pages: 13