Partially Bayesian variable selection in classification trees

Cited: 0
Authors
Noe, Douglas A. [1 ]
He, Xuming [2 ]
Affiliations
[1] Miami Univ, Dept Math & Stat, Oxford, OH 45056 USA
[2] Univ Illinois, Dept Stat, Champaign, IL 61820 USA
Funding
U.S. National Science Foundation
Keywords
Feature selection; Expert opinion; Supervised learning;
DOI
Not available
Chinese Library Classification
Q [Biological Sciences]
Discipline Classification Codes
07; 0710; 09
Abstract
Tree-structured models for classification fall into two broad categories: those that are completely data-driven and those that allow direct user interaction during model construction. Classifiers such as CART [3] and QUEST [11] belong to the first category: in these data-driven algorithms, all predictor variables compete equally for a given classification task. In many cases, however, a subject-area expert is likely to have some qualitative notion of their relative importance. Interactive algorithms such as RTREE [17] address this issue by allowing users to select variables at various stages of tree construction. In this paper, we introduce a more formal partially Bayesian procedure for dynamically incorporating qualitative expert opinion into the construction of classification trees. An algorithm that incorporates expert opinion in this way has two potential advantages, each growing with the quality of the expert. First, by de-emphasizing certain subsets of variables during the estimation process, it reduces machine-based computational effort. Second, by giving the expert's preferred variables priority, it reduces the chance that a spurious variable appears in the model. The resulting models are therefore potentially more interpretable and more stable than those generated by purely data-driven algorithms.
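To make the idea concrete, here is a minimal sketch of how expert opinion could bias split selection in a classification tree. The weighting scheme below (multiplying each feature's Gini impurity reduction by an expert-supplied prior weight) is an illustrative assumption, not the paper's actual partially Bayesian procedure; the function names and weights are hypothetical.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label array."""
    if len(y) == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y, expert_weights):
    """Choose the (feature, threshold) pair whose impurity reduction,
    scaled by the expert's prior weight on that feature, is largest.
    This scaling is an illustrative stand-in for formally incorporating
    expert opinion; it is not the procedure developed in the paper."""
    n, d = X.shape
    base = gini(y)
    best_j, best_t, best_score = None, None, -np.inf
    for j in range(d):
        for t in np.unique(X[:, j])[:-1]:
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            gain = base - (len(left) * gini(left) + len(right) * gini(right)) / n
            score = expert_weights[j] * gain  # expert opinion biases the search
            if score > best_score:
                best_j, best_t, best_score = j, t, score
    return best_j, best_t, best_score

# Toy data: features 0 and 1 both separate the classes perfectly,
# but the (hypothetical) expert strongly favours feature 0.
X = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [1.1, 1.1]])
y = np.array([0, 0, 1, 1])
j, t, score = best_split(X, y, expert_weights=[0.9, 0.1])
print(j)  # feature 0 is chosen despite equal impurity reduction
```

Here both features yield the same impurity reduction (0.5), so a purely data-driven search would break the tie arbitrarily; the expert weight resolves it in favour of the preferred variable, mirroring the abstract's point that prioritizing an expert's variables can keep spurious competitors out of the model.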
Pages: 155-167 (13 pages)