Tree-structured modelling of categorical predictors in generalized additive regression

Cited by: 3
Authors
Tutz, Gerhard [1]
Berger, Moritz [2]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Akad Str 1, D-80799 Munich, Germany
[2] Univ Klinikum Bonn, IMBIE, Sigmund Freud Str 25, D-53105 Bonn, Germany
Keywords
Categorical predictors; Tree-structured clustering; Recursive partitioning; Partially linear tree-based regression; 62J12; 62J02; VARIABLE IMPORTANCE; CLASSIFICATION; LIKELIHOOD; SELECTION;
DOI
10.1007/s11634-017-0298-6
Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract
Generalized linear and additive models are very efficient regression tools, but many parameters have to be estimated if categorical predictors with many categories are included. The method proposed here focusses on the main effects of categorical predictors by using tree-type methods to obtain clusters of categories. When the predictor has many categories, one wants to know in particular which of the categories have to be distinguished with respect to their effect on the response. The tree-structured approach makes it possible to detect clusters of categories that share the same effect while letting other predictors, in particular metric predictors, have a linear or additive effect on the response. An algorithm for the fitting is proposed and various stopping criteria are evaluated. The preferred stopping criterion is based on p values representing a conditional inference procedure. In addition, the stability of clusters is investigated, and the relevance of predictors is assessed by bootstrap methods. Several applications show the usefulness of the tree-structured approach, and small simulation studies demonstrate that the fitting procedure works well.
Pages: 737-758
Number of pages: 22