Clusterwise Regression Using Dirichlet Mixtures

Cited by: 0
Authors
Kang, Changku [1]
Ghosal, Subhashis [2]
Affiliations
[1] Bank Korea, Econ Stat Dept, 110,3 Ga, Seoul, South Korea
[2] North Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
Keywords
DOI
Not available
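Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code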
020208 ; 070103 ; 0714 ;
Abstract
The article describes a method for estimating a nonparametric regression function through Bayesian clustering. The basic working assumption is that the population is a union of several hidden subpopulations, in each of which a different linear regression is in force, and that the overall nonlinear regression function arises as a superposition of these linear regression functions. A Bayesian clustering technique based on a Dirichlet mixture process is used to identify clusters that correspond to samples from these hidden subpopulations. The clusters are formed automatically within a Markov chain Monte Carlo scheme arising from a Dirichlet mixture process prior for the density of the regressor variable. The number of components in the mixing distribution is thus treated as unknown, allowing considerable flexibility in modeling. Within each cluster, we estimate model parameters by the standard least squares method or some of its variations. Automatic model averaging takes care of the uncertainty in classifying a new observation to the obtained clusters. As opposed to most commonly used nonparametric regression estimates, which break up the sample locally, our method splits the sample into a number of subgroups that does not depend on the dimension of the regressor variable. Thus our method avoids the curse of dimensionality. Through extensive simulations, we compare the performance of our proposed method with that of commonly used nonparametric regression techniques. We conclude that when the model assumption holds and the subpopulations are not highly overlapping, our method has smaller estimation error, particularly if the dimension is relatively large.
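The steps named in the abstract (cluster on the regressor, fit least squares within each cluster, average predictions over cluster memberships) can be illustrated with a minimal sketch. This is not the authors' sampler: it assumes a one-dimensional regressor, a Gaussian kernel with known variance `s2`, a conjugate normal base measure with mean `m0` and variance `t2`, and a collapsed Gibbs sampler over Chinese-restaurant-process labels. All function names, default values, and the demo data are hypothetical, and the prediction step ignores the possibility that a new observation opens a new cluster.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def cluster_weights(x, clusters, xs, alpha, s2, m0, t2):
    """Unnormalised weights for x joining each existing cluster; the last entry is a new cluster."""
    w = []
    for members in clusters:
        n = len(members)
        v = 1.0 / (1.0 / t2 + n / s2)                       # posterior variance of the cluster mean
        m = v * (m0 / t2 + sum(xs[j] for j in members) / s2)  # posterior mean of the cluster mean
        w.append(n * normal_pdf(x, m, v + s2))              # cluster size times predictive density
    w.append(alpha * normal_pdf(x, m0, t2 + s2))            # prior predictive for a new cluster
    return w

def group(z):
    """Return sorted labels and the member indices of each cluster (unassigned points skipped)."""
    labels = sorted({lab for lab in z if lab is not None})
    return labels, [[j for j, lab in enumerate(z) if lab == k] for k in labels]

def gibbs_sweep(xs, z, alpha, s2, m0, t2, rng):
    """One collapsed Gibbs sweep: reassign every point given all the others."""
    for i, x in enumerate(xs):
        z[i] = None
        labels, clusters = group(z)
        w = cluster_weights(x, clusters, xs, alpha, s2, m0, t2)
        k = rng.choices(range(len(w)), weights=w)[0]
        z[i] = labels[k] if k < len(labels) else max(labels, default=-1) + 1

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; a constant fit if x has no spread."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def predict(x_new, xs, ys, z, alpha, s2, m0, t2):
    """Average the per-cluster fitted lines, weighted by membership weights of x_new."""
    _, clusters = group(z)
    w = cluster_weights(x_new, clusters, xs, alpha, s2, m0, t2)[:-1]  # drop the new-cluster term
    total = sum(w)
    est = 0.0
    for wk, members in zip(w, clusters):
        a, b = fit_line([xs[j] for j in members], [ys[j] for j in members])
        est += wk / total * (a + b * x_new)
    return est

def clusterwise_predict(xs, ys, x_new, sweeps=60, burn_in=30,
                        alpha=1.0, s2=0.25, m0=0.0, t2=25.0, seed=0):
    """Average predictions over the sweeps retained after burn-in."""
    rng = random.Random(seed)
    z = [0] * len(xs)  # start with every point in one cluster
    preds = []
    for s in range(sweeps):
        gibbs_sweep(xs, z, alpha, s2, m0, t2, rng)
        if s >= burn_in:
            preds.append(predict(x_new, xs, ys, z, alpha, s2, m0, t2))
    return sum(preds) / len(preds)

def make_demo_data(seed=1):
    """Hypothetical data: two subpopulations with different noise-free linear regressions."""
    rng = random.Random(seed)
    xs = [rng.gauss(-2, 0.3) for _ in range(30)] + [rng.gauss(2, 0.3) for _ in range(30)]
    ys = [1 + 2 * x for x in xs[:30]] + [5 - x for x in xs[30:]]
    return xs, ys

xs, ys = make_demo_data()
print(clusterwise_predict(xs, ys, x_new=-2.0))
print(clusterwise_predict(xs, ys, x_new=2.0))
```

In the demo data the two subpopulations follow y = 1 + 2x and y = 5 - x, so the two printed predictions should land near the corresponding lines when the sampler separates the groups. The number of clusters is never fixed in advance; it changes from sweep to sweep as points open or abandon clusters.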
Pages: 305 / +
Page count: 3
Related papers
50 in total
  • [1] Pavement condition prediction using clusterwise regression
    Luo, Zairen
    Chou, Eddie Y. J.
    [J]. PAVE MANAGEMENT; MONITORING, EVALUATION, AND DATA STORAGE; AND ACCELERATED TESTING 2006, 2006, (1974): : 70 - 77
  • [2] PM10 forecasting using clusterwise regression
    Poggi, Jean-Michel
    Portier, Bruno
    [J]. ATMOSPHERIC ENVIRONMENT, 2011, 45 (38) : 7005 - 7014
  • [3] Constrained clusterwise linear regression
    Plaia, A
    [J]. New Developments in Classification and Data Analysis, 2005, : 79 - 86
  • [4] SOMwise regression: a new clusterwise regression method
    Jorge Muruzábal
    Diego Vidaurre
    Julián Sánchez
    [J]. Neural Computing and Applications, 2012, 21 : 1229 - 1241
  • [5] CLUSTERWISE LINEAR-REGRESSION
    SPATH, H
    [J]. COMPUTING, 1979, 22 (04) : 367 - 373
  • [6] SOMwise regression: a new clusterwise regression method
    Muruzabal, Jorge
    Vidaurre, Diego
    Sanchez, Julian
    [J]. NEURAL COMPUTING & APPLICATIONS, 2012, 21 (06): : 1229 - 1241
  • [7] Seemingly unrelated clusterwise linear regression
    Galimberti, Giuliano
    Soffritti, Gabriele
    [J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2020, 14 (02) : 235 - 260
  • [8] Clusterwise functional linear regression models
    Li, Ting
    Song, Xinyuan
    Zhang, Yingying
    Zhu, Hongtu
    Zhu, Zhongyi
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2021, 158
  • [9] Cluster analysis of microbiome data by using mixtures of Dirichlet-multinomial regression models
    Subedi, Sanjeena
    Neish, Drew
    Bak, Stephen
    Feng, Zeny
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2020, 69 (05) : 1163 - 1187
  • [10] Identifiability of models for clusterwise linear regression
    Hennig, C
    [J]. JOURNAL OF CLASSIFICATION, 2000, 17 (02) : 273 - 296