High-Dimensional Bayesian Clustering with Variable Selection: The R Package bclust

Cited by: 0
Authors
Nia, Vahid Partovi [1]
Davison, Anthony C. [2]
Affiliations
[1] Ecole Polytech, Dept Math & Ind Engn, Montreal, PQ H3T 1J4, Canada
[2] Ecole Polytech Fed Lausanne, EPFL FSB MATHAA STAT, CH-1015 Lausanne, Switzerland
Source
JOURNAL OF STATISTICAL SOFTWARE | 2012 / Volume 47 / Issue 05
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
agglomerative clustering; Bayesian clustering; Bayesian variable selection; dendrogram; hierarchical clustering; R; spike-and-slab model; GEOMETRIC REPRESENTATION; MIXTURE MODEL; EXPRESSION;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The R package bclust is useful for clustering high-dimensional continuous data. The package uses a parametric spike-and-slab Bayesian model to downweight the effect of noise variables and to quantify the importance of each variable in agglomerative clustering. We take advantage of the existence of closed-form marginal distributions to estimate the model hyper-parameters using empirical Bayes, thereby yielding a fully automatic method. We discuss computational problems arising in implementation of the procedure and illustrate the usefulness of the package through examples.
Pages: 1 - 22
Page count: 22
Related papers
50 in total
  • [1] Bayesian variable selection in clustering high-dimensional data
    Tadesse, MG
    Sha, N
    Vannucci, M
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2005, 100 (470) : 602 - 617
  • [2] Bayesian Variable Selection in Clustering High-Dimensional Data With Substructure
    Swartz, Michael D.
    Mo, Qianxing
    Murphy, Mary E.
    Lupton, Joanne R.
    Turner, Nancy D.
    Hong, Mee Young
    Vannucci, Marina
    [J]. JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS, 2008, 13 (04) : 407 - 423
  • [3] Bayesian variable selection in clustering high-dimensional data with substructure
    Michael D. Swartz
    Qianxing Mo
    Mary E. Murphy
    Joanne R. Lupton
    Nancy D. Turner
    Mee Young Hong
    Marina Vannucci
    [J]. Journal of Agricultural, Biological, and Environmental Statistics, 2008, 13 : 407 - 423
  • [4] BayesSUR: An R Package for High-Dimensional Multivariate Bayesian Variable and Covariance Selection in Linear Regression
    Zhao, Zhi
    Banterle, Marco
    Bottolo, Leonardo
    Richardson, Sylvia
    Lewin, Alex
    Zucknick, Manuela
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2021, 100 (11) : 1 - 32
  • [5] Variable Clustering in High-Dimensional Linear Regression: The R Package clere
    Yengo, Loic
    Jacques, Julien
    Biernacki, Christophe
    Canouil, Mickael
    [J]. R JOURNAL, 2016, 8 (01) : 92 - 106
  • [6] Bayesian variable selection in clustering high-dimensional data via a mixture of finite mixtures
    Doo, Woojin
    Kim, Heeyoung
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2021, 91 (12) : 2551 - 2568
  • [7] Bayesian variable selection for high-dimensional rank data
    Cui, Can
    Singh, Susheela P.
    Staicu, Ana-Maria
    Reich, Brian J.
    [J]. ENVIRONMETRICS, 2021, 32 (07)
  • [8] ON THE COMPUTATIONAL COMPLEXITY OF HIGH-DIMENSIONAL BAYESIAN VARIABLE SELECTION
    Yang, Yun
    Wainwright, Martin J.
    Jordan, Michael I.
    [J]. ANNALS OF STATISTICS, 2016, 44 (06) : 2497 - 2532
  • [9] Springer: An R package for bi-level variable selection of high-dimensional longitudinal data
    Zhou, Fei
    Liu, Yuwen
    Ren, Jie
    Wang, Weiqun
    Wu, Cen
    [J]. FRONTIERS IN GENETICS, 2023, 14
  • [10] Scalable Bayesian variable selection for structured high-dimensional data
    Chang, Changgee
    Kundu, Suprateek
    Long, Qi
    [J]. BIOMETRICS, 2018, 74 (04) : 1372 - 1382