Bayesian variable selection in clustering high-dimensional data with substructure

Cited by: 0
Authors
Michael D. Swartz
Qianxing Mo
Mary E. Murphy
Joanne R. Lupton
Nancy D. Turner
Mee Young Hong
Marina Vannucci
Affiliations
[1] M.D. Anderson Cancer Center, Department of Epidemiology
[2] Memorial Sloan-Kettering Cancer Center, Department of Epidemiology and Biostatistics
[3] Texas A&M University, Nutrition and Food Science Department
[4] Texas A&M University, School of Exercise and Nutritional Sciences
[5] San Diego State University, Department of Statistics
[6] Rice University
Keywords
Bayesian inference; Designed experiments; Microarray analysis
DOI
Not available
Abstract
In this article we focus on clustering techniques recently proposed for high-dimensional data that incorporate variable selection and extend them to the modeling of data with a known substructure, such as the structure imposed by an experimental design. Our method essentially approximates the within-group covariance by facilitating clustering without disrupting the groups defined by the experimenter. The method we adopt simultaneously determines which expression patterns are important, and which genes contribute to such patterns. We evaluate performance on simulated data and on microarray data from a colon carcinogenesis study. Selected genes are biologically consistent with current research and provide strong biological validation of the cluster configuration identified by the method.
Pages: 407-423
Number of pages: 16