High-dimensional genomic feature selection with the ordered stereotype logit model

Cited by: 3
Authors
Seffernick, Anna Eames [1]
Mrozek, Krzysztof [2]
Nicolet, Deedra [2]
Stone, Richard M. [3]
Eisfeld, Ann-Kathrin [4,5]
Byrd, John C. [6]
Archer, Kellie J. [7]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
[2] Ohio State Univ, Clara D Bloomfield Ctr, Leukemia Outcomes Res, Comprehens Canc Ctr, Columbus, OH 43210 USA
[3] Dana Farber Canc Inst, Adult Acute Leukemia Program, Boston, MA 02115 USA
[4] Ohio State Comprehens Canc Ctr, Div Hematol, Columbus, OH USA
[5] Clara D Bloomfield Ctr Leukemia Outcomes Res, Bloomfield, NJ USA
[6] Univ Cincinnati, Coll Med, Dept Internal Med, Cincinnati, OH 45221 USA
[7] Ohio State Univ, Div Biostat, Columbus, OH 43210 USA
Funding
National Institutes of Health (NIH), USA;
Keywords
hierarchical model; ordinal response; variable selection; acute myeloid leukemia; ACUTE MYELOID-LEUKEMIA; VARIABLE SELECTION; BAYESIAN LASSO; REGRESSION; RECOMMENDATIONS; NORMALIZATION; ASSOCIATION; MANAGEMENT; DIAGNOSIS; SHRINKAGE;
DOI
10.1093/bib/bbac414
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
For many high-dimensional genomic and epigenomic datasets, the outcome of interest is ordinal. While these ordinal outcomes are often thought of as the observed cutpoints of some latent continuous variable, some ordinal outcomes are truly discrete and reflect a subjective combination of several factors. The nonlinear stereotype logistic model, which does not assume proportional odds, was developed for these 'assessed' ordinal variables. It has previously been extended to the frequentist high-dimensional feature selection setting, but the Bayesian framework provides some distinct advantages in terms of simultaneous uncertainty quantification and variable selection. Here, we review the stereotype model and Bayesian variable selection methods and demonstrate how to combine them to select genomic features associated with discrete ordinal outcomes. We compared the Bayesian and frequentist methods in terms of variable selection performance. We additionally applied the Bayesian stereotype method to an acute myeloid leukemia RNA-sequencing dataset to further demonstrate its variable selection abilities by identifying features associated with the European LeukemiaNet prognostic risk score.
Pages: 10