High-dimensional genomic feature selection with the ordered stereotype logit model

Cited by: 3
Authors
Seffernick, Anna Eames [1 ]
Mrozek, Krzysztof [2 ]
Nicolet, Deedra [2 ]
Stone, Richard M. [3 ]
Eisfeld, Ann-Kathrin [4 ,5 ]
Byrd, John C. [6 ]
Archer, Kellie J. [7 ]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
[2] Ohio State Univ, Clara D Bloomfield Ctr, Leukemia Outcomes Res, Comprehens Canc Ctr, Columbus, OH 43210 USA
[3] Dana Farber Canc Inst, Adult Acute Leukemia Program, Boston, MA 02115 USA
[4] Ohio State Comprehens Canc Ctr, Div Hematol, Columbus, OH USA
[5] Clara D Bloomfield Ctr Leukemia Outcomes Res, Bloomfield, NJ USA
[6] Univ Cincinnati, Coll Med, Dept Internal Med, Cincinnati, OH 45221 USA
[7] Ohio State Univ, Div Biostat, Columbus, OH 43210 USA
Funding
National Institutes of Health (USA)
Keywords
hierarchical model; ordinal response; variable selection; acute myeloid leukemia; ACUTE MYELOID-LEUKEMIA; VARIABLE SELECTION; BAYESIAN LASSO; REGRESSION; RECOMMENDATIONS; NORMALIZATION; ASSOCIATION; MANAGEMENT; DIAGNOSIS; SHRINKAGE;
DOI
10.1093/bib/bbac414
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
For many high-dimensional genomic and epigenomic datasets, the outcome of interest is ordinal. While these ordinal outcomes are often thought of as the observed cutpoints of some latent continuous variable, some ordinal outcomes are truly discrete and are composed of a subjective combination of several factors. The nonlinear stereotype logistic model, which does not assume proportional odds, was developed for these 'assessed' ordinal variables. It has previously been extended to the frequentist high-dimensional feature selection setting, but the Bayesian framework offers a distinct advantage: uncertainty quantification and variable selection are performed simultaneously. Here, we review the stereotype model and Bayesian variable selection methods and demonstrate how to combine them to select genomic features associated with discrete ordinal outcomes. We compare the Bayesian and frequentist methods in terms of variable selection performance. We additionally apply the Bayesian stereotype method to an acute myeloid leukemia RNA-sequencing dataset to further demonstrate its variable selection abilities by identifying features associated with the European LeukemiaNet prognostic risk score.
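As a rough illustration of the model the abstract refers to, the sketch below computes category probabilities under one common parameterization of the ordered stereotype logit model, in which log(P(Y=j|x) / P(Y=J|x)) = alpha_j + phi_j * (beta · x), with 1 = phi_1 >= phi_2 >= ... >= phi_J = 0 and alpha_J = 0. The abstract does not give notation, so the symbols `alpha`, `phi`, and `beta`, the sign convention, and the function name `stereotype_probs` are assumptions made here for illustration; this is not the authors' implementation and it does not include the Bayesian variable selection priors.

```python
import math


def stereotype_probs(x, beta, alpha, phi):
    """Category probabilities for an ordered stereotype logit model (sketch).

    Assumed parameterization (not taken from the abstract):
        log(P(Y=j|x) / P(Y=J|x)) = alpha[j] + phi[j] * dot(beta, x)
    with phi[0] = 1, phi[-1] = 0, phi non-increasing, and alpha[-1] = 0.
    """
    if len(x) != len(beta):
        raise ValueError("x and beta must have the same length")
    if len(alpha) != len(phi):
        raise ValueError("alpha and phi must have one entry per category")
    # Identifiability and ordering constraints on the scale parameters phi.
    ordered = all(a >= b for a, b in zip(phi, phi[1:]))
    if phi[0] != 1.0 or phi[-1] != 0.0 or not ordered:
        raise ValueError("phi must satisfy 1 = phi_1 >= ... >= phi_J = 0")
    if alpha[-1] != 0.0:
        raise ValueError("alpha for the reference (last) category must be 0")

    # A single linear predictor is shared by all categories and rescaled by
    # phi[j]; this is what distinguishes the stereotype model from a full
    # multinomial logit with a separate coefficient vector per category.
    eta = sum(b * xi for b, xi in zip(beta, x))
    scores = [a + p * eta for a, p in zip(alpha, phi)]

    # Softmax with max-subtraction for numerical stability.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-category example with two features.
probs = stereotype_probs(
    x=[1.0, -0.5],
    beta=[0.8, 0.4],
    alpha=[0.2, 0.1, 0.0],
    phi=[1.0, 0.6, 0.0],
)
print(probs)
```

Because every category shares the same `beta`, a feature whose coefficient is shrunk to zero drops out of all category comparisons at once, which is what makes this structure convenient for feature selection.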
Pages: 10