Comprehensive relative importance analysis and its applications to high dimensional gene expression data analysis

Cited by: 1
Authors
Shen, Zixin [1]
Chen, Argon [1, 2]
Affiliations
[1] Natl Taiwan Univ, Grad Inst Ind Engn, Taipei 104, Taiwan
[2] Natl Taiwan Univ, Dept Mech Engn, Taipei 104, Taiwan
Keywords
Collinearity; Feature ranking; High dimensional; Small sample size; Relative importance; Singularity; Cancer classification; Variable selection; Model selection; Regression; Predictors; Weight
DOI
10.1016/j.knosys.2020.106120
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Identifying important genes is challenging not only because gene expression data are high dimensional, but also because the expressions of genes from the same pathway are often highly correlated. A large number of feature selection methods have been proposed to select a subset of genes for interpreting and predicting certain phenotypes. Among them, L1 penalization-based methods, such as the lasso, adaptive lasso and elastic net, have received the most attention. However, the L1 penalty employed by these methods is known to have difficulty selecting a group of highly correlated features. The problem of identifying important, highly correlated features, on the other hand, is well studied in multiple regression analysis with a sufficient sample size. In particular, relative weight analysis is known to be effective in measuring the relative importance of correlated features. However, relative weight analysis requires a full-column-rank feature matrix and is therefore infeasible for high dimensional problems. In this research, a comprehensive relative importance analysis is proposed and proven valid without sample size or matrix rank restrictions. Simulations and real cases are used to show the effectiveness of the proposed method in selecting relevant features, especially for high dimensional data. (C) 2020 Elsevier B.V. All rights reserved.
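To make the full-column-rank requirement mentioned in the abstract concrete, the following is a minimal sketch of classical relative weight analysis as it is commonly described in the regression literature. It is not the comprehensive method proposed in the paper, whose details are not given in this record. The function name `relative_weights` and the tolerance `tol` are assumptions made for illustration.

```python
import numpy as np

def relative_weights(X, y, tol=1e-10):
    """Classical relative weight analysis (illustrative sketch, not the paper's method).

    Returns one non-negative weight per column of X; the weights sum to the
    R^2 of the standardized regression of y on X.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]

    # Correlations among the predictors (Rxx) and between predictors and response (rxy).
    R = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    Rxx, rxy = R[:p, :p], R[:p, p]

    # Eigendecomposition of the symmetric predictor correlation matrix.
    evals, evecs = np.linalg.eigh(Rxx)
    if evals.min() <= tol:
        # With collinear columns or fewer samples than features, Rxx is
        # singular and the classical procedure cannot be carried out.
        raise ValueError("feature matrix is not of full column rank")

    # Symmetric square root of Rxx: loadings of the predictors on their
    # closest orthogonal counterparts.
    lam = evecs @ np.diag(np.sqrt(evals)) @ evecs.T

    # Standardized coefficients of y on the orthogonal counterparts.
    beta = np.linalg.solve(lam, rxy)

    # Each weight redistributes the squared coefficients through the squared loadings.
    return (lam ** 2) @ (beta ** 2)
```

When the number of features exceeds the number of samples, the predictor correlation matrix is singular and the sketch raises an error. This is the limitation that, according to the abstract, the proposed comprehensive analysis is designed to remove.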
Pages: 13