Feature selection for high-dimensional temporal data

Cited by: 27
Authors
Tsagris, Michail [1]
Lagani, Vincenzo [1]
Tsamardinos, Ioannis [1]
Affiliations
[1] Univ Crete, Dept Comp Sci, Voutes Campus, Iraklion 700135, Greece
Source
BMC BIOINFORMATICS | 2018 / Volume 19
Funding
European Research Council;
Keywords
Time course data; Longitudinal data; Regression; Variable selection; Multiple solutions; GENERALIZED ESTIMATING EQUATIONS; MARKOV BLANKET INDUCTION; VARIABLE SELECTION; LONGITUDINAL DATA; MODEL SELECTION; CAUSAL DISCOVERY; FUNCTIONAL DATA; LOCAL CAUSAL; MIXED-MODELS; CLASSIFICATION;
DOI
10.1186/s12859-018-2023-7
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Feature selection is commonly employed for identifying collectively predictive biomarkers and biosignatures; it facilitates the construction of small statistical models that are easier to verify, visualize, and comprehend while providing insight to the human expert. In this work we extend established constraint-based feature-selection methods to high-dimensional "omics" temporal data, where the number of measurements is orders of magnitude larger than the sample size. The extension required the development of conditional independence tests for temporal and/or static variables conditioned on a set of temporal variables. Results: The algorithm is able to return multiple, equivalent solution subsets of variables, scale to tens of thousands of features, and outperform or be on par with existing methods depending on the analysis task specifics. Conclusions: The use of this algorithm is suggested for variable selection with high-dimensional temporal data.
Pages: 14
Related papers
50 in total
  • [1] Feature selection for high-dimensional temporal data
    Michail Tsagris
    Vincenzo Lagani
    Ioannis Tsamardinos
    [J]. BMC Bioinformatics, 19
  • [2] Feature selection for high-dimensional data
    Destrero A.
    Mosci S.
    De Mol C.
    Verri A.
    Odone F.
    [J]. Computational Management Science, 2009, 6 (1) : 25 - 40
  • [3] Feature selection for high-dimensional data
    Bolón-Canedo V.
    Sánchez-Maroño N.
    Alonso-Betanzos A.
    [J]. Progress in Artificial Intelligence, 2016, 5 (2) : 65 - 75
  • [4] FEATURE SELECTION FOR HIGH-DIMENSIONAL DATA ANALYSIS
    Verleysen, Michel
    [J]. NCTA 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON NEURAL COMPUTATION THEORY AND APPLICATIONS, 2011, : IS23 - IS25
  • [5] Feature selection for high-dimensional imbalanced data
    Yin, Liuzhi
    Ge, Yong
    Xiao, Keli
    Wang, Xuehua
    Quan, Xiaojun
    [J]. NEUROCOMPUTING, 2013, 105 : 3 - 11
  • [6] Feature selection for high-dimensional data in astronomy
    Zheng, Hongwen
    Zhang, Yanxia
    [J]. ADVANCES IN SPACE RESEARCH, 2008, 41 (12) : 1960 - 1964
  • [7] A filter feature selection for high-dimensional data
    Janane, Fatima Zahra
    Ouaderhman, Tayeb
    Chamlal, Hasna
    [J]. JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2023, 17
  • [8] Feature Selection with High-Dimensional Imbalanced Data
    Van Hulse, Jason
    Khoshgoftaar, Taghi M.
    Napolitano, Amri
    Wald, Randall
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2009), 2009, : 507 - 514
  • [9] FEATURE SELECTION FOR HIGH-DIMENSIONAL DATA ANALYSIS
    Verleysen, Michel
    [J]. ECTA 2011/FCTA 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON EVOLUTIONARY COMPUTATION THEORY AND APPLICATIONS AND INTERNATIONAL CONFERENCE ON FUZZY COMPUTATION THEORY AND APPLICATIONS, 2011,
  • [10] Efficient feature selection filters for high-dimensional data
    Ferreira, Artur J.
    Figueiredo, Mario A. T.
    [J]. PATTERN RECOGNITION LETTERS, 2012, 33 (13) : 1794 - 1804