Variable selection in classification for multivariate functional data

Cited by: 19
Authors
Blanquero, Rafael [1, 2]
Carrizosa, Emilio [1, 2]
Jimenez-Cordero, Asuncion [1, 2]
Martin-Barragan, Belen [3]
Affiliations
[1] Univ Sevilla IMUS, Fac Matemat, Dept Estadist & Invest Operat, C Tarfia S-N, Seville 41012, Spain
[2] Univ Sevilla IMUS, Inst Matemat, C Tarfia S-N, Seville 41012, Spain
[3] Univ Edinburgh, Business Sch, 29 Buccleuch Pl, Edinburgh EH8 9JS, Midlothian, Scotland
Keywords
Feature selection; Multivariate functional data analysis; Support Vector Machines; Kernel; Regression; Algorithm
DOI
10.1016/j.ins.2018.12.060
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
When classification methods are applied to high-dimensional data, selecting a subset of the predictors may lead to an improvement in the predictive ability of the estimated model, in addition to reducing the model complexity. In Functional Data Analysis (FDA), i.e., when data are functions, selecting a subset of predictors corresponds to selecting a subset of individual time instants in the time interval in which the functional data are measured. In this paper, we address the problem of selecting the most informative time instants in multivariate functional data, a case much less studied than its univariate counterpart. Our proposal makes it simple to exploit higher-order information about the data, e.g., monotonicity or convexity, by means of the derivatives of the functional data. The aforementioned problem is addressed with tools of Global Optimization in continuous variables: the time instants are selected to maximize the correlation between the class label and the Support Vector Machine score used for classification. The effectiveness of the proposal is shown in univariate and multivariate datasets. © 2018 Elsevier Inc. All rights reserved.
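The selection criterion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes univariate curves sampled on an equispaced grid over [0, 1], a small linear SVM trained by subgradient descent, in-sample correlation, and plain random search as a stand-in for the global optimization tools the paper refers to. All function names (`interp`, `pearson`, `train_linear_svm`, `correlation`, `select_time_instants`, `make_curves`) and the synthetic data are hypothetical.

```python
import math
import random

def interp(values, t):
    """Linearly interpolate a curve sampled on an equispaced grid at t in [0, 1]."""
    pos = t * (len(values) - 1)
    lo = min(int(pos), len(values) - 2)
    frac = pos - lo
    return (1 - frac) * values[lo] + frac * values[lo + 1]

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss (assumed solver)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for epoch in range(1, epochs + 1):
        lr = 0.5 / math.sqrt(epoch)
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the subgradient
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def correlation(times, curves, labels):
    """Correlation between class labels and SVM scores built from the chosen instants."""
    X = [[interp(c, t) for t in times] for c in curves]
    w, b = train_linear_svm(X, labels)
    scores = [sum(wj * xj for wj, xj in zip(w, xi)) + b for xi in X]
    return pearson(labels, scores)

def select_time_instants(curves, labels, k=2, trials=30, seed=0):
    """Random search over [0, 1]^k; a simple stand-in for a global optimiser."""
    rng = random.Random(seed)
    best_t, best_c = None, -2.0
    for _ in range(trials):
        times = sorted(rng.random() for _ in range(k))
        c = correlation(times, curves, labels)
        if c > best_c:
            best_t, best_c = times, c
    return best_t, best_c

def make_curves(n=40, m=51, seed=1):
    """Synthetic curves (assumption): the two classes differ by a bump near t = 0.7."""
    rng = random.Random(seed)
    curves, labels = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        curve = []
        for j in range(m):
            t = j / (m - 1)
            bump = label * math.exp(-((t - 0.7) ** 2) / 0.005)
            curve.append(math.sin(2 * math.pi * t) + bump + rng.gauss(0, 0.3))
        curves.append(curve)
        labels.append(label)
    return curves, labels

curves, labels = make_curves()
best_t, best_c = select_time_instants(curves, labels)
print("selected instants:", best_t, "correlation:", best_c)
```

Because the time instants are treated as continuous variables, the search is not restricted to the sampling grid; interpolation supplies curve values in between. Extending the sketch to the multivariate case, or to derivative information, would presumably amount to adding further interpolated features per instant, though the paper itself should be consulted for the actual formulation.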
Pages: 445-462
Number of pages: 18