VARIABLE SELECTION IN FUNCTIONAL DATA CLASSIFICATION: A MAXIMA-HUNTING PROPOSAL

Cited by: 24
Authors
Berrendero, Jose R. [1]
Cuevas, Antonio [1]
Torrecilla, Jose L. [1]
Affiliations
[1] Univ Autonoma Madrid, Dept Matemat, E-28049 Madrid, Spain
Keywords
Distance correlation; functional data analysis; supervised classification; variable selection; REGRESSION; COMPONENTS; MODELS;
DOI
10.5705/ss.202014.0014
Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Variable selection is considered in the setting of supervised binary classification with functional data $\{X(t),\ t \in [0, 1]\}$. By "variable selection" we mean any dimension-reduction method that leads to the replacement of the whole trajectory $\{X(t),\ t \in [0, 1]\}$ with a low-dimensional vector $(X(t_1), \ldots, X(t_d))$ still keeping a similar classification error. Our proposal for variable selection is based on the idea of selecting the local maxima $(t_1, \ldots, t_d)$ of the function $\mathcal{V}_X^2(t) = \mathcal{V}^2(X(t), Y)$, where $\mathcal{V}$ denotes the "distance covariance" association measure for random variables due to Szekely, Rizzo, and Bakirov (2007). This method provides a simple natural way to deal with the relevance vs. redundancy trade-off which typically appears in variable selection. A result of consistent estimation for the maxima of $\mathcal{V}_X^2$ is shown. We also show different models for the underlying process $X(t)$ under which the relevant information is concentrated on the maxima of $\mathcal{V}_X^2$. An extensive empirical study is presented, including about 400 simulated models and data examples aimed at comparing our variable selection method with other standard proposals for dimension reduction.
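The selection idea described in the abstract can be illustrated with a minimal sketch, assuming the curves are observed on a common discrete grid and stored as the rows of a matrix. The code computes the standard sample version of the squared distance covariance between each X(t_j) and the binary label Y, then keeps the grid points where that function has a strict local maximum. The function names (`distance_covariance_sq`, `maxima_hunting`) and the `max_points` parameter are hypothetical, chosen only for this illustration; the sketch does no smoothing and has no neighbourhood window for noisy estimates, so it is not a reproduction of the authors' implementation.

```python
import numpy as np


def _double_centered_distances(v):
    """Pairwise |v_k - v_l| with row, column and grand means removed."""
    d = np.abs(v[:, None] - v[None, :])
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())


def distance_covariance_sq(x, y):
    """Sample squared distance covariance V_n^2(x, y) for 1-D samples."""
    a = _double_centered_distances(np.asarray(x, dtype=float))
    b = _double_centered_distances(np.asarray(y, dtype=float))
    return float((a * b).mean())


def maxima_hunting(X, y, max_points=None):
    """Return grid indices of strict local maxima of t -> V_n^2(X(t), Y).

    X: array of shape (n_curves, n_grid_points); y: array of n_curves labels.
    Indices are sorted by decreasing V_n^2. `max_points` (an assumption of
    this sketch) optionally truncates the list.
    """
    X = np.asarray(X, dtype=float)
    v2 = np.array([distance_covariance_sq(X[:, j], y)
                   for j in range(X.shape[1])])
    # Pad with -inf so that endpoints can also qualify as local maxima.
    padded = np.concatenate(([-np.inf], v2, [-np.inf]))
    is_max = (padded[1:-1] > padded[:-2]) & (padded[1:-1] > padded[2:])
    idx = np.flatnonzero(is_max)
    idx = idx[np.argsort(v2[idx])[::-1]]
    if max_points is not None:
        idx = idx[:max_points]
    return idx, v2
```

The selected columns `X[:, idx]` would then be passed to any standard multivariate classifier. With noisy data the raw estimate typically has many small spurious maxima, so in practice some form of smoothing or a minimum separation between selected points would be needed.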
Pages: 619 - 638
Page count: 20
Related papers
50 in total
  • [1] Feature selection in functional data classification with recursive maxima hunting
    Torrecilla, Jose L.
    Suarez, Alberto
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [2] Variable selection in classification for multivariate functional data
    Blanquero, Rafael
    Carrizosa, Emilio
    Jimenez-Cordero, Asuncion
    Martin-Barragan, Belen
    [J]. INFORMATION SCIENCES, 2019, 481 : 445 - 462
  • [3] Functional Data Classification of Variable Stars
    Park, Minjeong
    Kim, Donghoh
    Cho, Sinsup
    Oh, Hee-Seok
    [J]. COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2013, 20 (04) : 271 - 281
  • [4] Joint Variable Selection and Classification with Immunohistochemical Data
    Ghosh, Debashis
    Chakrabarti, Ratna
    [J]. BIOMARKER INSIGHTS, 2009, 4 : 103 - 110
  • [5] A variable selection strategy for supervised classification with continuous spectroscopic data
    Indahl, U
    Næs, T
    [J]. JOURNAL OF CHEMOMETRICS, 2004, 18 (02) : 53 - 61
  • [6] Training data selection for event classification in a highly variable environment
    Iyer, Anand
    Flynn, Garrison
    Parikh, Nidhi
    Archer, Daniel
    Karnowski, Thomas
    Maceira, Monica
    Marcillo, Omar
    Nicholson, Andrew
    Ray, Will
    Wetherington, Randall
    Willis, Michael
    [J]. ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS IV, 2022, 12113
  • [7] An ensemble approach to variable selection for classification of DNA microarray data
    Masulli, F
    Rovetta, S
    [J]. PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 3089 - 3094
  • [8] Papers on normalization, variable selection, classification or clustering of microarray data
    Rocke, David M.
    Ideker, Trey
    Troyanskaya, Olga
    Quackenbush, John
    Dopazo, Joaquin
    [J]. BIOINFORMATICS, 2009, 25 (06) : 701 - 702
  • [9] The mRMR variable selection method: a comparative study for functional data
    Berrendero, J. R.
    Cuevas, A.
    Torrecilla, J. L.
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2016, 86 (05) : 891 - 907
  • [10] Variable selection in regression models including functional data predictors
    Liu, Kesheng
    Wang, Siyang
    [J]. Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2019, 45 (10) : 1990 - 1994