Variable-selection ANOVA Simultaneous Component Analysis (VASCA)

Cited by: 8
Authors
Camacho, Jose [1]
Vitale, Raffaele [2]
Morales-Jimenez, David [1]
Gomez-Llorente, Carolina [3, 4, 5]
Affiliations
[1] Univ Granada, Signal Theory Networking & Commun Dept, Granada 18014, Spain
[2] Univ Lille, CNRS, LASIRE UMR 8516, Lab Avance Spect Interact Reactivite & Environm, F-59000 Lille, France
[3] Univ Granada, Biomed Res Ctr, Dept Biochem & Mol Biol 2, Sch Pharm,Inst Nutr & Food Technol Jose Mataix, Granada 18160, Spain
[4] Ibs GRANADA, Inst Invest Biosanitaria, Granada, Spain
[5] Inst Salud Carlos III, CIBEROBN Physiopathol Obes & Nutr CB12 03 30038, Madrid 28029, Spain
Keywords
FALSE DISCOVERY RATE; ASCA; NIR
DOI
10.1093/bioinformatics/btac795
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: ANOVA Simultaneous Component Analysis (ASCA) is a popular method for the analysis of multivariate data yielded by designed experiments. Meaningful associations between factors/interactions of the experimental design and measured variables in the dataset are typically identified via significance testing, with permutation tests being the standard choice. However, in settings with large numbers of variables, like omics (genomics, transcriptomics, proteomics and metabolomics) experiments, the 'holistic' testing approach of ASCA (all variables considered) often overlooks statistically significant effects encoded by only a few variables (biomarkers).
Results: We hereby propose Variable-selection ASCA (VASCA), a method that generalizes ASCA through variable selection, augmenting its statistical power without inflating the Type-I error risk. The method is evaluated with simulations and with a real dataset from a multi-omic clinical experiment. We show that VASCA is more powerful than both ASCA and the widely adopted false discovery rate controlling procedure; the latter is used as a benchmark for variable selection based on multiple significance testing. We further illustrate the usefulness of VASCA for exploratory data analysis in comparison to the popular partial least squares discriminant analysis method and its sparse counterpart.
Availability and implementation: The code for VASCA is available in the MEDA Toolbox (release v1.3). The simulation results and motivating example can be reproduced using the repository at https://github.com/josecamachop/VASCA/tree/v1.0.0 (DOI 10.5281/zenodo.7410623).
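The permutation testing that the abstract describes for ASCA can be illustrated with a minimal Python sketch. This is not the authors' implementation (that lives in the MEDA Toolbox) and it is not VASCA itself: it assumes a single-factor design and uses the sum of squares of the factor's level-mean matrix as the test statistic. The function name `asca_factor_pvalue`, the simulated data and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def asca_factor_pvalue(X, labels, n_perm=999, seed=0):
    """Permutation p-value for one factor (illustrative sketch, not MEDA code).

    The statistic is the sum of squares of the factor's effect matrix,
    i.e. the level means of the mean-centred data, summed over all columns.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)  # remove the grand mean of every variable

    def effect_ss(lab):
        ss = 0.0
        for level in np.unique(lab):
            rows = Xc[lab == level]
            # each observation of this level contributes the squared level mean
            ss += rows.shape[0] * np.sum(rows.mean(axis=0) ** 2)
        return ss

    observed = effect_ss(labels)
    # null distribution: recompute the statistic after shuffling the labels
    null = np.array([effect_ss(rng.permutation(labels)) for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


# Hypothetical simulated data: 40 samples, 500 variables, 2 groups,
# with a group effect present in the first 3 variables only.
rng = np.random.default_rng(42)
labels = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 500))
X[labels == 1, :3] += 1.0

p_all = asca_factor_pvalue(X, labels)             # 'holistic' test, all variables
p_subset = asca_factor_pvalue(X[:, :3], labels)   # test on the informative subset
print(f"all variables: p = {p_all:.3f}; informative subset: p = {p_subset:.3f}")
```

Comparing the two p-values shows how an effect carried by a few variables can be diluted when all variables enter the statistic. The subset here is taken from the known simulation design purely for demonstration; choosing variables from the data and then testing them naively would bias the test, which is the problem a variable-selection procedure with controlled Type-I error is meant to address.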
Pages: 9