Feature selection using distributions of orthogonal PLS regression vectors in spectral data

Cited by: 10
Authors
Lee, Geonseok [1]
Lee, Kichun [1]
Affiliations
[1] Hanyang Univ, Ind Engn, Seoul, South Korea
Keywords
Feature selection; PLS; Orthogonal signal correction; Regression vector; Permutation test
DOI
10.1186/s13040-021-00240-3
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Feature selection, which is important for the successful analysis of chemometric data, aims to produce parsimonious and predictive models. Partial least squares (PLS) regression is one of the main methods in chemometrics for analyzing multivariate data with input X and response Y by modeling the covariance structure in the X and Y spaces. Recently, orthogonal projections to latent structures (OPLS) has been widely used in processing multivariate data because it improves the interpretability of PLS models by removing systematic variation in the X space that is not correlated with Y. This paper presents a feature selection method for multivariate data based on orthogonal PLS regression (OPLSR), which combines orthogonal signal correction with PLS. The method generates empirical distributions of the features' effects on Y in OPLSR vectors via permutation tests and examines the significance of each input feature's effect on Y. We evaluate the proposed method in a simulation study with a three-layer network structure, comparing it with the false discovery rate method. To demonstrate the method, we apply it to real-life NIR spectral data and mass spectrometry data.
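The abstract's general idea (filter Y-orthogonal variation out of X, fit a PLS regression vector, and compare each coefficient with a permutation-based empirical distribution) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single response, one predictive component, a null distribution built by permuting the response, and coefficients expressed in the filtered X space. The function names `opls_regression_vector` and `permutation_pvalues` and the synthetic data are hypothetical.

```python
import numpy as np


def opls_regression_vector(X, y, n_ortho=1):
    """Single-response OPLS sketch: strip n_ortho Y-orthogonal components
    from X, then fit one predictive PLS component.
    Returns coefficients in the filtered X space (an assumption of this sketch)."""
    X = X - X.mean(axis=0)          # column-center the inputs
    y = y - y.mean()                # center the response
    for _ in range(n_ortho):
        w = X.T @ y
        w /= np.linalg.norm(w)      # predictive weight vector
        t = X @ w                   # predictive scores
        p = X.T @ t / (t @ t)       # predictive loadings
        w_o = p - (w @ p) * w       # part of the loadings orthogonal to w
        norm = np.linalg.norm(w_o)
        if norm < 1e-12:            # nothing orthogonal left to remove
            break
        w_o /= norm
        t_o = X @ w_o               # orthogonal scores
        p_o = X.T @ t_o / (t_o @ t_o)
        X = X - np.outer(t_o, p_o)  # deflate the Y-orthogonal variation
    w = X.T @ y
    w /= np.linalg.norm(w)
    t = X @ w
    c = (t @ y) / (t @ t)           # regression of y on the predictive scores
    return w * c


def permutation_pvalues(X, y, n_perm=200, n_ortho=1, seed=0):
    """Empirical p-value per feature from response permutations (assumed null)."""
    rng = np.random.default_rng(seed)
    b_obs = opls_regression_vector(X, y, n_ortho)
    null = np.empty((n_perm, X.shape[1]))
    for i in range(n_perm):
        null[i] = opls_regression_vector(X, rng.permutation(y), n_ortho)
    exceed = (np.abs(null) >= np.abs(b_obs)).sum(axis=0)
    return b_obs, (exceed + 1) / (n_perm + 1)


# Synthetic illustration: only the first two features drive y.
rng = np.random.default_rng(42)
X = rng.normal(size=(80, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=80)
b, pvals = permutation_pvalues(X, y)
print(np.round(pvals, 3))
```

Features whose empirical p-value falls below a chosen threshold would be retained. Any multiple-testing correction, and the exact permutation scheme used in the paper, are outside what this sketch covers.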
Pages: 16
Related papers
50 in total
  • [21] Feature selection under regularized orthogonal least square regression with optimal scaling
    Zhang, Rui
    Nie, Feiping
    Li, Xuelong
    [J]. NEUROCOMPUTING, 2018, 273 : 547 - 553
  • [22] Robust Jointly Sparse Regression with Generalized Orthogonal Learning for Image Feature Selection
    Mo, Dongmei
    Lai, Zhihui
    [J]. PATTERN RECOGNITION, 2019, 93 : 164 - 178
  • [23] A General Framework for Feature Selection Under Orthogonal Regression With Global Redundancy Minimization
    Xu, Xueyuan
    Wu, Xia
    Wei, Fulin
    Zhong, Wei
    Nie, Feiping
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (11) : 5056 - 5069
  • [24] Key Data Set Selection Algorithm Based on PLS Regression in Industrial Process
    Yang, Mingyang
    Yang, Xuebo
    Yang, Chengming
    Zhou, Hongpeng
    [J]. PROCEEDINGS OF THE IECON 2016 - 42ND ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2016, : 7179 - 7184
  • [25] A Rough Set Based Feature Selection Approach using Random Feature Vectors
    Raza, Muhammad Summair
    Qamar, Usman
    [J]. PROCEEDINGS OF 14TH INTERNATIONAL CONFERENCE ON FRONTIERS OF INFORMATION TECHNOLOGY PROCEEDINGS - FIT 2016, 2016, : 229 - 234
  • [26] Feature selection for high-dimensional multi-category data using PLS-based local recursive feature elimination
    You, Wenjie
    Yang, Zijiang
    Ji, Guoli
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (04) : 1463 - 1475
  • [27] KERNEL-BASED PLS REGRESSION CROSS-VALIDATION AND APPLICATIONS TO SPECTRAL DATA
    LINDGREN, F
    GELADI, P
    WOLD, S
    [J]. JOURNAL OF CHEMOMETRICS, 1994, 8 (06) : 377 - 389
  • [28] Enhanced prediction of misalignment conditions from spectral data using feature selection and filtering
    Cho, Hyun-Woo
    Jeong, Myong K.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2008, 35 (1-2) : 451 - 458
  • [29] Application of orthogonal L-shaped PLS to chemogenomics data and its chemical interpretation from predictive and orthogonal regression coefficients
    Hasegawa, Kiyoshi
    Funatsu, Kimito
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2014, 139 : 64 - 69
  • [30] Genomic Selection in Chinese Holsteins Using Regularized Regression Models for Feature Selection of Whole Genome Sequencing Data
    Li, Shanshan
    Yu, Jian
    Kang, Huimin
    Liu, Jianfeng
    [J]. ANIMALS, 2022, 12 (18):