An Variable Selection Method of the Significance Multivariate Correlation Competitive Population Analysis for Near-Infrared Spectroscopy in Chemical Modeling

Cited by: 8
Authors
Wang, Yuxi [1 ]
Jia, Zhenhong [1 ]
Yang, Jie [2 ]
Affiliations
[1] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830046, Peoples R China
[2] Shanghai Jiao Tong Univ, Inst Image Proc & Pattern Recognit, Shanghai 200240, Peoples R China
Funding
US National Science Foundation;
Keywords
Spectrochemical analysis; variable selection; the significant multivariate correlation; weighted bootstrap sampling; model population analysis; Monte Carlo sampling; analytical techniques; partial least squares method; PARTIAL LEAST-SQUARES; REGRESSION; SHRINKAGE; CALIBRATION; PROJECTION; STRATEGY; SPACE; OPTIMIZATION; PERSPECTIVE; WAVELENGTHS;
DOI
10.1109/ACCESS.2019.2954115
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The high dimensionality of spectral datasets makes it difficult to select the optimal subset of variables. This paper presents a new variable selection method called significant multivariate competitive population analysis (SMCPA), which combines ideas from significant multivariate correlation (SMC) and model population analysis, and employs weighted bootstrap sampling (WBS) and exponential decline function (EDF) competition methods. In this study, the values of the SMC distributions are used as an index of the importance of each wavelength. Based on these importance levels, SMCPA sequentially selects N subsets of spectral wavelengths through N Monte Carlo sampling runs in an iterative and competitive procedure. In each sampling run, a fixed ratio of samples is used to build a partial least-squares (PLS) calibration model, and SMC is then performed to obtain the score and threshold values. Next, based on the SMC scores, the key variables are selected in two steps: compulsory selection by the exponential decline function, followed by competitive selection by adaptive weighted sampling. Finally, cross-validation (CV) is applied to select the subset with the lowest root mean square error. The method is tested on three NIR spectral datasets and compared against three high-performance variable selection methods. In these experiments, the proposed algorithm is the most efficient and selects variables most effectively among the compared methods, and it can usually locate the optimal combination of key wavelength variables in a dataset. PLS models built on the selected variables also give the best evaluation results.
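The two-step selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the EDF constants follow the formulation commonly used in CARS-style methods (an assumption, since the abstract does not give them), the names `edf_ratio`, `select_step`, `run_selection`, and `scores_fn` are hypothetical, and the importance scores are supplied by a placeholder function rather than by SMC on a fitted PLS model. The final step of choosing the subset with the lowest cross-validated error is omitted.

```python
import math
import random


def edf_ratio(i, n_runs, n_vars):
    """Fraction of the original variables kept at run i (1-based).

    Assumed CARS-style exponential decline: the ratio is 1 at the first
    run and 2 / n_vars at the last run. Requires n_runs > 1.
    """
    a = (n_vars / 2) ** (1 / (n_runs - 1))
    k = math.log(n_vars / 2) / (n_runs - 1)
    return a * math.exp(-k * i)


def select_step(candidates, scores, ratio, n_vars, rng):
    """One run: compulsory EDF cut, then weighted bootstrap sampling."""
    keep_n = max(1, round(ratio * n_vars))
    # Compulsory selection: keep only the top-scoring variables.
    ranked = sorted(candidates, key=lambda j: scores[j], reverse=True)[:keep_n]
    # Competitive selection: draw with replacement, weighted by score
    # (scores are assumed positive), and keep the distinct variables drawn.
    weights = [scores[j] for j in ranked]
    drawn = rng.choices(ranked, weights=weights, k=len(ranked))
    return sorted(set(drawn))


def run_selection(scores_fn, n_vars, n_runs, seed=0):
    """Return the variable subset retained after each of n_runs runs."""
    rng = random.Random(seed)
    candidates = list(range(n_vars))
    subsets = []
    for i in range(1, n_runs + 1):
        # In the paper the scores would come from SMC on a PLS model fitted
        # to a random sample subset; here scores_fn is a stand-in.
        scores = scores_fn(candidates)
        ratio = edf_ratio(i, n_runs, n_vars)
        candidates = select_step(candidates, scores, ratio, n_vars, rng)
        subsets.append(candidates)
    return subsets


def toy_scores(candidates):
    """Hypothetical positive importance scores, for demonstration only."""
    return {j: 1.0 + (j % 7) for j in candidates}


subsets = run_selection(toy_scores, n_vars=100, n_runs=50)
```

Each entry of `subsets` is a candidate wavelength subset; in a full pipeline, a PLS model would be cross-validated on every subset and the one with the lowest error retained.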
Pages: 167195-167209
Page count: 15
Related papers
50 in total
  • [1] A Variable Selection Method of the Selectivity Ratio Competitive Model Population Analysis for Near Infrared Spectroscopy
    Wang Yu-xi
    Jia Zhen-hong
    Yang Jie
    Kasabov, Nikola K.
    [J]. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2020, 40 (04) : 1056 - 1062
  • [2] An overview of variable selection methods in multivariate analysis of near-infrared spectra
    Yun, Yong-Huan
    Li, Hong-Dong
    Deng, Bai-Chuan
    Cao, Dong-Sheng
    [J]. TRAC-TRENDS IN ANALYTICAL CHEMISTRY, 2019, 113 : 102 - 115
  • [3] A variable selection method based on uninformative variable elimination for multivariate calibration of near-infrared spectra
    Cai, Wensheng
    Li, Yankun
    Shao, Xueguang
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2008, 90 (02) : 188 - 194
  • [4] A Variable Selection Method of Near Infrared Spectroscopy Based on Automatic Weighting Variable Combination Population Analysis
    Zhao Huan
    Huan Ke-Wei
    Shi Xiao-Guang
    Zheng Feng
    Liu Li-Ying
    Liu Wei
    Zhao Chun-Ying
    [J]. CHINESE JOURNAL OF ANALYTICAL CHEMISTRY, 2018, 46 (01) : 136 - 142
  • [5] Pathlength selection method for quantitative analysis with near-infrared spectroscopy
    Xu, KX
    Lu, YH
    Li, QB
    Wang, Y
    [J]. ALT'03 INTERNATIONAL CONFERENCE ON ADVANCED LASER TECHNOLOGIES: BIOMEDICAL OPTICS, 2003, 5486 : 100 - 106
  • [6] Selection of spectral width for prediction modeling in near-infrared spectroscopy analysis
    Yang Hao-Min
    Lu Qi-Peng
    Huang Fu-Rong
    [J]. JOURNAL OF INFRARED AND MILLIMETER WAVES, 2011, 30 (06) : 522 - 525
  • [7] Adaptive Variable Re-weighting and Shrinking Approach for Variable Selection in Multivariate Calibration for Near-infrared Spectroscopy
    Sun Jing-Jing
    Yang Wu-De
    Feng Mei-Chen
    Xiao Lu-Jie
    Sun Hui
    Kubar, Muhammad-Saleem
    [J]. CHINESE JOURNAL OF ANALYTICAL CHEMISTRY, 2021, 49 (05) : E21079 - E21086
  • [8] Pattern-Coupled Baseline Correction Method for Near-Infrared Spectroscopy Multivariate Modeling
    Li, Yuqiang
    Wang, Xinjie
    Yu, Huijing
    Du, Wenli
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [9] A Flow System for Generation of Concentration Perturbation in Two-Dimensional Correlation Near-Infrared Spectroscopy: Application to Variable Selection in Multivariate Calibration
    Pereira, Claudete Fernandes
    Pasquini, Celio
    [J]. APPLIED SPECTROSCOPY, 2010, 64 (05) : 507 - 513
  • [10] A heuristic and parallel simulated annealing algorithm for variable selection in near-infrared spectroscopy analysis
    Shi, Jiyong
    Hu, Xuetao
    Zou, Xiaobo
    Zhao, Jiewen
    Zhang, Wen
    Huang, Xiaowei
    Zhu, Yaodi
    Li, Zhihua
    Xu, Yiwei
    [J]. JOURNAL OF CHEMOMETRICS, 2016, 30 (08) : 442 - 450