Study on an Algorithm for Near Infrared Singular Sample Identification Based on Strong Influence Degree

Cited by: 3
Authors
Wu Zhao-na [1]
Ding Xiang-qian [2]
Gong Hui-li [1]
Dong Mei [3]
Wang Mei-xun [3]
Affiliations
[1] Ocean Univ China, Coll Informat Sci & Engn, Qingdao 266100, Peoples R China
[2] Ocean Univ China, Ctr Informat Engn, Qingdao 266071, Peoples R China
[3] Linyi Tobacco Co Ltd Shandong Prov, Linyi 276000, Peoples R China
Keywords
Near infrared spectral; Mahalanobis distance; Leverage; Spectral residual; Singular sample identification; SPECTROSCOPY;
DOI
10.3964/j.issn.1000-0593(2015)07-1830-05
Chinese Library Classification number
O433 [Spectroscopy]
Discipline classification code
0703; 070302
Abstract
Correct sample selection and the elimination of singular samples are very important for quantitative and qualitative modeling in near infrared spectroscopy. However, the available methods for identifying singular samples are generally based on estimates of the data center and require an empirically chosen decision threshold, which largely limits their recognition accuracy and practicability. To address the low accuracy of existing singular sample identification methods, this paper improves an existing metric, the leverage value, and presents a new algorithm for near infrared singular sample identification based on strong influence degree. The new metric reduces the dependence on the data center to a certain extent, so that normal samples cluster more tightly and the distance between singular and normal samples is widened. At the same time, to avoid setting the threshold unreasonably by experience, the paper introduces the statistical concept of jump degree and proposes an automatic threshold-setting method for distinguishing singular samples. To verify the validity of the algorithm, abnormal samples were eliminated from a calibration set of 200 representative samples using Mahalanobis distance, the leverage-spectral residual method, and the proposed algorithm, respectively. Quantitative models were then built from the remaining calibration samples by partial least squares (PLS), with nicotine as the index, and the modeling results were compared. In addition, 60 representative test samples were predicted with these models. Finally, the algorithms were compared using root mean square error of cross validation (RMSECV), correlation coefficient (r), and root mean square error of prediction (RMSEP) as evaluation indices.
The experimental results demonstrate that the proposed algorithm based on strong influence degree significantly improves the accuracy of singular sample identification over existing methods. With a lower RMSECV (0.104), a lower RMSEP (0.112), and a higher r (0.983), it also helps improve the stability and prediction ability of the model.
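The abstract does not give the formula for the strong influence degree or for the jump degree, so the following is only a minimal sketch of the two ingredients it builds on: the classical leverage value computed from principal component scores, and an automatic cut placed at the largest gap between sorted metric values. The gap rule is a hypothetical stand-in for the jump-degree threshold, not the authors' method, and the function names, the synthetic data, and the choice of one component are all assumptions made for illustration.

```python
import numpy as np

def leverage(X, n_components=1):
    """Classical leverage of each sample in the space of the leading
    principal components (NOT the paper's strong influence degree)."""
    Xc = X - X.mean(axis=0)                      # center the spectra
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # With scores T = U * s, t_i^T (T^T T)^-1 t_i reduces to the sum of U_ik^2.
    return 1.0 / X.shape[0] + np.sum(U[:, :n_components] ** 2, axis=1)

def largest_gap_threshold(values):
    """Hypothetical automatic threshold: midpoint of the largest gap
    between consecutive sorted values (a stand-in for jump degree)."""
    v = np.sort(np.asarray(values, dtype=float))
    k = int(np.argmax(np.diff(v)))               # position of the biggest jump
    return 0.5 * (v[k] + v[k + 1])

# Synthetic example: 50 "spectra" with 20 variables, 3 of them shifted.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))
X[:3] += 8.0                                     # three artificial singular samples

h = leverage(X, n_components=1)
threshold = largest_gap_threshold(h)
flagged = np.flatnonzero(h > threshold)          # indices of samples above the cut
```

In a workflow like the one the abstract describes, the flagged samples would be removed from the calibration set before fitting the PLS model, and the resulting RMSECV, RMSEP, and r would be compared across identification methods.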
Pages: 1830-1834
Page count: 5