Pre-processing ensembles with response oriented sequential alternation calibration (PROSAC): A step towards ending the pre-processing search and optimization quest for near-infrared spectral modelling

Cited by: 11
Authors
Mishra, Puneet [1]
Roger, Jean Michel [2, 3]
Marini, Federico [4]
Biancolillo, Alessandra [5]
Rutledge, Douglas N. [6, 7]
Affiliations
[1] Wageningen Food & Biobased Res, Bornse Weilanden 9,POB 17, NL-6700 AA Wageningen, Netherlands
[2] Univ Montpellier, Inst Agro, INRAE, ITAP, Montpellier, France
[3] ChemHouse Res Grp, Montpellier, France
[4] Univ Roma La Sapienza, Dept Chem, Piazzale Aldo Moro 5, I-00185 Rome, Italy
[5] Univ Aquila, Dept Phys & Chem Sci, Via Vetoio, I-67100 L'Aquila, Italy
[6] Univ Paris Saclay, AgroParisTech, INRAE, UMR SayFood, F-75005 Paris, France
[7] Charles Sturt Univ, Natl Wine & Grape Ind Ctr, Wagga Wagga, NSW, Australia
Keywords
Multi-block modelling; Pre-processing; Spectroscopy; Data fusion; MULTIVARIATE CALIBRATION; SPECTROSCOPY; CHEMOMETRICS; COMMON; TOOL; MSC;
DOI
10.1016/j.chemolab.2022.104497
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Ensemble pre-processing is emerging as a potential tool to avoid the tiring pre-processing selection and optimization task in near-infrared (NIR) spectral modelling. Furthermore, differently pre-processed data may carry complementary information; hence, ensemble pre-processing may be the best-suited modelling option for extracting all the useful information from differently pre-processed data. Recently, multi-block techniques such as sequential (SPORT) and parallel (PORTO) orthogonalized partial least squares regression were proposed to extract the complementary information present in differently pre-processed data. Although such multi-block techniques allow efficient modelling of differently pre-processed data blocks, challenges related to choosing the block order, parameter tuning, block scaling and optimization time must still be dealt with, depending on the approach. To cope with these issues, the present study proposes using a recently developed, faster, block-order-independent and scale-independent multi-block data modelling technique called response-oriented sequential alternation (ROSA) to process the multi-block data generated by differently pre-processing the same NIR data. This new method is called PROSAC, i.e., pre-processing ensembles with ROSA calibration. The potential of the approach is demonstrated on five real NIR spectral datasets. As baselines for comparison, partial least squares regression was performed on individually pre-processed data sets, and two multi-block pre-processing fusion approaches, i.e., SPORT and PORTO, were also applied. Ensemble pre-processing with ROSA achieved either better performance than the baseline methods or comparable performance, without the need to worry about the pre-processing order, the scaling of data after pre-processing, or optimization time requirements. PROSAC can be considered a general tool for ensemble pre-processing in NIR data modelling.
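
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of pre-processings (SNV, Savitzky-Golay first and second derivatives), the window settings, and the names `snv`, `make_blocks`, `rosa_like_fit` and `toy_spectra` are all assumptions made for illustration. The selection loop is a simplified ROSA-style procedure in which every block proposes a candidate component from the current response residual and the block leaving the smallest residual wins. Because each candidate score is normalised, the choice should not depend on block scale or block order.

```python
import numpy as np
from scipy.signal import savgol_filter


def snv(X):
    """Standard normal variate: centre and scale each spectrum (row)."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)


def make_blocks(X):
    """Build an ensemble of differently pre-processed copies of the same spectra.

    The four pre-processings below are illustrative assumptions only.
    """
    return {
        "raw": X,
        "snv": snv(X),
        "sg_d1": savgol_filter(X, window_length=11, polyorder=2, deriv=1, axis=1),
        "sg_d2": savgol_filter(X, window_length=11, polyorder=2, deriv=2, axis=1),
    }


def rosa_like_fit(blocks, y, n_components):
    """Simplified ROSA-style fit: each component is taken from the winning block."""
    names = list(blocks)
    Xc = [blocks[k] - blocks[k].mean(axis=0) for k in names]  # column-centre blocks
    r = y - y.mean()                   # current response residual
    T = np.zeros((len(y), 0))          # orthonormal scores selected so far
    order = []                         # name of the winning block per component
    for _ in range(n_components):
        best = None
        for name, Xb in zip(names, Xc):
            w = Xb.T @ r               # PLS-type loading weights for this block
            t = Xb @ w                 # candidate score
            t = t - T @ (T.T @ t)      # orthogonalise against earlier scores
            norm = np.linalg.norm(t)
            if norm < 1e-12:           # block has nothing new to offer
                continue
            t = t / norm               # normalising removes the block scale
            res = r - t * (t @ r)      # residual if this candidate were chosen
            sse = float(res @ res)
            if best is None or sse < best[0]:
                best = (sse, name, t, res)
        if best is None:
            break
        _, name, t, r = best
        T = np.column_stack([T, t])
        order.append(name)
    return order, T, r


def toy_spectra(n=40, p=60, seed=0):
    """Hypothetical toy data: one Gaussian band plus additive offset and noise."""
    rng = np.random.default_rng(seed)
    wl = np.linspace(0.0, 1.0, p)
    conc = rng.uniform(0.1, 1.0, size=n)
    peak = np.exp(-((wl - 0.5) ** 2) / 0.01)
    X = np.outer(conc, peak) + rng.normal(scale=0.1, size=(n, 1))
    X = X + 0.01 * rng.normal(size=(n, p))
    return X, conc
```

A call such as `order, T, r = rosa_like_fit(make_blocks(X), y, 3)` returns the winning block for each component, the orthonormal scores, and the final response residual. A real calibration would also need regression coefficients for prediction and cross-validation to choose the number of components, both of which this sketch omits.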
Pages: 10
Related papers
17 in total
  • [1] Near-Infrared Data Pre-processing for Glucose Level Prediction in Blood
    Abd Rahim, Intan Maisarah
    Rahim, Herlina Abdul
    Ghazali, Rashidah
    [J]. 2020 IEEE 10TH INTERNATIONAL CONFERENCE ON SYSTEM ENGINEERING AND TECHNOLOGY (ICSET), 2020, : 73 - 78
  • [2] Study on Near-Infrared Spectroscopy Signal Pre-processing and Model Building
    Zhang Xiaochao
    Zhang Yinqiao
    Wang Hui
    [J]. INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER CONTROL : ICACC 2009 - PROCEEDINGS, 2009, : 618 - 622
  • [3] Review of the most common pre-processing techniques for near-infrared spectra
    Rinnan, Asmund
    van den Berg, Frans
    Engelsen, Soren Balling
    [J]. TRAC-TRENDS IN ANALYTICAL CHEMISTRY, 2009, 28 (10) : 1201 - 1222
  • [4] The influence of data pre-processing in the pattern recognition of excipients near-infrared spectra
    Candolfi, A
    De Maesschalck, R
    Jouan-Rimbaud, D
    Hailey, PA
    Massart, DL
    [J]. JOURNAL OF PHARMACEUTICAL AND BIOMEDICAL ANALYSIS, 1999, 21 (01) : 115 - 132
  • [5] Parallel pre-processing through orthogonalization (PORTO) and its application to near-infrared spectroscopy
    Mishra, Puneet
    Roger, Jean Michel
    Marini, Federico
    Biancolillo, Alessandra
    Rutledge, Douglas N.
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2021, 212
  • [6] Influence of different pre-processing methods in predicting sugarcane quality from near-infrared (NIR) spectral data
    Lazim, S. S. R. M.
    Nawi, N. M.
    Chen, G.
    Jensen, T.
    Rasli, A. M. M.
    [J]. INTERNATIONAL FOOD RESEARCH JOURNAL, 2016, 23 : S231 - S236
  • [7] A local pre-processing method for near-infrared spectra, combined with spectral segmentation and standard normal variate transformation
    Bi, Yiming
    Yuan, Kailong
    Xiao, Weiqiang
    Wu, Jizhong
    Shi, Chunyun
    Xia, Jun
    Chu, Guohai
    Zhang, Guangxin
    Zhou, Guojun
    [J]. ANALYTICA CHIMICA ACTA, 2016, 909 : 30 - 40
  • [8] A Comparative Investigation of the Combined Effects of Pre-Processing, Wavelength Selection, and Regression Methods on Near-Infrared Calibration Model Performance
    Wan, Jian
    Chen, Yi-Chieh
    Morris, A. Julian
    Thennadil, Suresh N.
    [J]. APPLIED SPECTROSCOPY, 2017, 71 (07) : 1432 - 1446
  • [9] Spectral Pre-Processing and Multivariate Calibration Methods for the Prediction of Wood Density in Chinese White Poplar by Visible and Near Infrared Spectroscopy
    Li, Ying
    Wang, Guozhong
    Guo, Gensheng
    Li, Yaoxiang
    Via, Brian K.
    Pei, Zhiyong
    [J]. FORESTS, 2022, 13 (01):
  • [10] Exploring the Steps of Infrared (IR) Spectral Analysis: Pre-Processing, (Classical) Data Modelling, and Deep Learning
    Mokari, Azadeh
    Guo, Shuxia
    Bocklitz, Thomas
    [J]. MOLECULES, 2023, 28 (19):