Parallel pre-processing through orthogonalization (PORTO) and its application to near-infrared spectroscopy
Cited by: 21
Authors:
Mishra, Puneet [1]
Roger, Jean Michel [2,3]
Marini, Federico [4]
Biancolillo, Alessandra [5]
Rutledge, Douglas N. [6,7]
Affiliations:
[1] Wageningen Food & Biobased Res, Bornse Weilanden 9,POB 17, NL-6700 AA Wageningen, Netherlands
[2] Univ Montpellier, Inst Agro, INRAE, ITAP, Montpellier, France
[3] ChemHouse Res Grp, Montpellier, France
[4] Univ Roma La Sapienza, Dept Chem, Piazzale Aldo Moro 5, I-00185 Rome, Italy
[5] Univ Aquila, Dept Phys & Chem Sci, Via Vetoio 67100, I-67100 L'Aquila, Italy
[6] Univ Paris Saclay, AgroParisTech, INRAE, UMR SayFood, F-75005 Paris, France
[7] Charles Sturt Univ, Natl Wine & Grape Ind Ctr, Wagga Wagga, NSW, Australia
Keywords:
Multi-block data analysis;
Data fusion;
Partial least squares;
Pre-processing;
Parallel and orthogonalized partial least squares (PO-PLS);
COMMON;
DOI:
10.1016/j.chemolab.2020.104190
Chinese Library Classification number:
TP [Automation technology, computer technology];
Discipline classification code:
0812;
Abstract:
Data generated from spectroscopy may be deformed by artefacts due to a range of physical, chemical and environmental factors that are not of interest for the characterization of the samples under study. For example, data acquired by near-infrared (NIR) spectroscopy in the diffuse reflectance mode can be affected by light scattering. This artefact, if not reduced or removed by spectral pre-processing, can complicate the multivariate data analysis. However, different pre-processing approaches correct these effects in different ways. For example, differentiation can reveal underlying bands, while spectral normalization techniques such as standard normal variate (SNV) can correct for multiplicative and additive effects. Combining multiple pre-processing techniques can lead to better results. However, it is not feasible for a user to explore all possible combinations of pre-processing techniques. In the present work, a new pre-processing fusion approach, based on the framework of separating common and distinct components in multi-block multivariate data analysis, is demonstrated. The approach utilizes parallel and orthogonalized partial least squares (PO-PLS) regression for the parallel fusion of multiple pre-processing techniques applied to the same data. The results obtained on 4 different NIR spectroscopic data sets related to the assessment of fruit quality and used as benchmark are compared to those of the recently developed sequential preprocessing through orthogonalization (SPORT) approach: it is found that, in all the cases, the PO-PLS approach leads to slightly better performances. Furthermore, a clear understanding of the common and distinct information present in the data sets after each pre-treatment was obtained. Parallel pre-processing through orthogonalization (PORTO) can be seen as parallel boosting of multiple pre-processing techniques to improve model performances.
Pages: 9
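
The abstract notes that standard normal variate (SNV) normalization can correct for multiplicative and additive effects such as light scattering. A minimal sketch of that single step follows, assuming the usual row-wise definition of SNV (centre each spectrum on its own mean, then divide by its own standard deviation). The function name `snv`, the synthetic band, and the gain and offset values are illustrative assumptions, not taken from the paper, and the sketch does not implement PO-PLS or PORTO.

```python
import math
from statistics import fmean, stdev


def snv(spectrum):
    """Standard normal variate for one spectrum (assumed row-wise definition)."""
    mu = fmean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(value - mu) / sigma for value in spectrum]


# Synthetic single-band "spectrum": hypothetical values, not data from the paper.
wavelengths = [1000 + 10 * i for i in range(151)]
base = [math.exp(-((w - 1450) / 60) ** 2) for w in wavelengths]

# Simulated scatter: multiplicative gain 1.7 and additive offset 0.3 (assumed values).
scattered = [1.7 * value + 0.3 for value in base]

# After SNV, the clean and the scattered spectrum should coincide up to rounding error.
max_difference = max(abs(a - b) for a, b in zip(snv(base), snv(scattered)))
print(max_difference < 1e-9)
```

Because SNV uses only the statistics of each individual spectrum, a constant offset is removed by the centring step and a positive gain is removed by the scaling step. Effects that vary with wavelength are not addressed by this step alone.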