An automatic generation of pre-processing strategy combined with machine learning multivariate analysis for NIR spectral data

Cited by: 8
Authors
Arianti, Nunik Destria [1]
Saputra, Edo [2, 3]
Sitorus, Agustami [4, 5]
Affiliations
[1] Nusa Putra Univ, Dept Informat Syst, Sukabumi 43155, Indonesia
[2] Univ Riau, Fac Agr, Dept Agr Technol, Pekanbaru 28293, Indonesia
[3] IPB Univ, Agr Engn Study Program, Bogor 16680, Indonesia
[4] Natl Res & Innovat Agcy BRIN, Res Ctr Appropriate Technol, Subang 41213, Indonesia
[5] King Mongkuts Inst Technol Ladkrabang, Sch Engn, Dept Agr Engn, Bangkok 10520, Thailand
Keywords
Ensemble pre-processing; Chemometrics; Machine learning; AGoES
DOI
10.1016/j.jafr.2023.100625
Chinese Library Classification code
S [Agricultural Sciences]
Discipline classification code
09
Abstract
Pre-processing of near-infrared (NIR) spectral data is indispensable in multivariate analysis because the measured spectra of complex samples are often affected by strong background signals, light scattering, varying noise, and other unexpected factors. Various pre-processing methods have been developed to remove or reduce these effects. Until now, most applications of NIR spectral pre-processing in multivariate calibration have relied on trial and error, with the choice of method depending on the nature of the data and on the practitioner's expertise and experience. It is therefore usually difficult to determine the best pre-processing method for a given dataset. To address this problem, this study proposes a new data pre-processing concept: automatic generation of a pre-processing strategy (AGoES). The concept belongs to the family of ensemble pre-processing methods, in which machine learning algorithms (PLSR, SVM, k-NN, DT, AB, and GPR) built on differently pre-processed data are combined through 5-fold cross-validation and grid search optimization. To evaluate the concept, a public NIR spectral dataset was used to predict three responses of manure organic waste: dry matter content (DM), organic matter content (OM), and ammonium nitrogen content (AN). The results show that SVM combined with AGoES pre-processing is the best approach for predicting DM and AN, with ratios of prediction to deviation (RPD) of 3.619 and 2.996, respectively. AB in tandem with AGoES pre-processing is the best strategy for predicting OM, with an RPD of 3.185. Within the AGoES framework, pre-processing is therefore unsupervised and simpler, and applying multivariate analysis with machine learning algorithms becomes feasible.
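The selection idea described in the abstract (candidate pre-processing methods crossed with model settings, scored by 5-fold cross-validation, best combination kept) can be illustrated with a minimal sketch. This is not the authors' AGoES implementation. The two pre-processors (standard normal variate and a first difference), the plain k-NN regressor standing in for the six algorithms, the interleaved fold assignment, the synthetic spectra, and every function name are assumptions made for illustration. RPD is computed here as the standard deviation of the reference values divided by the root-mean-square error of the cross-validated predictions, a common chemometric definition that the abstract does not spell out.

```python
import math
import random
import statistics


def raw(spectrum):
    """No pre-processing (baseline candidate)."""
    return list(spectrum)


def snv(spectrum):
    """Standard normal variate: centre and scale a single spectrum."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(v - mean) / sd for v in spectrum]


def first_difference(spectrum):
    """Crude first derivative: difference between adjacent points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def knn_predict(train_x, train_y, query, k):
    """Average the responses of the k nearest training spectra."""
    pairs = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    return statistics.fmean(target for _, target in pairs[:k])


def rpd(y_true, y_pred):
    """SD of the reference values divided by the RMSE of the predictions."""
    mse = statistics.fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.inf if mse == 0 else statistics.stdev(y_true) / math.sqrt(mse)


def cross_validated_rpd(x, y, preprocess, k, n_folds=5):
    """Score one (pre-processing, k) pair with interleaved n-fold CV."""
    xp = [preprocess(s) for s in x]
    preds = [0.0] * len(y)
    for fold in range(n_folds):
        train = [i for i in range(len(y)) if i % n_folds != fold]
        train_x = [xp[i] for i in train]
        train_y = [y[i] for i in train]
        for i in range(fold, len(y), n_folds):
            preds[i] = knn_predict(train_x, train_y, xp[i], k)
    return rpd(y, preds)


def select_strategy(x, y, preprocessors, k_grid):
    """Grid search: score every combination and return the best one."""
    scores = {
        (name, k): cross_validated_rpd(x, y, fn, k)
        for name, fn in preprocessors.items()
        for k in k_grid
    }
    return max(scores, key=scores.get), scores


def make_demo_data(seed=0, n_samples=40, n_points=20):
    """Synthetic spectra: a peak whose height is the response, plus a random baseline."""
    rng = random.Random(seed)
    y = [rng.uniform(1.0, 5.0) for _ in range(n_samples)]
    x = []
    for c in y:
        offset = rng.uniform(-1.0, 1.0)
        x.append([
            offset + c * math.exp(-((j - 10) / 3.0) ** 2) + rng.gauss(0.0, 0.01)
            for j in range(n_points)
        ])
    return x, y


x, y = make_demo_data()
preprocessors = {"raw": raw, "snv": snv, "first_difference": first_difference}
best, scores = select_strategy(x, y, preprocessors, k_grid=(1, 3, 5))
print("best (pre-processing, k):", best, "RPD:", round(scores[best], 3))
```

The same loop extends to more pre-processing candidates or to other regressors by enlarging the dictionary and the grid. Which combination wins depends entirely on the data, so the synthetic result says nothing about the RPD values reported in the abstract.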
Pages: 9