Comparison of machine learning approaches for the classification of elution profiles

Authors
Baccolo, Giacomo [1, 2]
Yu, Huiwen [2]
Valsecchi, Cecile [1]
Ballabio, Davide [1]
Bro, Rasmus [2]
Affiliations
[1] Univ Milano Bicocca, Dept Earth & Environm Sci, Pzza Sci 1, I-20126 Milan, Italy
[2] Univ Copenhagen, Dept Food Sci, Rolighedsvej 30, DK-1958 Frederiksberg C, Denmark
Keywords
Chromatography; PARAFAC2; Neural networks; Automatic analysis
DOI
10.1016/j.chemolab.2023.105002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Hyphenated chromatography is among the most popular analytical techniques in omics-related research. While great advances have been made on the experimental side, the same is not true for the extraction of relevant information from chromatographic data. Extensive signal preprocessing is required to remove the baseline signal, correct the time shifts of peaks from sample to sample, and properly estimate the spectra and concentrations of co-eluting compounds. Among the available strategies, curve resolution approaches such as PARAFAC2 ease the deconvolution and quantification of chemicals. However, not all resolved profiles are relevant: for example, some account for the baseline, others for chemical compounds. It is therefore necessary to distinguish the profiles that describe relevant chemistry. To assist researchers in this selection phase, we tested three classification algorithms (convolutional neural networks, recurrent neural networks and k-nearest neighbours) for the automatic identification of GC-MS elution profiles resolved by PARAFAC2. To this end, we manually labelled more than 170,000 elution profiles into four classes ('Peak', 'Cutoff peak', 'Baseline' and 'Others') in order to train, validate and test the classification models. The results highlight two main points: i) neural networks seem to be the best solution for this specific classification task, as confirmed by the overall quality of the classification; ii) the quality of the input data is crucial to maximizing modelling performance.
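To make the classification task in the abstract concrete, the following is a minimal sketch of the simplest of the three approaches, k-nearest neighbours, applied to the four classes named above. It is not the authors' implementation: the profiles are synthetic stand-ins rather than PARAFAC2-resolved GC-MS data, and the profile length, shape generators, noise level, Euclidean distance and k = 3 are all assumptions chosen for illustration. The neural network models are not shown.

```python
import math
import random
from collections import Counter

CLASSES = ["Peak", "Cutoff peak", "Baseline", "Others"]
N_POINTS = 50  # hypothetical number of time points per elution profile


def make_profile(label, rng):
    """Generate one synthetic, max-normalised profile (illustrative shapes only)."""
    t = range(N_POINTS)
    if label == "Peak":
        # Complete Gaussian peak well inside the time window
        centre, width = rng.uniform(15, 35), rng.uniform(2, 5)
        y = [math.exp(-0.5 * ((i - centre) / width) ** 2) for i in t]
    elif label == "Cutoff peak":
        # Gaussian centred on an edge, so only half of the peak is visible
        centre, width = rng.choice([0, N_POINTS - 1]), rng.uniform(3, 6)
        y = [math.exp(-0.5 * ((i - centre) / width) ** 2) for i in t]
    elif label == "Baseline":
        # Nearly flat signal with a slight linear drift
        slope = rng.uniform(-0.004, 0.004)
        y = [1.0 + slope * (i - N_POINTS / 2) for i in t]
    else:
        # "Others": assumed here to be unstructured noise
        y = [rng.random() for _ in t]
    y = [v + rng.gauss(0, 0.02) for v in y]  # small measurement noise
    top = max(abs(v) for v in y)
    return [v / top for v in y]


def knn_predict(train, profile, k=3):
    """Majority vote among the k training profiles closest in Euclidean distance."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], profile))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = [(make_profile(c, rng), c) for c in CLASSES for _ in range(60)]
test = [(make_profile(c, rng), c) for c in CLASSES for _ in range(20)]

predictions = [knn_predict(train, x) for x, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(f"k-NN accuracy on synthetic profiles: {accuracy:.2f}")
```

On such clean synthetic shapes the classes are easy to separate, so the accuracy printed here says nothing about performance on real elution profiles, where the abstract notes that input data quality is crucial.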
Pages: 8