Machine learning methods for predictive proteomics

Cited by: 47
Authors
Barla, Annalisa [1]
Jurman, Giuseppe [1]
Riccadonna, Samantha [1]
Merler, Stefano [1]
Chierici, Marco [1]
Furlanello, Cesare [1]
Affiliation
[1] FBK, MPBA Unit, I-38100 Trento, Italy
Keywords
proteomics; selection bias; feature selection; functional profiling
DOI
10.1093/bib/bbn008
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
The search for predictive biomarkers of disease from high-throughput mass spectrometry (MS) data requires a complex analysis path. Preprocessing and machine-learning modules are pipelined, starting from raw spectra, to set up a predictive classifier based on a shortlist of candidate features. As a machine-learning problem, proteomic profiling on MS data calls for the same caution as the microarray case. The risk of overfitting and of selection bias is pervasive: not only do potential features easily outnumber samples by a factor of 10^3, but it is also easy to overlook information leakage during preprocessing from spectra to peaks. The aim of this review is to explain how to build a general-purpose design analysis protocol (DAP) for predictive proteomic profiling: we show how to limit leakage due to parameter tuning and how to organize classification and ranking on large numbers of replicate versions of the original data to avoid selection bias. The DAP can be used with alternative components, i.e. with different preprocessing methods (peak clustering or wavelet based), classifiers (e.g. Support Vector Machine, SVM), or feature ranking methods (recursive feature elimination or I-Relief). A procedure for assessing the stability and predictive value of the resulting biomarker list is also provided. The approach is exemplified with experiments on synthetic datasets (from the Cromwell MS simulator) and with publicly available datasets from cancer studies.
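The selection bias mentioned in the abstract can be illustrated with a small experiment. The sketch below is not the authors' DAP: it is a toy, standard-library-only Python example under stated assumptions. The data are pure Gaussian noise (not Cromwell simulator output), features are ranked by the absolute difference of class means (a stand-in for recursive feature elimination or I-Relief), and the classifier is a nearest-centroid rule (a stand-in for an SVM). All function names are hypothetical. Because the labels carry no signal, an honest accuracy estimate should be near chance, while ranking features on the full dataset before cross-validation is expected to inflate it.

```python
import random

def class_means(X, y, rows, j):
    """Mean of feature j within each class, using only the given rows."""
    sums, counts = [0.0, 0.0], [0, 0]
    for i in rows:
        sums[y[i]] += X[i][j]
        counts[y[i]] += 1
    return sums[0] / counts[0], sums[1] / counts[1]

def top_features(X, y, rows, k):
    """Indices of the k features with the largest class-mean gap on `rows`."""
    def score(j):
        m0, m1 = class_means(X, y, rows, j)
        return abs(m0 - m1)
    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def centroid_accuracy(X, y, train, test, feats):
    """Fit a nearest-centroid rule on `train` and score it on `test`."""
    cents = [class_means(X, y, train, j) for j in feats]  # (mean0, mean1) pairs
    hits = 0
    for i in test:
        d0 = sum((X[i][j] - c[0]) ** 2 for j, c in zip(feats, cents))
        d1 = sum((X[i][j] - c[1]) ** 2 for j, c in zip(feats, cents))
        predicted = 1 if d1 < d0 else 0
        hits += int(predicted == y[i])
    return hits / len(test)

def cross_validate(X, y, k_feats, n_folds, select_inside):
    """Mean CV accuracy; feature ranking is done per fold or once on all data."""
    n = len(X)
    all_rows = list(range(n))
    folds = [list(range(f, n, n_folds)) for f in range(n_folds)]
    # Ranking on all rows lets the held-out labels leak into the feature list.
    leaky_feats = None if select_inside else top_features(X, y, all_rows, k_feats)
    accs = []
    for test in folds:
        held_out = set(test)
        train = [i for i in all_rows if i not in held_out]
        if select_inside:
            feats = top_features(X, y, train, k_feats)  # training rows only
        else:
            feats = leaky_feats
        accs.append(centroid_accuracy(X, y, train, test, feats))
    return sum(accs) / len(accs)

# Pure-noise data: 40 samples, 1000 features, labels unrelated to the features.
rng = random.Random(0)
n_samples, n_features = 40, 1000
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [i % 2 for i in range(n_samples)]

biased = cross_validate(X, y, k_feats=10, n_folds=5, select_inside=False)
unbiased = cross_validate(X, y, k_feats=10, n_folds=5, select_inside=True)
print(f"ranking outside CV: {biased:.2f}")
print(f"ranking inside CV:  {unbiased:.2f}")
```

The only difference between the two runs is where the ranking step sits relative to the train/test split. Keeping every data-dependent step (ranking, and by the same argument preprocessing and parameter tuning) inside the resampling loop is the general idea behind the leakage control the abstract describes.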
Pages: 119-128
Page count: 10