Machine learning methods for predictive proteomics

Cited by: 47
Authors
Barla, Annalisa [1]
Jurman, Giuseppe [1]
Riccadonna, Samantha [1]
Merler, Stefano [1]
Chierici, Marco [1]
Furlanello, Cesare [1]
Affiliations
[1] FBK, MPBA Unit, I-38100 Trento, Italy
Keywords
proteomics; selection bias; feature selection; functional profiling
DOI
10.1093/bib/bbn008
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The search for predictive biomarkers of disease from high-throughput mass spectrometry (MS) data requires a complex analysis path. Preprocessing and machine-learning modules are pipelined, starting from raw spectra, to set up a predictive classifier based on a shortlist of candidate features. As a machine-learning problem, proteomic profiling on MS data calls for the same caution as the microarray case. The risk of overfitting and of selection bias is pervasive: not only do potential features easily outnumber samples by a factor of 10^3, but it is also easy to overlook information leakage during preprocessing from spectra to peaks. The aim of this review is to explain how to build a general-purpose design analysis protocol (DAP) for predictive proteomic profiling: we show how to limit leakage due to parameter tuning and how to organize classification and ranking on large numbers of replicate versions of the original data to avoid selection bias. The DAP can be used with alternative components, i.e. with different preprocessing methods (peak clustering or wavelet based), classifiers (e.g. Support Vector Machine, SVM) or feature ranking methods (recursive feature elimination or I-Relief). A procedure for assessing the stability and predictive value of the resulting biomarker list is also provided. The approach is exemplified with experiments on synthetic datasets (from the Cromwell MS simulator) and with publicly available datasets from cancer studies.
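The selection bias that the abstract warns about can be illustrated with a minimal sketch. This is not the authors' DAP: it assumes pure-noise synthetic data (40 samples, 1000 features, labels unrelated to the features), a simple mean-difference ranking in place of recursive feature elimination or I-Relief, and a nearest-centroid classifier in place of an SVM. All function names are hypothetical. It compares ranking features once on the full dataset (leaky) with ranking them again inside each training fold (honest), averaged over a few replicate splits.

```python
import random


def rank_features(X, y, k):
    """Indices of the k features with the largest absolute class-mean difference."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        diff = sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        scores.append(abs(diff))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def count_correct(X_tr, y_tr, X_te, y_te, feats):
    """Nearest-centroid predictions on the selected features; returns number correct."""
    centroids = {}
    for label in (0, 1):
        rows = [r for r, l in zip(X_tr, y_tr) if l == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    correct = 0
    for row, label in zip(X_te, y_te):
        dist = {c: sum((row[j] - m) ** 2 for j, m in zip(feats, centroids[c]))
                for c in centroids}
        correct += int(min(dist, key=dist.get) == label)
    return correct


def stratified_folds(y, n_folds, rng):
    """Deal shuffled indices of each class round-robin into n_folds test folds."""
    folds = [[] for _ in range(n_folds)]
    for label in (0, 1):
        idx = [i for i, l in enumerate(y) if l == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def cv_accuracy(X, y, k, n_folds, rng, select_inside):
    """Cross-validated accuracy; select_inside=False leaks test data into ranking."""
    leaky_feats = None if select_inside else rank_features(X, y, k)
    correct = 0
    for test_idx in stratified_folds(y, n_folds, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_tr, y_tr = [X[i] for i in train_idx], [y[i] for i in train_idx]
        X_te, y_te = [X[i] for i in test_idx], [y[i] for i in test_idx]
        feats = rank_features(X_tr, y_tr, k) if select_inside else leaky_feats
        correct += count_correct(X_tr, y_tr, X_te, y_te, feats)
    return correct / len(y)


rng = random.Random(0)
n_samples, n_features, k, n_replicates = 40, 1000, 10, 5
# Pure noise: by construction no feature carries information about the label.
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [i % 2 for i in range(n_samples)]

leaky = sum(cv_accuracy(X, y, k, 5, rng, False) for _ in range(n_replicates)) / n_replicates
honest = sum(cv_accuracy(X, y, k, 5, rng, True) for _ in range(n_replicates)) / n_replicates
print(f"selection outside CV (leaky): {leaky:.2f}")
print(f"selection inside CV (honest): {honest:.2f}")
```

Because the labels are independent of the features, an unbiased estimate should sit near chance level. The leaky estimate is typically much higher, which is the effect that ranking inside every resampling step is meant to remove.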
Pages: 119 - 128
Number of pages: 10
Related papers
50 records in total
  • [31] Development of a predictive model for nephrotoxicity during tacrolimus treatment using machine learning methods
    Noda, Tsubura
    Mizuno, Shotaro
    Mogushi, Kaoru
    Hase, Takeshi
    Iida, Yoritsugu
    Takeuchi, Katsuyuki
    Ishiwata, Yasuyoshi
    Nagata, Masashi
    [J]. BRITISH JOURNAL OF CLINICAL PHARMACOLOGY, 2024, 90 (03) : 675 - 683
  • [32] MACHINE LEARNING AND ARTIFICIAL INTELLIGENCE METHODS TO IDENTIFY CLINICAL FEATURES PREDICTIVE OF PROGRESSIVE MAFLD
    Salvati, Antonio
    De Rosa, Laura
    Salvati, Nicola
    Faita, Francesco
    Cavallone, Daniela
    Ricco, Gabriele
    Colombatto, Piero
    Coco, Barbara
    Romagnoli, Veronica
    Oliveri, Filippo
    Bonino, Ferruccio
    Brunetto, Maurizia R.
    [J]. HEPATOLOGY, 2021, 74 : 963A - 963A
  • [33] Predictive Maintenance Applications for Machine Learning
    Cline, Brad
    Niculescu, Radu Stefan
    Huffman, Duane
    Deckel, Bob
    [J]. 2017 ANNUAL RELIABILITY AND MAINTAINABILITY SYMPOSIUM, 2017
  • [34] Recent advances in predictive (machine) learning
    Friedman, Jerome H.
    [J]. JOURNAL OF CLASSIFICATION, 2006, 23 (02) : 175 - 197
  • [35] Recent Advances in Predictive (Machine) Learning
    Jerome H. Friedman
    [J]. Journal of Classification, 2006, 23 : 175 - 197
  • [36] Machine Learning Application in Predictive Maintenance
    Liulys, Karolis
    [J]. 2019 OPEN CONFERENCE OF ELECTRICAL, ELECTRONIC AND INFORMATION SCIENCES (ESTREAM), 2019
  • [37] APPLICATION OF MACHINE LEARNING IN PREDICTIVE MAINTENANCE
    Hung, Y.-S.
    Lai, S.-L.
    Chang, R.-I.
    [J]. Journal of Taiwan Society of Naval Architects and Marine Engineers, 2019, 38 (02) : 53 - 59
  • [38] Toward an Integrated Machine Learning Model of a Proteomics Experiment
    Neely, Benjamin A.
    Dorfer, Viktoria
    Martens, Lennart
    Bludau, Isabell
    Bouwmeester, Robbin
    Degroeve, Sven
    Deutsch, Eric W.
    Gessulat, Siegfried
    Kaell, Lukas
    Palczynski, Pawel
    Payne, Samuel H.
    Rehfeldt, Tobias Greisager
    Schmidt, Tobias
    Schwaemmle, Veit
    Uszkoreit, Julian
    Vizcaino, Juan Antonio
    Wilhelm, Mathias
    Palmblad, Magnus
    [J]. JOURNAL OF PROTEOME RESEARCH, 2023, : 681 - 696
  • [39] Investigation of machine learning techniques on proteomics: A comprehensive survey
    Sonsare, Pravinkumar M.
    Gunavathi, C.
    [J]. PROGRESS IN BIOPHYSICS & MOLECULAR BIOLOGY, 2019, 149 : 54 - 69
  • [40] Expanding the coverage of spatial proteomics: a machine learning approach
    Sun, Huangqingbo
    Li, Jiayi
    Murphy, Robert F.
    [J]. BIOINFORMATICS, 2024, 40 (02)