Biomarker discovery for arsenic exposure using functional data. Analysis and feature learning of mass spectrometry proteomic data

Cited by: 29
Authors
Harezlak, Jaroslaw [1]
Wu, Michael C. [2]
Wang, Mike [3]
Schwartzman, Armin [2, 4]
Christiani, David C. [3]
Lin, Xihong [2]
Affiliations
[1] Indiana Univ, Sch Med, Dept Med, Indianapolis, IN 46202 USA
[2] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[3] Harvard Univ, Sch Publ Hlth, Dept Environm Hlth, Boston, MA 02115 USA
[4] Dana Farber Canc Inst, Dept Biostat & Computat Biol, Boston, MA 02115 USA
DOI
10.1021/pr070491n
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Plasma biomarkers of exposure to environmental contaminants play an important role in early detection of disease. The emerging field of proteomics presents an attractive opportunity for candidate biomarker discovery, as it simultaneously measures and analyzes a large number of proteins. This article presents a case study for measuring arsenic concentrations in a population residing in an As-endemic region of Bangladesh using plasma protein expressions measured by SELDI-TOF mass spectrometry. We analyze the data using a unified statistical method based on functional learning to preprocess mass spectra, extract mass spectrometry (MS) features, and associate the selected MS features with arsenic exposure measurements. The task is challenging due to several factors: the high dimensionality of mass spectrometry data, complicated error structures, and a multiple comparison problem. We use nonparametric functional regression techniques for MS modeling, peak detection based on the significant zero-downcrossing method, and peak alignment using a warping algorithm. Our results show significant associations of arsenic exposure with under- or overexpression of 20 proteins.
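The peak-detection idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a Gaussian kernel smoother stands in for the nonparametric functional regression, a plain height threshold (`min_height`) stands in for the significance test, and all function and parameter names are hypothetical. A peak is reported wherever the first derivative of the smoothed spectrum crosses zero from positive to non-positive.

```python
import numpy as np


def gaussian_smooth(y, sigma):
    """Smooth y with a truncated Gaussian kernel (sigma in samples).

    Assumption: a simple stand-in for a nonparametric regression fit.
    """
    radius = int(4 * sigma)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    # Edge-pad so the smoothed output has the same length as the input.
    padded = np.pad(y, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def zero_downcrossing_peaks(mz, intensity, sigma=3.0, min_height=0.0):
    """Return (peak m/z, smoothed peak height) at derivative zero-downcrossings.

    Assumption: min_height is an illustrative filter, not a significance test.
    """
    mz = np.asarray(mz, dtype=float)
    smooth = gaussian_smooth(np.asarray(intensity, dtype=float), sigma)
    deriv = np.gradient(smooth, mz)
    # A downcrossing lies between samples i and i+1 when the slope
    # changes from positive to non-positive.
    idx = np.where((deriv[:-1] > 0) & (deriv[1:] <= 0))[0]
    # Of the two neighbouring samples, keep the one with the larger value.
    peaks = np.where(smooth[idx + 1] > smooth[idx], idx + 1, idx)
    peaks = peaks[smooth[peaks] >= min_height]
    return mz[peaks], smooth[peaks]


# Synthetic two-peak spectrum (made-up values, for illustration only).
mz_grid = np.linspace(1000.0, 8000.0, 701)
spectrum = (5.0 * np.exp(-0.5 * ((mz_grid - 2000.0) / 50.0) ** 2)
            + 3.0 * np.exp(-0.5 * ((mz_grid - 5000.0) / 80.0) ** 2))
peak_mz, peak_height = zero_downcrossing_peaks(mz_grid, spectrum, min_height=0.5)
```

On real spectra the smoothing bandwidth and the threshold would have to be chosen with care, since noise produces many spurious downcrossings; that is the problem a significance-based criterion is meant to address.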
Pages: 217-224
Page count: 8
Related papers
50 in total
  • [21] TOFwave: reproducibility in biomarker discovery from time-of-flight mass spectrometry data
    Chierici, Marco
    Albanese, Davide
    Franceschi, Pietro
    Furlanello, Cesare
    MOLECULAR BIOSYSTEMS, 2012, 8 (11) : 2845 - 2849
  • [22] Extreme Learning Machine for Mass Spectrometry Data Analysis
    Ulloa Orellana, Mario
    Lopez-Cortes, Xaviera A.
    Zabala-Blanco, David
    Palacios Jativa, Pablo
    Datta, Jayanta
    2022 IEEE COLOMBIAN CONFERENCE ON COMMUNICATIONS AND COMPUTING, COLCOM, 2022,
  • [23] Machine Learning for Mass Spectrometry Data Analysis in Proteomics
    Li, Juntao
    Zhou, Kanglei
    Mu, Bingyu
    CURRENT PROTEOMICS, 2021, 18 (05) : 620 - 634
  • [24] Proteomic Workflows for Biomarker Identification Using Mass Spectrometry - Technical and Statistical Considerations during Initial Discovery
    Orton, Dennis J.
    Doucette, Alan A.
    PROTEOMES, 2013, 1 (02) : 109 - 127
  • [25] Biomarker discovery of metastasis in cutaneous squamous cell carcinoma using a mass spectrometry based proteomic approach
    Shapanis, A.
    Lai, C.
    Theaker, J.
    Schofield, J.
    Parkinson, E.
    Skipp, P.
    Healy, E.
    JOURNAL OF INVESTIGATIVE DERMATOLOGY, 2018, 138 (05) : S26 - S26
  • [26] Methodology for biomarker discovery with reproducibility in microbiome data using machine learning
    Rojas-Velazquez, David
    Kidwai, Sarah
    Kraneveld, Aletta D.
    Tonda, Alberto
    Oberski, Daniel
    Garssen, Johan
    Lopez-Rincon, Alejandro
    BMC BIOINFORMATICS, 2024, 25 (01)
  • [28] Classification Using Mass Spectrometry Proteomic Data with Kernel-Based Algorithms
    Liu, Zhenqiu
    Lin, Shili
    ENGINEERING LETTERS, 2006, 13 (03)
  • [29] Influence of different data analysis approaches on biomarker detection using proteomic data of pancreatic cancer
    Kohl, M.
    Sauer, T.
    Klempt-Giessing, K.
    Horn, J.
    Szymczak, S.
    Bober, C.
    Diercks, K.
    Gemoll, T.
    ONCOLOGY RESEARCH AND TREATMENT, 2023, 46 : 309 - 309
  • [30] Mixed effect modelling of proteomic mass spectrometry data by using Gaussian mixtures
    Browne, William J.
    Dryden, Ian L.
    Handley, Kelly
    Mian, Shahid
    Schadendorf, Dirk
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2010, 59 : 617 - 633