FUNCTION-ON-SCALAR QUANTILE REGRESSION WITH APPLICATION TO MASS SPECTROMETRY PROTEOMICS DATA

Cited by: 0
Authors
Liu, Yusha [1]
Li, Meng [1]
Morris, Jeffrey S. [2]
Affiliations
[1] Rice Univ, Dept Stat, Houston, TX 77251 USA
[2] Univ Penn, Div Biostat, Philadelphia, PA 19104 USA
Source
ANNALS OF APPLIED STATISTICS | 2020 / Vol. 14 / No. 2
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
Bayesian hierarchical model; functional data analysis; functional response regression; global-local shrinkage; proteomic biomarker; quantile regression; EMPIRICAL LIKELIHOOD; HORSESHOE ESTIMATOR; PEAK DETECTION; SERUM; SPECTRA;
DOI
10.1214/19-AOAS1319
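Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract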
Mass spectrometry proteomics, characterized by spiky, spatially heterogeneous functional data, can be used to identify potential cancer biomarkers. Existing mass spectrometry analyses utilize mean regression to detect spectral regions that are differentially expressed across groups. However, given the interpatient heterogeneity that is a key hallmark of cancer, many biomarkers are only present at aberrant levels for a subset of, not all, cancer samples. Differences in these biomarkers can easily be missed by mean regression but might be more easily detected by quantile-based approaches. Thus, we propose a unified Bayesian framework to perform quantile regression on functional responses. Our approach utilizes an asymmetric Laplace working likelihood, represents the functional coefficients with basis representations which enable borrowing of strength from nearby locations and places a global-local shrinkage prior on the basis coefficients to achieve adaptive regularization. Different types of basis transform and continuous shrinkage priors can be used in our framework. A scalable Gibbs sampler is developed to generate posterior samples that can be used to perform Bayesian estimation and inference while accounting for multiple testing. Our framework performs quantile regression and coefficient regularization in a unified manner, allowing them to inform each other and leading to improvement in performance over competing methods, as demonstrated by simulation studies. We also introduce an adjustment procedure to the model to improve its frequentist properties of posterior inference. We apply our model to identify proteomic biomarkers of pancreatic cancer that are differentially expressed for a subset of cancer patients compared to the normal controls which were missed by previous mean-regression based approaches. Supplementary Material for this article is available online.
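As a rough illustration of two ideas in the abstract, the sketch below shows (a) that maximizing an asymmetric Laplace working likelihood in its location parameter is the same as minimizing the quantile check loss, and (b) that a shift affecting only a subset of samples appears more strongly in an upper quantile than in the mean. It is a plain grid search on simulated scalar data, not the authors' Bayesian functional model or Gibbs sampler; the function names, the 20% subset size, and the shift of 3 are all assumptions chosen for illustration.

```python
import numpy as np


def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * np.where(u < 0, tau - 1.0, tau)


def ald_neg_log_lik(y, mu, tau, sigma=1.0):
    """Negative log-likelihood of an asymmetric Laplace working model.

    Density: tau * (1 - tau) / sigma * exp(-rho_tau((y - mu) / sigma)).
    For fixed sigma, this is the summed check loss plus a constant in mu.
    """
    y = np.asarray(y, dtype=float)
    const = -y.size * np.log(tau * (1.0 - tau) / sigma)
    return const + check_loss(y - mu, tau).sum() / sigma


def grid_quantile(y, tau, grid):
    """Grid point that minimizes the working negative log-likelihood."""
    losses = [ald_neg_log_lik(y, m, tau) for m in grid]
    return grid[int(np.argmin(losses))]


# Simulated intensities at one spectral location (hypothetical data).
rng = np.random.default_rng(0)
controls = rng.normal(0.0, 1.0, size=2000)
cancer = rng.normal(0.0, 1.0, size=2000)
cancer[:400] += 3.0  # only 20% of cancer samples are elevated

grid = np.linspace(-4.0, 8.0, 1201)
mean_diff = cancer.mean() - controls.mean()
q90_diff = grid_quantile(cancer, 0.9, grid) - grid_quantile(controls, 0.9, grid)

print("difference in means:", mean_diff)
print("difference in 0.9 quantiles:", q90_diff)
```

In this toy setting the group difference at the 0.9 quantile is larger than the difference in means, which is the kind of subset effect the abstract argues mean regression can miss.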
Pages: 521 - 541
Number of pages: 21
Related papers
50 in total
  • [31] Online monitoring of profiles via function-on-scalar model with an application to industrial busbar
    Zhang, Wei
    Niu, Zhanwen
    He, Zhen
    He, Shuguang
    [J]. QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 2022, 38 (07) : 3816 - 3828
  • [32] Modelling Dominant Tree Heights of Fagus sylvatica L. Using Function-on-Scalar Regression Based on Forest Inventory Data
    Engel, Markus
    Mette, Tobias
    Falk, Wolfgang
    Poschenrieder, Werner
    Fridman, Jonas
    Skudnik, Mitja
    [J]. FORESTS, 2023, 14 (02):
  • [33] A common open representation of mass spectrometry data and its application to proteomics research
    Patrick G A Pedrioli
    Jimmy K Eng
    Robert Hubley
    Mathijs Vogelzang
    Eric W Deutsch
    Brian Raught
    Brian Pratt
    Erik Nilsson
    Ruth H Angeletti
    Rolf Apweiler
    Kei Cheung
    Catherine E Costello
    Henning Hermjakob
    Sequin Huang
    Randall K Julian
    Eugene Kapp
    Mark E McComb
    Stephen G Oliver
    Gilbert Omenn
    Norman W Paton
    Richard Simpson
    Richard Smith
    Chris F Taylor
    Weimin Zhu
    Ruedi Aebersold
    [J]. Nature Biotechnology, 2004, 22 : 1459 - 1466
  • [34] A common open representation of mass spectrometry data and its application to proteomics research
    Pedrioli, PGA
    Eng, JK
    Hubley, R
    Vogelzang, M
    Deutsch, EW
    Raught, B
    Pratt, B
    Nilsson, E
    Angeletti, RH
    Apweiler, R
    Cheung, K
    Costello, CE
    Hermjakob, H
    Huang, S
    Julian, RK
    Kapp, E
    McComb, ME
    Oliver, SG
    Omenn, G
    Paton, NW
    Simpson, R
    Smith, R
    Taylor, CF
    Zhu, WM
    Aebersold, R
    [J]. NATURE BIOTECHNOLOGY, 2004, 22 (11) : 1459 - 1466
  • [35] Proteomics: data analysis of mass spectrometry results
    Vandenbrouck, Y
    Garin, J
    Jaquinod, M
    Bruley, C
    [J]. BIOFUTUR, 2005, (252) : 27 - 31
  • [36] Identification of contaminants in proteomics mass spectrometry data
    Duncan, M
    Fung, K
    Wang, H
    Yen, C
    Cios, K
    [J]. PROCEEDINGS OF THE 2003 IEEE BIOINFORMATICS CONFERENCE, 2003, : 409 - 410
  • [37] A Mass Spectrometry Proteomics Data Management Platform
    Sharma, Vagisha
    Eng, Jimmy K.
    MacCoss, Michael J.
    Riffle, Michael
    [J]. MOLECULAR & CELLULAR PROTEOMICS, 2012, 11 (09) : 824 - 831
  • [38] Preprocessing of mass spectrometry proteomics data on the grid
    Cannataro, M
    Guzzi, PH
    Mazza, T
    Tradigo, G
    Veltri, P
    [J]. 18TH IEEE SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, PROCEEDINGS, 2005, : 549 - 554
  • [39] Mass spectrometry data analysis in the proteomics era
    Forner, Francesca
    Foster, Leonard J.
    Toppo, Stefano
    [J]. CURRENT BIOINFORMATICS, 2007, 2 (01) : 63 - 93
  • [40] The effect of vaccine on COVID-19 spread by function-on-scalar regression model: a case study of Africa
    Rizk, Zeinab
    Khan, Nasrullah
    [J]. JOURNAL OF PUBLIC HEALTH-HEIDELBERG, 2024, 32 (07) : 1177 - 1186