A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data

Cited by: 0
Authors
Lai, En-Yu [1 ,2 ]
Chen, Yi-Hau [3 ]
Wu, Kun-Pin [1 ]
Affiliations
[1] Natl Yang Ming Univ, Inst Biomed Informat, Taipei 11221, Taiwan
[2] Acad Sinica, Inst Informat Sci, Taiwan Int Grad Program, Bioinformat Program, Taipei 11529, Taiwan
[3] Acad Sinica, Inst Stat Sci, Taipei 11529, Taiwan
Keywords
GENE-SET ANALYSIS; JAK/STAT PATHWAYS; EXPRESSION DATA; MICROARRAY; TESTS; PI3K/PTEN/AKT/MTOR; RAF/MEK/ERK; REVEALS; CELLS; TERMS;
DOI
10.1371/journal.pcbi.1005601
Chinese Library Classification
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Many approaches for identifying significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data remains difficult because of limited sample size. This limitation also leads to the common practice of using a competitive null, which fundamentally treats genes or proteins as independent units. The independence assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and produce false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, the sample covariance may not be a precise estimate when the sample size is very limited, which is usually the case for data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed from the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways with sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication.
We implemented the T2-statistic in an R package, T2GA, which is available at https://github.com/roqe/T2GA.
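The idea of replacing the sample covariance with a knowledge-derived one can be illustrated with a minimal Python sketch. This is not the T2GA implementation or its API: the function names `knowledge_covariance` and `t2_statistic` are hypothetical, and the sketch assumes that pairwise confidence scores in [0, 1] are used directly as off-diagonal correlations, that per-protein standard deviations are supplied by the caller, and that the statistic is a one-sample Hotelling-type quadratic form on the mean log-ratios.

```python
import numpy as np


def knowledge_covariance(scores, sd):
    """Build a covariance matrix from pairwise confidence scores.

    Assumption (not taken from the paper): scores in [0, 1] act as
    correlations between proteins, and `sd` holds per-protein
    standard deviations.
    """
    corr = np.array(scores, dtype=float)
    corr = (corr + corr.T) / 2.0          # enforce symmetry
    np.fill_diagonal(corr, 1.0)           # unit self-correlation
    # A score matrix need not be positive definite; clip eigenvalues
    # so the matrix can be inverted (this may shift the diagonal slightly).
    eigvals, eigvecs = np.linalg.eigh(corr)
    eigvals = np.clip(eigvals, 1e-8, None)
    corr = (eigvecs * eigvals) @ eigvecs.T
    sd = np.asarray(sd, dtype=float)
    return corr * np.outer(sd, sd)


def t2_statistic(x, cov):
    """One-sample quadratic form n * mean' * inv(cov) * mean.

    `x` has shape (n_samples, n_proteins) and holds log expression
    ratios for the proteins of one pathway.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    mean = x.mean(axis=0)
    return float(n * mean @ np.linalg.solve(cov, mean))


# Toy example: two proteins, two samples, confidence score 0.5.
scores = [[1.0, 0.5], [0.5, 1.0]]
cov = knowledge_covariance(scores, sd=[1.0, 1.0])
log_ratios = [[1.0, 2.0], [3.0, 4.0]]
t2 = t2_statistic(log_ratios, cov)
```

Because the covariance comes from prior knowledge rather than from the data, the quadratic form stays computable even when the number of samples is smaller than the number of proteins in the pathway, which is the situation the abstract describes.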
Pages: 29