Semi-supervised gene shaving method for predicting low variation biological pathways from genome-wide data

Cited by: 2
Authors
Zhu, Dongxiao [1, 2]
Affiliations
[1] Univ New Orleans, Dept Comp Sci, New Orleans, LA 70148 USA
[2] Childrens Hosp, Res Inst Children, New Orleans, LA 70118 USA
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
SINGULAR-VALUE DECOMPOSITION; EXPRESSION; SET; INFORMATION; PATTERNS; NETWORK;
DOI
10.1186/1471-2105-10-S1-S54
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The gene shaving algorithm and many other clustering algorithms identify gene clusters showing high variation across samples. However, gene expression in many signaling pathways shows only modest and concordant changes that these methods fail to identify. The increasing availability of prior knowledge about signaling pathways provides a new opportunity to solve this problem.
Results: We propose an innovative semi-supervised gene clustering algorithm that extends and generalizes the original gene shaving algorithm so that prior knowledge of signaling pathways can be incorporated. Unlike other methods, our method identifies gene clusters showing concerted and modest expression variation as well as strong expression correlation. Using available pathway gene sets as prior knowledge, whether complete or incomplete, our algorithm is capable of forming tightly regulated gene clusters showing modest variation across samples. We demonstrate the advantages of our algorithm over the original gene shaving algorithm using two microarray data sets. The stability of the gene clusters was assessed using a jackknife approach.
Conclusion: Our algorithm is one of the first clustering algorithms designed specifically to identify signaling pathways with low and concordant gene expression variation. The discriminating power is achieved by constructing a principal component enriched by signaling pathways.
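For orientation, the shaving loop of the original, unsupervised gene shaving algorithm (the baseline the abstract compares against) can be sketched roughly as follows. This is not the paper's semi-supervised extension, whose details the abstract does not give. The function names, the `alpha` shaving fraction, and the power-iteration routine are illustrative assumptions, and the cluster-size selection step of the original method is omitted.

```python
import math
import random

def center_rows(X):
    """Subtract each gene's (row's) mean across samples."""
    return [[v - sum(row) / len(row) for v in row] for row in X]

def leading_component(X, iters=200, seed=0):
    """Leading principal component over samples, via power iteration on X^T X."""
    rng = random.Random(seed)
    n = len(X[0])
    v = [rng.random() + 0.1 for _ in range(n)]
    for _ in range(iters):
        scores = [sum(a * b for a, b in zip(row, v)) for row in X]            # X v
        w = [sum(scores[g] * X[g][j] for g in range(len(X))) for j in range(n)]  # X^T (X v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def shave(X, genes, alpha=0.1, min_size=1):
    """Return nested gene clusters, largest first (hypothetical helper).

    At each step, the fraction `alpha` of genes with the smallest absolute
    inner product with the leading principal component is shaved off.
    """
    Xc = center_rows(X)
    idx = list(range(len(genes)))
    clusters = [[genes[i] for i in idx]]
    while len(idx) > min_size:
        pc = leading_component([Xc[i] for i in idx])
        score = {i: abs(sum(a * b for a, b in zip(Xc[i], pc))) for i in idx}
        n_drop = max(1, int(alpha * len(idx)))
        keep = sorted(idx, key=lambda i: score[i], reverse=True)[:len(idx) - n_drop]
        idx = sorted(keep)
        clusters.append([genes[i] for i in idx])
    return clusters

# Toy data: genes "a" and "b" vary strongly and together; "c" and "d" barely vary.
toy = [[-2, -1, 1, 2], [-2.1, -0.9, 1.1, 1.9], [0.1, -0.1, 0.1, -0.1], [0, 0.1, -0.1, 0]]
nested = shave(toy, ["a", "b", "c", "d"], alpha=0.25)
```

Because the score is driven by the leading principal component, genes with small but concordant changes are shaved off early. This is the limitation the abstract sets out to address.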
Pages: 12
Related papers
50 in total
  • [31] Genes for g: A Novel Method for Analyzing Data from Genome-Wide Association Studies
    Carey, Gregory
    BEHAVIOR GENETICS, 2008, 38 (06) : 617 - 617
  • [32] IMPLICATIONS OF DNA METHYLATION FOR PTSD: FROM GENE-SPECIFIC TO GENOME-WIDE AND BIOLOGICAL AGING PATTERNS
    Vukojevic, Vanja
    Milnik, Annette
    de Quervain, Dominique J. -F.
    Papassotiropoulos, Andreas
    EUROPEAN NEUROPSYCHOPHARMACOLOGY, 2017, 27 : S311 - S311
  • [33] A semi-supervised short text sentiment classification method based on improved Bert model from unlabelled data
    Haochen Zou
    Zitao Wang
    Journal of Big Data, 10
  • [34] Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies
    Medina, Ignacio
    Montaner, David
    Bonifaci, Nuria
    Angel Pujana, Miguel
    Carbonell, Jose
    Tarraga, Joaquin
    Al-Shahrour, Fatima
    Dopazo, Joaquin
    NUCLEIC ACIDS RESEARCH, 2009, 37 : W340 - W344
  • [35] A semi-supervised short text sentiment classification method based on improved Bert model from unlabelled data
    Zou, Haochen
    Wang, Zitao
    JOURNAL OF BIG DATA, 2023, 10 (01)
  • [36] Identification of disease-associated pathways in pancreatic cancer by integrating genome-wide association study and gene expression data
    Long, Jin
    Liu, Zhe
    Wu, Xingda
    Xu, Yuanhong
    Ge, Chunlin
    ONCOLOGY LETTERS, 2016, 12 (01) : 537 - 543
  • [37] Predicting individual socioeconomic status from mobile phone data: a semi-supervised hypergraph-based factor graph approach
    Tao Zhao
    Hong Huang
    Xiaoming Yao
    Jar-der Luo
    Xiaoming Fu
    International Journal of Data Science and Analytics, 2020, 9 : 361 - 372
  • [38] Predicting individual socioeconomic status from mobile phone data: a semi-supervised hypergraph-based factor graph approach
    Zhao, Tao
    Huang, Hong
    Yao, Xiaoming
    Luo, Jar-der
    Fu, Xiaoming
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2020, 9 (03) : 361 - 372
  • [39] Enricherator: A Bayesian Method for Inferring Regularized Genome-wide Enrichments from Sequencing Count Data
    Schroeder, Jeremy W.
    Freddolino, P. Lydia
    JOURNAL OF MOLECULAR BIOLOGY, 2024, 436 (17)
  • [40] PANOGA: a web server for identification of SNP-targeted pathways from genome-wide association study data
    Bakir-Gungor, Burcu
    Egemen, Ece
    Sezerman, Osman Ugur
    BIOINFORMATICS, 2014, 30 (09) : 1287 - 1289