The use of Gene Ontology terms and KEGG pathways for analysis and prediction of oncogenes

Cited by: 50
Authors
Xing, Zhihao [1]
Chu, Chen [2]
Chen, Lei [3]
Kong, Xiangyin [1]
Affiliations
[1] Chinese Acad Sci, Shanghai Inst Biol Sci, Inst Hlth Sci, Shanghai 200031, Peoples R China
[2] Chinese Acad Sci, Shanghai Inst Biol Sci, Inst Biochem & Cell Biol, Shanghai 200031, Peoples R China
[3] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Oncogenes; Gene Ontology; KEGG pathway; Minimum redundancy maximum relevance; Incremental feature selection; Random forest; PROTEIN INTERACTION NETWORKS; HUMAN CANCER; EXPRESSION; RECEPTOR; IDENTIFICATION; DIFFERENTIATION; TRANSFORMATION; POLYMORPHISMS; ASSOCIATIONS; RELEVANCE;
DOI
10.1016/j.bbagen.2016.01.012
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Oncogenes are genes that have the potential to cause cancer. Most normal cells undergo programmed cell death (apoptosis), but activated oncogenes can help cells evade apoptosis and survive. Studying oncogenes therefore helps in understanding the formation and development of various types of cancer.
Methods: In this study, we propose a computational method, called OPM, for investigating oncogenes from the perspective of Gene Ontology (GO) and biological pathways. All investigated genes, including validated oncogenes retrieved from public databases and other genes that have not been reported to be oncogenes thus far, were encoded into numeric vectors according to the enrichment theory of GO terms and KEGG pathways. Two popular feature selection methods, minimum redundancy maximum relevance and incremental feature selection, and a machine learning algorithm, random forest, were used to analyze the numeric vectors and extract key GO terms and KEGG pathways.
Results: GO terms and KEGG pathways were discussed in terms of their relevance to oncogenes. Important GO terms and KEGG pathways were extracted using the feature selection methods and were confirmed to be highly related to oncogenes. The importance of these terms and pathways for predicting oncogenes was further demonstrated by using them to find new putative oncogenes.
Conclusions: This study investigated oncogenes based on GO terms and KEGG pathways. Some important GO terms and KEGG pathways were confirmed to be highly related to oncogenes. We hope that these GO terms and KEGG pathways can provide new insight for the study of oncogenes, particularly for building more effective prediction models to identify novel oncogenes. The program is available upon request.
General significance: We hope that the findings of this study may provide new insight for the investigation of oncogenes.
This article is part of a Special Issue entitled "System Genetics". (C) 2016 Published by Elsevier B.V.
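The encoding step described in the Methods can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so this example assumes a common choice: the enrichment score of a term for a gene is the negative base-10 logarithm of the hypergeometric tail probability of the overlap between a gene set associated with that gene and the term's annotated genes. The function names, gene names, and term identifiers below are hypothetical and are not taken from the paper or the OPM program.

```python
from math import comb, log10


def enrichment_score(gene_set, term_genes, background_size):
    """Return -log10 of the hypergeometric tail P(X >= overlap).

    Assumed scoring rule (not confirmed by the abstract).
    gene_set:        genes associated with the gene being encoded
    term_genes:      genes annotated to one GO term or KEGG pathway
    background_size: total number of genes in the background
    """
    n = len(gene_set)
    big_m = len(term_genes)
    overlap = len(gene_set & term_genes)
    # Sum the probabilities of seeing `overlap` or more annotated genes.
    favourable = sum(
        comb(big_m, k) * comb(background_size - big_m, n - k)
        for k in range(overlap, min(big_m, n) + 1)
    )
    p_value = favourable / comb(background_size, n)
    # max() normalises the -0.0 that arises when p_value is exactly 1.
    return max(0.0, -log10(p_value))


def encode_gene(gene_set, term_to_genes, background_size):
    """Encode one gene as a vector of scores, one per term (sorted by id)."""
    return [
        enrichment_score(gene_set, term_to_genes[term], background_size)
        for term in sorted(term_to_genes)
    ]


# Hypothetical toy data: 20 background genes and two made-up terms.
term_to_genes = {
    "GO:term_A": {"g0", "g1", "g2", "g3", "g4"},
    "hsa:pathway_B": {"g10", "g11", "g12", "g13", "g14"},
}
gene_set = {"g0", "g1", "g2", "g15"}
vector = encode_gene(gene_set, term_to_genes, background_size=20)
print(vector)
```

In this toy case the first component is larger because three of the four genes fall in the first term, while the second term has no overlap and scores zero. Vectors of this kind would then be passed to feature ranking and a classifier such as random forest, which this sketch does not cover.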
Pages: 2725 - 2734
Number of pages: 10
Related papers
50 items in total
  • [31] The correlation of gene expression and co-regulated gene patterns in characteristic KEGG pathways
    Hua, Lin
    Li, Dong-guo
    Lin, Hui
    Li, Lin
    Li, Xia
    Liu, Zhi-Cheng
    JOURNAL OF THEORETICAL BIOLOGY, 2010, 266 (02) : 242 - 249
  • [32] Identification of compound-protein interactions through the analysis of gene ontology, KEGG enrichment for proteins and molecular fragments of compounds
    Chen, Lei
    Zhang, Yu-Hang
    Zheng, Mingyue
    Huang, Tao
    Cai, Yu-Dong
    MOLECULAR GENETICS AND GENOMICS, 2016, 291 (06) : 2065 - 2079
  • [33] A comprehensive evaluation of differentially expressed mRNAs and lncRNAs in cystitis glandularis with gene ontology, KEGG pathway, and ceRNA network analysis
    Li, Chao
    Hu, Jiao
    Liu, Peihua
    Li, Qiaqia
    Chen, Jinbo
    Cui, Yu
    Zhou, Xu
    Xue, Bichen
    Zhang, Xin
    Gao, Xin
    Zu, Xiongbing
    TRANSLATIONAL ANDROLOGY AND UROLOGY, 2020, 9 (02) : 232 - 242
  • [34] A New Feature Vector Based on Gene Ontology Terms for Protein-Protein Interaction Prediction
    Bandyopadhyay, Sanghamitra
    Mallick, Koushik
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2017, 14 (04) : 762 - 770
  • [35] Linking molecular function and biological process terms in the gene ontology for gene expression data analysis
    DeJongh, M
    Van Dort, P
    Ramsay, B
    PROCEEDINGS OF THE 26TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2004, 26 : 2984 - 2986
  • [36] deepSimDEF: deep neural embeddings of gene products and gene ontology terms for functional analysis of genes
    Pesaranghader, Ahmad
    Matwin, Stan
    Sokolova, Marina
    Grenier, Jean-Christophe
    Beiko, Robert G.
    Hussin, Julie
    BIOINFORMATICS, 2022, 38 (11) : 3051 - 3061
  • [37] Modeling biochemical pathways in the gene ontology
    Hill, David P.
    D'Eustachio, Peter
    Berardini, Tanya Z.
    Mungall, Christopher J.
    Renedo, Nikolai
    Blake, Judith A.
    DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, 2016,
  • [38] Prediction of Gene Phenotypes Based on GO and KEGG Pathway Enrichment Scores
    Zhang, Tao
    Jiang, Min
    Chen, Lei
    Niu, Bing
    Cai, Yudong
    BIOMED RESEARCH INTERNATIONAL, 2013, 2013
  • [39] Probabilistic Latent Semantic Analysis for prediction of Gene Ontology annotations
    Masseroli, Marco
    Chicco, Davide
    Pinoli, Pietro
    2012 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2012,
  • [40] Bayesian assignment of gene ontology terms to gene expression experiments
    Sykacek, P.
    BIOINFORMATICS, 2012, 28 (18) : I603 - I610