CogNet: classification of gene expression data based on ranked active-subnetwork-oriented KEGG pathway enrichment analysis

Authors
Yousef M. [1 ,2 ]
Ülgen E. [3 ]
Sezerman O.U. [3 ]
Affiliations
[1] Galilee Digital Health Research Center (GDH), Zefat Academic College, Zefat
[2] Department of Information Systems, Zefat Academic College, Zefat
[3] Department of Biostatistics and Medical Informatics, School of Medicine, Acibadem Mehmet Ali Aydinlar University, Istanbul
Source
Yousef, Malik (malik.yousef@zefat.ac.il) | 1600 / PeerJ Inc. / Issue 07
Keywords
Bioinformatics; Classification; Data mining; Data science; Enrichment analysis; Gene expression; Genomics; KEGG pathway; Machine learning; Rank
DOI
10.7717/PEERJ-CS.336
Abstract
Most traditional gene selection approaches are borrowed from other fields such as statistics and computer science. However, they do not prioritize biologically relevant genes, since their ultimate goal is to determine features that optimize model performance metrics, not to build a biologically meaningful model. Therefore, there is an urgent need for new computational tools that integrate biological knowledge about the data into the process of gene selection and machine learning. Integrative gene selection enables the incorporation of biological domain knowledge from external biological resources. In this study, we propose CogNet, an integrative gene selection tool that exploits biological knowledge to group genes for the computational modeling tasks of ranking and classification. In CogNet, pathfindR serves as the biological grouping tool, allowing the main algorithm to rank active-subnetwork-oriented KEGG pathway enrichment analysis results and build a biologically relevant model. CogNet provides a list of significant KEGG pathways that can classify the data with very high accuracy. The list also provides the differentially expressed genes belonging to these pathways, which are used as features in the classification problem. The list facilitates deeper analysis and better interpretability of the role of KEGG pathways in classifying the data, thus better establishing the biological relevance of these differentially expressed genes. Although the main aim of our study is not to improve on the accuracy of any existing tool, CogNet outperforms a similar approach called maTE and performs comparably to other similar tools, including SVM-RCE. CogNet was tested on 13 gene expression datasets concerning a variety of diseases. © 2021 Yousef et al.
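The idea of ranking gene groups by how well they classify samples can be illustrated with a minimal sketch. This is not CogNet's implementation and it does not call pathfindR (an R package). The gene names, the two pathway groups, the toy expression values, the nearest-centroid classifier, and the leave-one-out scoring are all assumptions chosen only to keep the example self-contained.

```python
from statistics import mean


def nearest_centroid_predict(train_rows, train_labels, row):
    """Return the label whose class centroid is closest to `row`."""
    centroids = {}
    for label in set(train_labels):
        members = [r for r, l in zip(train_rows, train_labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*members)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # sorted() makes tie-breaking deterministic
    return min(sorted(centroids), key=lambda lab: sq_dist(centroids[lab], row))


def loo_accuracy(rows, labels):
    """Leave-one-out accuracy of the nearest-centroid classifier."""
    hits = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_rows, train_labels, rows[i]) == labels[i]:
            hits += 1
    return hits / len(rows)


def rank_gene_groups(expression, labels, groups):
    """Score each gene group by classification accuracy, best first.

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of class labels, one per sample
    groups:     dict mapping group (pathway) name -> list of gene names
    """
    scores = []
    for name, genes in groups.items():
        present = [g for g in genes if g in expression]
        if not present:
            continue  # skip groups with no measured genes
        rows = [[expression[g][s] for g in present] for s in range(len(labels))]
        scores.append((name, loo_accuracy(rows, labels)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 3 case samples followed by 3 control samples.
labels = ["case", "case", "case", "control", "control", "control"]
expression = {
    "G1": [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],  # separates the classes
    "G2": [3.0, 3.1, 2.9, 0.5, 0.4, 0.6],  # separates the classes
    "G3": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0],  # uninformative
    "G4": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],  # uninformative
}
groups = {"pathway_A": ["G1", "G2"], "pathway_B": ["G3", "G4"]}

ranking = rank_gene_groups(expression, labels, groups)
for name, accuracy in ranking:
    print(f"{name}: {accuracy:.2f}")
```

In this toy setting the group built from the class-separating genes ranks first. In a real analysis the groups would come from an enrichment tool and the scoring would use a proper cross-validation scheme on held-out data.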
Pages: 1 / 20
Page count: 19