CogNet: classification of gene expression data based on ranked active-subnetwork-oriented KEGG pathway enrichment analysis

Authors
Yousef M. [1, 2]
Ülgen E. [3 ]
Sezerman O.U. [3 ]
Affiliations
[1] Galilee Digital Health Research Center (GDH), Zefat Academic College, Zefat
[2] Department of Information Systems, Zefat Academic College, Zefat
[3] Department of Biostatistics and Medical Informatics, School of Medicine, Acibadem Mehmet Ali Aydinlar University, Istanbul
Source
Yousef, Malik (malik.yousef@zefat.ac.il) | PeerJ Inc., 2021, vol. 7
Keywords
Bioinformatics; Classification; Data mining; Data science; Enrichment analysis; Gene expression; Genomics; KEGG pathway; Machine learning; Rank
DOI
10.7717/PEERJ-CS.336
Abstract
Most traditional gene selection approaches are borrowed from other fields such as statistics and computer science. However, they do not prioritize biologically relevant genes, since their ultimate goal is to determine features that optimize model performance metrics, not to build a biologically meaningful model. Therefore, there is a pressing need for new computational tools that integrate biological knowledge about the data into the process of gene selection and machine learning. Integrative gene selection enables the incorporation of biological domain knowledge from external biological resources. In this study, we propose CogNet, an integrative gene selection tool that exploits biological knowledge to group genes for the computational modeling tasks of ranking and classification. In CogNet, pathfindR serves as the biological grouping tool, allowing the main algorithm to rank the results of active-subnetwork-oriented KEGG pathway enrichment analysis and build a biologically relevant model. CogNet provides a list of significant KEGG pathways that can classify the data with very high accuracy. The list also provides the differentially expressed genes belonging to these pathways, which are used as features in the classification problem. The list facilitates deeper analysis and better interpretability of the role of KEGG pathways in classifying the data, thus better establishing the biological relevance of these differentially expressed genes. Although the main aim of our study is not to improve on the accuracy of any existing tool, CogNet outperforms a similar approach called maTE while performing comparably to other similar tools, including SVM-RCE. CogNet was tested on 13 gene expression datasets concerning a variety of diseases. © 2021 Yousef et al.
Pages: 1 / 20
Page count: 19