cTAP: A Machine Learning Framework for Predicting Target Genes of a Transcription Factor using a Cohort of Gene Expression Data Sets

Cited by: 1
Authors
Wang, Honglin [1]
Joshi, Pujan [1]
Hong, Seung-Hyun [1]
Maye, Peter F. [2]
Rowe, David W. [2]
Shin, Dong-Guk [1]
Affiliations
[1] Univ Connecticut, Comp Sci & Engn Dept, Storrs, CT 06269 USA
[2] Univ Connecticut, Dept Reconstruct Sci, Hlth Ctr, Farmington, CT 06030 USA
Keywords
TF target analysis; Cohort analysis; Osteoclast differentiation; Gene abundance analysis; Machine learning;
DOI
10.1109/BIBM49941.2020.9313303
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Identifying target genes of a transcription factor is crucial in biomedical research. Thanks to ChIP-seq technology, scientists can estimate potential genome-wide target genes of a transcription factor. However, finding the targets that a transcription factor consistently regulates Up or Down in a given biological context is difficult because it requires analyzing a large number of studies conducted under the same or a comparable context. We present a transcription factor target prediction method called the Cohort-based TF target prediction system (cTAP). The method assumes that the pathway involving the transcription factor of interest features multiple functional groups of marker genes pertaining to the biological process concerned. It uses the notions of gene-presence and gene-absence, in addition to log2 ratios of gene expression values, for prediction. Targets are predicted by applying multiple machine-learning models that learn the patterns of gene-presence and gene-absence from the log2 ratios and four types of Z scores derived from the normalized cohort's gene expression data. The learned patterns are then associated with the putative targets of the transcription factor concerned to elicit genes exhibiting Up/Down regulation patterns "consistently" within the cohort. In total, 11 publicly available GEO data sets related to osteoclastogenesis are used in our experiment. The models learned using gene-presence and gene-absence produce target genes, such as CASP1, BID, and IRF5, that differ from those produced using only log2 ratios. Our literature survey reveals that all of these predicted targets have known roles in bone remodeling, specifically in relation to the immune system and osteoclasts, which lends confidence to our method and suggests potential merit in wet-lab validation.
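The kinds of per-gene quantities the abstract names (log2 ratios, Z scores, and a check for "consistent" Up/Down behavior across a cohort) can be illustrated with a minimal Python sketch. This is not the cTAP implementation: the abstract does not define its four Z-score types, its presence/absence rule, or its machine-learning models, so the pseudocount, the single Z-score variant, and the `min_fraction` and `min_abs` thresholds below are all hypothetical choices made only for illustration.

```python
import math
from statistics import mean, pstdev


def log2_ratio(treated, control, pseudocount=1.0):
    """log2 fold change of treated vs. control expression.

    The pseudocount is an assumption of this sketch; it avoids log(0).
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


def z_scores(values):
    """Standardize a list of expression values (one illustrative Z-score type).

    The abstract mentions four Z-score types without defining them;
    this is simply the ordinary population Z score.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def consistent_direction(ratios, min_fraction=0.8, min_abs=1.0):
    """Label a gene 'Up' or 'Down' if enough data sets in the cohort agree.

    Both thresholds are hypothetical, not values taken from the paper.
    Returns None when neither direction reaches min_fraction.
    """
    n = len(ratios)
    if n == 0:
        return None
    up = sum(1 for r in ratios if r >= min_abs)
    down = sum(1 for r in ratios if r <= -min_abs)
    if up / n >= min_fraction:
        return "Up"
    if down / n >= min_fraction:
        return "Down"
    return None


# One gene measured in five hypothetical data sets: (treated, control) pairs.
pairs = [(30.0, 7.0), (50.0, 10.0), (15.0, 3.0), (9.0, 1.0), (40.0, 12.0)]
ratios = [log2_ratio(t, c) for t, c in pairs]
label = consistent_direction(ratios)
```

In a cohort setting, features like these would be computed per gene and per data set and then passed to whatever classifiers are chosen; the rule-based `consistent_direction` here only stands in for that learned step.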
Pages: 164-167
Page count: 4