cTAP: A Machine Learning Framework for Predicting Target Genes of a Transcription Factor using a Cohort of Gene Expression Data Sets

Cited by: 1
Authors
Wang, Honglin [1]
Joshi, Pujan [1]
Hong, Seung-Hyun [1]
Maye, Peter F. [2]
Rowe, David W. [2]
Shin, Dong-Guk [1]
Affiliations
[1] Univ Connecticut, Comp Sci & Engn Dept, Storrs, CT 06269 USA
[2] Univ Connecticut, Dept Reconstruct Sci, Hlth Ctr, Farmington, CT 06030 USA
Keywords
TF target analysis; Cohort analysis; Osteoclast differentiation; Gene abundance analysis; Machine learning
DOI
10.1109/BIBM49941.2020.9313303
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Identifying target genes of a transcription factor is crucial in biomedical research. Thanks to ChIP-seq technology, scientists can estimate potential genome-wide target genes of a transcription factor. However, finding the consistently behaving Up/Down targets of a transcription factor in a given biological context is difficult because it requires analysis of a large number of studies under the same or a comparable context. We present a transcription target prediction method called the Cohort-based TF target prediction system (cTAP). This method assumes that the pathway involving the transcription factor of interest features multiple functional groups of marker genes pertaining to the biological process of concern. It uses the notion of gene-presence and gene-absence in addition to log2 ratios of gene expression values for the prediction. Target prediction is made by applying multiple machine-learning models that learn the patterns of gene-presence and gene-absence from the log2 ratio and four types of Z scores derived from the normalized cohort's gene expression data. The learned patterns are then associated with the putative targets of the transcription factor to elicit genes exhibiting Up/Down gene regulation patterns "consistently" within the cohort. A total of 11 publicly available GEO data sets related to osteoclastogenesis are used in our experiment. The models learned using gene-presence and gene-absence predict target genes, such as CASP1, BID, and IRF5, that differ from those obtained using log2 ratios alone. Our literature survey reveals that all these predicted targets have known roles in bone remodeling, specifically related to immunity and osteoclasts, suggesting confidence in our method and potential merit in wet-lab validation.
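The abstract names its inputs (log2 ratios and Z scores over a cohort) but does not define them, so the following is only a minimal sketch of how such features might be computed, not the cTAP implementation. The pseudocount, the single Z-score variant (the paper uses four unspecified types), the thresholds, and the function names `log2_ratio`, `z_scores`, and `consistent_direction` are all assumptions made for illustration, and the simple voting rule stands in for the paper's machine-learning models.

```python
import math
from statistics import mean, pstdev


def log2_ratio(treated, control, pseudocount=1.0):
    """Log2 fold change of one gene in one data set.

    The pseudocount is an assumption; it avoids log(0) and division by
    zero when a gene has zero expression.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


def z_scores(values):
    """Standardize values to zero mean and unit (population) variance.

    This is one generic Z score; the abstract does not say how its four
    Z-score types are defined.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def consistent_direction(ratios, threshold=1.0, min_fraction=0.8):
    """Label a gene 'Up' or 'Down' if enough data sets in the cohort agree.

    A hypothetical voting rule used here as a stand-in for the learned
    models; threshold and min_fraction are illustrative values.
    """
    n = len(ratios)
    if n == 0:
        return None
    up = sum(r >= threshold for r in ratios)
    down = sum(r <= -threshold for r in ratios)
    if up / n >= min_fraction:
        return "Up"
    if down / n >= min_fraction:
        return "Down"
    return None


# Hypothetical (treated, control) expression values for one gene
# across a cohort of three data sets.
cohort = [(7.0, 1.0), (15.0, 3.0), (30.0, 2.0)]
ratios = [log2_ratio(t, c) for t, c in cohort]
label = consistent_direction(ratios)
```

In this sketch a gene is reported only when its direction of change agrees across most of the cohort, which is the intuition behind "consistently" behaving targets; the paper's own models additionally learn gene-presence and gene-absence patterns.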
Pages: 164-167
Page count: 4