MAGIC: A tool for predicting transcription factors and cofactors driving gene sets using ENCODE data

Cited by: 0
Authors
Roopra, Avtar [1]
Affiliations
[1] Univ Wisconsin Madison, Dept Neurosci, 5507 WIMR, Madison, WI 53706 USA
Keywords
CTCF; DEACETYLASE; EXPRESSION; REPRESSION; BINDING; GROWTH
DOI
10.1371/journal.pcbi.1007800; 10.1371/journal.pcbi.1007800.r001; 10.1371/journal.pcbi.1007800.r002; 10.1371/journal.pcbi.1007800.r003; 10.1371/journal.pcbi.1007800.r004; 10.1371/journal.pcbi.1007800.r005; 10.1371/journal.pcbi.1007800.r006
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Transcriptomic profiling is an immensely powerful hypothesis-generating tool. However, accurately predicting the transcription factors (TFs) and cofactors that drive transcriptomic differences between samples is challenging. A number of algorithms draw on ChIP-seq tracks to define the TFs and cofactors behind gene changes. These approaches assign TFs and cofactors to genes via a binary designation of 'target' or 'non-target', followed by Fisher Exact Tests to assess enrichment of TFs and cofactors. ENCODE archives 2314 ChIP-seq tracks of 684 TFs and cofactors assayed across 117 human cell lines under a multitude of growth and maintenance conditions. The algorithm presented herein, Mining Algorithm for GenetIc Controllers (MAGIC), uses ENCODE ChIP-seq data to look for statistical enrichment of TFs and cofactors in gene bodies and flanking regions in gene lists without an a priori binary classification of genes as targets or non-targets. When compared to other TF mining resources, MAGIC displayed favourable performance in predicting TFs and cofactors that drive gene changes in 4 settings: 1) a cell line expressing or lacking a single TF; 2) breast tumors divided along PAM50 designations; 3) whole brain samples from WT mice or mice lacking a single TF in a particular neuronal subtype; and 4) single cell RNAseq analysis of neurons divided by Immediate Early Gene expression levels. In summary, MAGIC is a standalone application that produces meaningful predictions of TFs and cofactors in transcriptomic experiments.

Author summary
Key to the control of gene expression is the level of transcript in the cell. This level is controlled in large part by transcription factors (TFs) and cofactors. TFs are DNA binding proteins that recognize specific sequence elements to control levels of gene activity. TFs recruit cofactors that do not themselves bind DNA but are brought to promoters via TFs to either enhance or repress gene expression. TFs and cofactors are thus key regulators of transcript levels. It is now routine to obtain the expression levels of every gene transcript in the genome, i.e. whole transcriptome data. Understanding how the transcriptome is controlled is challenging. Herein, a method is described that predicts which Factors organize and control sets of genes. The algorithm is termed Mining Algorithm for GenetIc Controllers (MAGIC). MAGIC uses data derived from ChIPseq tracks archived at ENCODE to decipher which Factors are most likely to preferentially bind lists of genes that are altered from one biological state to another. MAGIC circumvents the principal confounds of current methods to identify Factors and will aid in the discovery of organizing principles behind large scale gene changes seen in physiology and disease.
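
For orientation, the conventional binary 'target'/'non-target' enrichment test that the abstract contrasts MAGIC with can be sketched as a one-sided Fisher exact (hypergeometric) test. This is a minimal illustration of that baseline, not MAGIC's own procedure; the function name, its parameters, and the gene counts in the example are all hypothetical.

```python
from math import comb


def fisher_enrichment_p(list_targets, list_size, background_targets, background_size):
    """One-sided Fisher exact p-value for over-representation.

    Returns P(X >= list_targets), where X is the number of TF 'targets'
    found when drawing list_size genes at random, without replacement,
    from a background of background_size genes that contains
    background_targets targets in total.
    All argument names are illustrative assumptions.
    """
    # A draw cannot contain more targets than the list size or the target pool.
    max_k = min(list_size, background_targets)
    non_targets = background_size - background_targets
    # Number of ways to draw exactly k targets, summed over the upper tail.
    tail = sum(
        comb(background_targets, k) * comb(non_targets, list_size - k)
        for k in range(list_targets, max_k + 1)
    )
    return tail / comb(background_size, list_size)


# Hypothetical counts: 40 of 200 changed genes are called 'targets' of a TF
# that has 1500 called targets among 20000 background genes.
p_value = fisher_enrichment_p(40, 200, 1500, 20000)
print(f"enrichment p-value: {p_value:.3g}")
```

The result depends entirely on where the target/non-target line is drawn for each gene, which is the a priori binary classification the abstract says MAGIC avoids.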
Pages: 20