MAGIC: A tool for predicting transcription factors and cofactors driving gene sets using ENCODE data

Cited by: 0

Authors:

Roopra, Avtar [1]

Affiliations:

[1] Univ Wisconsin Madison, Dept Neurosci, 5507 WIMR, Madison, WI 53706 USA

Source:

PLOS COMPUTATIONAL BIOLOGY | 2020 / Volume 16 / Issue 04

Keywords:

CTCF; DEACETYLASE; EXPRESSION; REPRESSION; BINDING; GROWTH;

DOI:

10.1371/journal.pcbi.1007800; 10.1371/journal.pcbi.1007800.r001; 10.1371/journal.pcbi.1007800.r002; 10.1371/journal.pcbi.1007800.r003; 10.1371/journal.pcbi.1007800.r004; 10.1371/journal.pcbi.1007800.r005; 10.1371/journal.pcbi.1007800.r006

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification code:

071010 ; 081704 ;

Abstract:

Transcriptomic profiling is an immensely powerful hypothesis-generating tool. However, accurately predicting the transcription factors (TFs) and cofactors that drive transcriptomic differences between samples is challenging. A number of algorithms draw on ChIP-seq tracks to define TFs and cofactors behind gene changes. These approaches assign TFs and cofactors to genes via a binary designation of 'target' or 'non-target', followed by Fisher Exact Tests to assess enrichment of TFs and cofactors. ENCODE archives 2314 ChIP-seq tracks of 684 TFs and cofactors assayed across 117 human cell lines under a multitude of growth and maintenance conditions. The algorithm presented herein, Mining Algorithm for GenetIc Controllers (MAGIC), uses ENCODE ChIP-seq data to look for statistical enrichment of TFs and cofactors in gene bodies and flanking regions in gene lists without an a priori binary classification of genes as targets or non-targets. When compared to other TF mining resources, MAGIC displayed favourable performance in predicting TFs and cofactors that drive gene changes in 4 settings: 1) a cell line expressing or lacking a single TF; 2) breast tumors divided along PAM50 designations; 3) whole brain samples from WT mice or mice lacking a single TF in a particular neuronal subtype; 4) single-cell RNA-seq analysis of neurons divided by Immediate Early Gene expression levels. In summary, MAGIC is a standalone application that produces meaningful predictions of TFs and cofactors in transcriptomic experiments.

Author summary:

Key to the control of gene expression is the level of transcript in the cell. This level is controlled in large part by transcription factors (TFs) and cofactors. TFs are DNA-binding proteins that recognize specific sequence elements to control levels of gene activity. TFs recruit cofactors that do not themselves bind DNA but are brought to promoters via TFs to either enhance or repress gene expression. TFs and cofactors are thus key regulators of transcript levels. It is now routine to obtain the expression levels of every gene transcript in the genome, i.e. whole transcriptome data. Understanding how the transcriptome is controlled is challenging. Herein, a method is described that predicts which Factors organize and control sets of genes. The algorithm is termed Mining Algorithm for GenetIc Controllers (MAGIC). MAGIC uses data derived from ChIP-seq tracks archived at ENCODE to decipher which Factors are most likely to preferentially bind lists of genes that are altered from one biological state to another. MAGIC circumvents the principal confounds of current methods to identify Factors and will aid in the discovery of organizing principles behind large scale gene changes seen in physiology and disease.
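The contrast drawn in the abstract, between a binary 'target'/'non-target' call followed by a Fisher Exact Test and an enrichment test that avoids that call, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the ChIP signal values are invented toy numbers, the threshold is arbitrary, and the two-sample Kolmogorov-Smirnov statistic is only one possible threshold-free comparison. The abstract does not state which statistic MAGIC itself uses, so this should not be read as the published implementation.

```python
from bisect import bisect_right
from math import comb


def fisher_enrichment_p(list_targets, list_size, total_targets, total_genes):
    """One-sided Fisher exact (hypergeometric tail) p-value for over-representation.

    Probability of seeing at least `list_targets` targets in a gene list of
    `list_size` genes drawn from `total_genes` genes, of which `total_targets`
    are called targets.
    """
    max_k = min(list_size, total_targets)
    denom = comb(total_genes, list_size)
    tail = sum(
        comb(total_targets, k) * comb(total_genes - total_targets, list_size - k)
        for k in range(list_targets, max_k + 1)
    )
    return tail / denom


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical ChIP signal per gene for one factor (toy values, not ENCODE data).
list_signal = [0.9, 1.4, 2.2, 3.1, 0.7]            # genes in the query list
background_signal = [0.2, 0.5, 0.8, 1.1, 0.3,
                     0.6, 1.9, 0.4, 0.1, 1.0]      # all other genes

# Binary approach: call a gene a 'target' if its signal passes an arbitrary cutoff.
threshold = 1.0  # assumed cutoff; the result depends on this choice
list_hits = sum(s >= threshold for s in list_signal)
background_hits = sum(s >= threshold for s in background_signal)
p_binary = fisher_enrichment_p(
    list_hits,
    len(list_signal),
    list_hits + background_hits,
    len(list_signal) + len(background_signal),
)

# Threshold-free approach: compare the full signal distributions directly.
d_stat = ks_statistic(list_signal, background_signal)

print(f"Fisher p (threshold={threshold}): {p_binary:.4f}")
print(f"KS statistic (no threshold): {d_stat:.4f}")
```

Changing `threshold` changes which genes count as targets and therefore the Fisher p-value, whereas the distribution comparison uses every signal value as measured. In practice, library routines such as `scipy.stats.fisher_exact` and `scipy.stats.ks_2samp` would normally replace the hand-written helpers; the standard-library versions are used here only to keep the sketch self-contained.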


Pages: 20

