MAGIC: A tool for predicting transcription factors and cofactors driving gene sets using ENCODE data

Cited by: 0

Authors:

Roopra, Avtar [1]

Affiliations:

[1] Univ Wisconsin Madison, Dept Neurosci, 5507 WIMR, Madison, WI 53706 USA

Source:

PLOS COMPUTATIONAL BIOLOGY | 2020 / Vol. 16 / Issue 04

Keywords:

CTCF; DEACETYLASE; EXPRESSION; REPRESSION; BINDING; GROWTH;

DOI:

10.1371/journal.pcbi.1007800; 10.1371/journal.pcbi.1007800.r001; 10.1371/journal.pcbi.1007800.r002; 10.1371/journal.pcbi.1007800.r003; 10.1371/journal.pcbi.1007800.r004; 10.1371/journal.pcbi.1007800.r005; 10.1371/journal.pcbi.1007800.r006

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Transcriptomic profiling is an immensely powerful hypothesis-generating tool. However, accurately predicting the transcription factors (TFs) and cofactors that drive transcriptomic differences between samples is challenging. A number of algorithms draw on ChIP-seq tracks to define TFs and cofactors behind gene changes. These approaches assign TFs and cofactors to genes via a binary designation of 'target' or 'non-target', followed by Fisher Exact Tests to assess enrichment of TFs and cofactors. ENCODE archives 2314 ChIP-seq tracks of 684 TFs and cofactors assayed across 117 human cell lines under a multitude of growth and maintenance conditions. The algorithm presented herein, Mining Algorithm for GenetIc Controllers (MAGIC), uses ENCODE ChIP-seq data to look for statistical enrichment of TFs and cofactors in gene bodies and flanking regions in gene lists without an a priori binary classification of genes as targets or non-targets. When compared to other TF mining resources, MAGIC displayed favourable performance in predicting TFs and cofactors that drive gene changes in 4 settings: 1) a cell line expressing or lacking a single TF; 2) breast tumors divided along PAM50 designations; 3) whole brain samples from WT mice or mice lacking a single TF in a particular neuronal subtype; 4) single-cell RNA-seq analysis of neurons divided by Immediate Early Gene expression levels. In summary, MAGIC is a standalone application that produces meaningful predictions of TFs and cofactors in transcriptomic experiments.

Author summary:

Key to the control of gene expression is the level of transcript in the cell. This level is controlled in large part by transcription factors (TFs) and cofactors. TFs are DNA-binding proteins that recognize specific sequence elements to control levels of gene activity. TFs recruit cofactors that do not themselves bind DNA but are brought to promoters via TFs to either enhance or repress gene expression. TFs and cofactors are thus key regulators of transcript levels. It is now routine to obtain the expression levels of every gene transcript in the genome, i.e., whole transcriptome data. Understanding how the transcriptome is controlled is challenging. Herein, a method is described that predicts which Factors organize and control sets of genes. The algorithm is termed Mining Algorithm for GenetIc Controllers (MAGIC). MAGIC uses data derived from ChIP-seq tracks archived at ENCODE to decipher which Factors are most likely to preferentially bind lists of genes that are altered from one biological state to another. MAGIC circumvents the principal confounds of current methods to identify Factors and will aid in the discovery of organizing principles behind large-scale gene changes seen in physiology and disease.
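Illustrative note: the abstract contrasts MAGIC with approaches that label each gene as a 'target' or 'non-target' of a Factor and then apply Fisher Exact Tests. The sketch below shows what such a binary enrichment test might look like. It is not MAGIC's own statistic, which this record does not describe. The function name, the argument names, and the gene counts are all hypothetical, and the one-sided test is computed directly from the hypergeometric distribution using only the Python standard library.

```python
from math import comb


def fisher_enrichment_p(hits_in_list, list_size, hits_in_background, background_size):
    """One-sided Fisher exact p-value for over-representation.

    Assumes the gene list is a subset of the background. Returns the
    probability of seeing at least `hits_in_list` target genes when
    `list_size` genes are drawn at random from a background of
    `background_size` genes containing `hits_in_background` targets.
    """
    max_hits = min(list_size, hits_in_background)
    non_targets = background_size - hits_in_background
    # Sum the hypergeometric tail P(X >= hits_in_list) with exact integers.
    tail = sum(
        comb(hits_in_background, k) * comb(non_targets, list_size - k)
        for k in range(hits_in_list, max_hits + 1)
    )
    return tail / comb(background_size, list_size)


# Hypothetical example: 30 of 100 changed genes are called 'targets' of a
# Factor that is assigned to 2000 of 20000 background genes.
p_value = fisher_enrichment_p(30, 100, 2000, 20000)
print(f"one-sided p-value: {p_value:.3g}")
```

Under a test of this kind, the result depends on the threshold used to call a gene a 'target', which is the a priori binary classification the abstract says MAGIC avoids.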

Pages: 20

50 items in total

[21] Predicting master transcription factors from pan-cancer expression data
Reddy, Jessica
Fonseca, Marcos A. S.
Corona, Rosario, I
Nameki, Robbin
Dezem, Felipe Segato
Klein, Isaac A.
Chang, Heidi
Chaves-Moreira, Daniele
Afeyan, Lena K.
Malta, Tathiane M.
Lin, Xianzhi
Abbasi, Forough
Font-Tello, Alba
Sabedot, Thais
Cejas, Paloma
Rodriguez-Malave, Norma
Seo, Ji-Heui
Lin, De-Chen
Matulonis, Ursula
Karlan, Beth Y.
Gayther, Simon A.
Pasaniuc, Bogdan
Gusev, Alexander
Noushmehr, Houtan
Long, Henry
Freedman, Matthew L.
Drapkin, Ronny
Young, Richard A.
Abraham, Brian J.
Lawrenson, Kate
SCIENCE ADVANCES, 2021, 7 (48):
[22] Predicting Hazardous Events in Work Zones Using Naturalistic Driving Data
Chang, Yohan
Edara, Praveen
2017 IEEE 20TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2017,
[23] Predicting hyperosmolality-inducible transcription factors using MEME tools
Kim, Chanhee
Haworth, Lorna
Fu, Yuhan
Kultz, Dietmar
FASEB JOURNAL, 2021, 35
[24] Predicting Preference of Transcription Factors for Methylated DNA Using Sequence Information
Liu, Meng-Lu
Su, Wei
Wang, Jia-Shu
Yang, Yu-He
Yang, Hui
Lin, Hao
MOLECULAR THERAPY NUCLEIC ACIDS, 2020, 22 : 1043 - 1050
[25] Predicting hyperosmolality-inducible transcription factors using MEME tools
Kim, C.
Kultz, D.
INTEGRATIVE AND COMPARATIVE BIOLOGY, 2023, 62 : S169 - S169
[26] Predicting proteome dynamics using gene expression data
Kuchta, Krzysztof
Towpik, Joanna
Biernacka, Anna
Kutner, Jan
Kudlicki, Andrzej
Ginalski, Krzysztof
Rowicka, Maga
SCIENTIFIC REPORTS, 2018, 8
[27] Predicting gene dosage using genomic sequence data
Barker, Jocelyn Elaine
Sherlock, Gavin
Hartman, James
Morgan, William
FASEB JOURNAL, 2008, 22
[28] Predicting proteome dynamics using gene expression data
Krzysztof Kuchta
Joanna Towpik
Anna Biernacka
Jan Kutner
Andrzej Kudlicki
Krzysztof Ginalski
Maga Rowicka
Scientific Reports, 8
[29] Predicting and deciphering preventive gene sets using gene expression data and protein encoded sequences of Genistein treated PC-3 cells
Laslo, R
Rowland, I
Klocker, H
Hancok, RL
Pardini, RS
Baba, AI
CANCER EPIDEMIOLOGY BIOMARKERS & PREVENTION, 2005, 14 (11) : 2698S - 2698S
[30] Gene function analysis in complex data sets using ErmineJ
Jesse Gillis
Meeta Mistry
Paul Pavlidis
Nature Protocols, 2010, 5 : 1148 - 1159
