Computational prediction and characterization of cell-type-specific and shared binding sites

Cited by: 7
Authors
Zhang, Qinhu [1, 2]
Teng, Pengrui [3]
Wang, Siguo [4]
He, Ying [4]
Cui, Zhen [4]
Guo, Zhenghao [4]
Liu, Yixin [5]
Yuan, Changan [6]
Liu, Qi [1, 2]
Huang, De-Shuang [7]
Affiliations
[1] Tongji Univ, Translat Med Ctr Stem Cell Therapy, Shanghai 200092, Peoples R China
[2] Tongji Univ, Shanghai East Hosp, Inst Regenerat Med, Sch Life Sci & Technol,Bioinformat Dept, Shanghai 200092, Peoples R China
[3] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Peoples R China
[4] Tongji Univ, Inst Machine Learning & Syst Biol, Sch Elect & Informat Engn, Shanghai 201804, Peoples R China
[5] Univ Shanghai Sci & Technol, Sch Hlth Sci & Engn, Shanghai 200093, Peoples R China
[6] Guangxi Acad Sci, Big Data & Intelligent Comp Res Ctr, Nanning 530007, Peoples R China
[7] EIT Inst Adv Study, Ningbo 315201, Zhejiang, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
CHIP-SEQ; DNA; SEQUENCE; REVEALS;
DOI
10.1093/bioinformatics/btac798
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Cell-type-specific gene expression is maintained in large part by transcription factors (TFs) selectively binding to distinct sets of sites in different cell types. Recent studies have provided evidence that such cell-type-specific binding is determined by a TF's intrinsic sequence preferences, cooperative interactions with co-factors, cell-type-specific chromatin landscapes and 3D chromatin interactions. However, the computational prediction and characterization of cell-type-specific and shared binding sites has rarely been studied.
Results: In this article, we propose two computational approaches for predicting and characterizing cell-type-specific and shared binding sites by integrating multiple types of features: one is based on XGBoost and the other on a convolutional neural network (CNN). To validate the performance of the proposed approaches, ChIP-seq datasets of 10 binding factors were collected from the GM12878 (lymphoblastoid) and K562 (erythroleukemic) human hematopoietic cell lines, and the binding sites in each dataset were further categorized into cell-type-specific (GM12878- and K562-specific) and shared binding sites. Multiple types of features for these binding sites were then integrated to train the XGBoost- and CNN-based models. Experimental results show that the proposed approaches significantly outperform other competing methods on three classification tasks. Moreover, we identified independent feature contributions for cell-type-specific and shared sites through SHAP values, and explored the ability of the CNN-based model to predict cell-type-specific and shared binding sites when DNase signals are excluded or included. Furthermore, we investigated the generalization ability of the proposed approaches to different binding factors in the same cellular environment.
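The categorization step described in the abstract (splitting each factor's binding sites into GM12878-specific, K562-specific and shared) can be illustrated with a minimal sketch. The abstract does not state the overlap criterion the authors used, so the rule below (any overlap on the same chromosome counts as shared) is an assumption, as are the function names and the toy coordinates.

```python
from typing import Dict, List, Tuple

# A peak is (chromosome, start, end) with a half-open interval [start, end).
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def categorize(gm_peaks: List[Peak], k562_peaks: List[Peak]) -> Dict[str, List[Peak]]:
    """Split peaks into GM12878-specific, K562-specific and shared sets.

    Assumption (not from the source): a GM12878 peak is 'shared' if it
    overlaps any K562 peak; shared sites are reported using the GM12878
    coordinates. Quadratic scan, fine for a toy example only.
    """
    shared = [p for p in gm_peaks if any(overlaps(p, q) for q in k562_peaks)]
    gm_specific = [p for p in gm_peaks if not any(overlaps(p, q) for q in k562_peaks)]
    k562_specific = [q for q in k562_peaks if not any(overlaps(q, p) for p in gm_peaks)]
    return {
        "GM12878_specific": gm_specific,
        "K562_specific": k562_specific,
        "shared": shared,
    }


# Hypothetical toy peaks, not real ChIP-seq data.
gm = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
k562 = [("chr1", 150, 250), ("chr2", 300, 400)]
result = categorize(gm, k562)
for label, peaks in result.items():
    print(label, peaks)
```

On real peak files an interval tree or a sorted sweep would replace the quadratic scan, and a minimum-overlap or summit-distance threshold might be preferable to the any-overlap rule assumed here.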
Pages: 9