Is this the right normalization? A diagnostic tool for ChIP-seq normalization

Cited by: 8
Authors
Angelini, Claudia [1]
Heller, Ruth [2]
Volkinshtein, Rita [2]
Yekutieli, Daniel [2]
Affiliations
[1] Ist Applicaz Calcolo Mauro Picone, I-80131 Naples, Italy
[2] Tel Aviv Univ, Dept Stat & Operat Res, IL-69978 Tel Aviv, Israel
Source
BMC Bioinformatics | 2015 / Volume 16
Funding
Israel Science Foundation
Keywords
ChIP-seq; Diagnostic plots; Normalization; Transcription factor binding; Protein-DNA interactions; Human genome; Chromatin; Identification; Domains; Design
DOI
10.1186/s12859-015-0579-z
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: ChIP-seq experiments are becoming a standard approach for genome-wide profiling of protein-DNA interactions, such as detecting transcription factor binding sites, histone modification marks and RNA Polymerase II occupancy. However, when comparing a ChIP sample against a control sample, such as Input DNA, normalization procedures have to be applied to remove experimental sources of bias. Despite the substantial impact that the choice of normalization method can have on the results of a ChIP-seq data analysis, the assessment of these methods has not been fully explored in the literature. In particular, there are no diagnostic tools that show whether the applied normalization is indeed appropriate for the data being analyzed.
Results: In this work we propose a novel diagnostic tool to examine the appropriateness of the estimated normalization procedure. The tool plots the empirical densities of the log relative risks in bins of equal read count, together with the logarithm of the estimated normalization constant, so that the researcher can assess whether the estimated constant is appropriate. We use the diagnostic plot to evaluate the estimates obtained by CisGenome, NCIS and CCAT on several real data examples. Moreover, we show the impact that the choice of the normalization constant can have on standard peak-calling tools such as MACS or SICER. Finally, we propose a novel procedure for controlling the FDR using sample swapping. This procedure uses the estimated normalization constant to gain power over the naive choice of constant (used in MACS and SICER), which is the ratio of the total number of reads in the ChIP and Input samples.
Conclusions: Linear normalization approaches aim to estimate a scale factor, r, to adjust for different sequencing depths when comparing ChIP versus Input samples. The estimated scaling factor can easily be incorporated in many peak-calling algorithms to improve the accuracy of peak identification. The diagnostic plot proposed in this paper can be used to assess how adequate ChIP/Input normalization constants are, and thus it allows the user to choose the most adequate estimate for the analysis.
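As a rough illustration of the quantities the abstract refers to, the sketch below computes the naive scale factor (ratio of total ChIP to total Input reads), a background-only alternative, and the per-bin log ratios grouped by total read count that a diagnostic plot of this kind would display. It is not the authors' implementation: the toy counts, the function names, the `max_total` cutoff used to pick background bins, and the 0.5 pseudocount are all assumptions made for illustration, and the background estimate is only a simplified stand-in for methods such as NCIS.

```python
import math
from collections import defaultdict


def naive_scale(chip, inp):
    """Naive constant: total ChIP reads divided by total Input reads."""
    return sum(chip) / sum(inp)


def background_scale(chip, inp, max_total):
    """Assumed simplification: estimate r only from bins whose total
    read count (ChIP + Input) is at most max_total, treating them as
    background (non-enriched) bins."""
    pairs = [(c, i) for c, i in zip(chip, inp) if c + i <= max_total]
    return sum(c for c, _ in pairs) / sum(i for _, i in pairs)


def log_ratios_by_total(chip, inp, pseudo=0.5):
    """Group bins by total read count and collect log(ChIP / Input)
    per group; the pseudocount (an assumption) avoids log(0)."""
    groups = defaultdict(list)
    for c, i in zip(chip, inp):
        total = c + i
        if total > 0:
            groups[total].append(math.log((c + pseudo) / (i + pseudo)))
    return dict(groups)


# Hypothetical per-bin read counts; two bins are strongly enriched.
chip = [4, 6, 5, 40, 3, 7, 55, 5]
inp = [2, 3, 3, 2, 2, 3, 3, 2]

r_naive = naive_scale(chip, inp)
r_background = background_scale(chip, inp, max_total=10)
groups = log_ratios_by_total(chip, inp)

print("naive r:", r_naive, "log:", math.log(r_naive))
print("background r:", r_background, "log:", math.log(r_background))
for total in sorted(groups):
    print(total, [round(x, 2) for x in groups[total]])
```

In this toy data the two enriched bins inflate the naive constant well above the background-based one. Comparing log r with where the log ratios of low-count bins concentrate is the kind of visual check the abstract describes.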
Pages: 15