A binary search approach to whole-genome data analysis

Cited by: 7
Authors
Brodsky, Leonid [1 ]
Kogan, Simon [1 ]
Ben-Jacob, Eshel [2 ]
Nevo, Eviatar [1 ]
Affiliations
[1] Univ Haifa, Inst Evolut, IL-31905 Haifa, Israel
[2] Tel Aviv Univ, Sch Phys & Astron, IL-69978 Tel Aviv, Israel
Keywords
genome segmentation; tiling array; next-generation sequencing; MODEL-BASED ANALYSIS; TILING MICROARRAY; CHIP-SEQ; MAP;
DOI
10.1073/pnas.1011134107
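Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code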
07 ; 0710 ; 09 ;
Abstract
A binary search-like algorithm, originally oriented toward sequence analysis, was transformed into a sensitive and accurate tool for processing whole-genome data. Its advantage over previous methods is that it detects, with equal accuracy, the margins of both short and long genome fragments enriched in up-regulated signals. The score of an enriched genome fragment reflects the difference between the actual concentration of up-regulated signals in the fragment and the chromosome signal baseline. The "divide-and-conquer"-type algorithm detects a series of nonintersecting fragments of various lengths with locally optimal scores. The procedure is applied to the detected fragments in a nested manner by recalculating the lower-than-baseline signals in the chromosome. The algorithm was applied to simulated whole-genome data, and its sensitivity and specificity were compared with those of several alternative algorithms. The algorithm was also tested on four biological tiling array datasets: Arabidopsis (i) expression and (ii) histone 3 lysine 27 trimethylation ChIP-on-chip datasets, and Saccharomyces cerevisiae (iii) spliced intron data and (iv) chromatin remodeling factor binding sites. The results demonstrate the power of the algorithm to identify, at their precise genome margins, both short up-regulated fragments (such as exons and transcription factor binding sites) and long zones, even moderately up-regulated ones. The algorithm generates an accurate whole-genome landscape that could be used for cross-comparison of signals across the same genome in evolutionary and general genomic studies.
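The scoring and divide-and-conquer ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes the score of a fragment is simply the sum of (signal minus baseline) over its probes, finds the best-scoring fragment with a standard maximum-sum scan, and then recurses into the regions to its left and right so that the reported fragments never intersect. The nested reanalysis of detected fragments described in the abstract is omitted. The names `best_segment`, `enriched_fragments`, `baseline`, and `min_score` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Fragment = Tuple[int, int, float]  # (start, end_exclusive, score)


def best_segment(x: Sequence[float], lo: int, hi: int,
                 baseline: float) -> Fragment:
    """Return the contiguous fragment of x[lo:hi] with the highest score.

    Assumed score (illustrative only): sum of (x[i] - baseline).
    Uses a single left-to-right maximum-sum scan; requires lo < hi.
    """
    best_score, best_start, best_end = float("-inf"), lo, lo
    run, start = 0.0, lo
    for i in range(lo, hi):
        if run <= 0:            # a non-positive prefix cannot help; restart
            run, start = 0.0, i
        run += x[i] - baseline
        if run > best_score:
            best_score, best_start, best_end = run, start, i + 1
    return best_start, best_end, best_score


def enriched_fragments(x: Sequence[float], baseline: float, min_score: float,
                       lo: int = 0, hi: int = None) -> List[Fragment]:
    """Divide and conquer: report the best fragment in x[lo:hi], then
    search the regions on either side of it, so that the reported
    fragments are nonintersecting and sorted by position."""
    if hi is None:
        hi = len(x)
    if lo >= hi:
        return []
    start, end, score = best_segment(x, lo, hi, baseline)
    if score < min_score:       # nothing sufficiently enriched here
        return []
    left = enriched_fragments(x, baseline, min_score, lo, start)
    right = enriched_fragments(x, baseline, min_score, end, hi)
    return left + [(start, end, score)] + right


if __name__ == "__main__":
    # Toy signal: one short strong fragment and one weaker fragment.
    signal = [0, 0, 5, 5, 5, 0, 0, 0, 0, 0, 3, 3, 0]
    for frag in enriched_fragments(signal, baseline=1.0, min_score=3.0):
        print(frag)
```

In this toy setting the baseline is passed in as a constant; the abstract describes it as a chromosome-level signal baseline, and how it is estimated is not specified there.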
Pages: 16893-16898
Page count: 6