Inferring Phylogenies from RAD Sequence Data

Cited by: 251
Authors
Rubin, Benjamin E. R. [1, 2]
Ree, Richard H. [3]
Moreau, Corrie S. [2]
Affiliations
[1] Univ Chicago, Comm Evolutionary Biol, Chicago, IL 60637 USA
[2] Field Museum Nat Hist, Dept Zool, Chicago, IL 60605 USA
[3] Field Museum Nat Hist, Dept Bot, Chicago, IL 60605 USA
Source
PLOS ONE | 2012 / Vol. 7 / Issue 04
Funding
National Science Foundation (USA);
Keywords
MAXIMUM-LIKELIHOOD; DIVERGENCE TIMES; EVOLUTIONARY; IDENTIFICATION; MAMMALS; MARKERS; GENES; TREES; LOCI;
DOI
10.1371/journal.pone.0033394
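Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract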
Reduced-representation genome sequencing represents a new source of data for systematics, and its potential utility in interspecific phylogeny reconstruction has not yet been explored. One approach that seems especially promising is the use of inexpensive short-read technologies (e.g., Illumina, SOLiD) to sequence restriction-site associated DNA (RAD), the regions of the genome that flank the recognition sites of restriction enzymes. In this study, we simulated the collection of RAD sequences from sequenced genomes of different taxa (Drosophila, mammals, and yeasts) and developed a proof-of-concept workflow to test whether informative data could be extracted and used to accurately reconstruct "known" phylogenies of species within each group. The workflow consists of three basic steps: first, sequences are clustered by similarity to estimate orthology; second, clusters are filtered by taxonomic coverage; and third, they are aligned and concatenated for "total evidence" phylogenetic analysis. We evaluated the performance of clustering and filtering parameters by comparing the resulting topologies with well-supported reference trees, and we were able to identify conditions under which the reference tree was inferred with high support. For Drosophila, whole genome alignments allowed us to directly evaluate which parameters most consistently recovered orthologous sequences. For the parameter ranges explored, we recovered the best results at the low ends of sequence similarity and taxonomic representation of loci; these generated the largest supermatrices with the highest proportion of missing data. Applications of the method to mammals and yeasts were less successful, which we suggest may be due partly to their much deeper evolutionary divergence times compared to Drosophila (crown ages of approximately 100 and 300 versus 60 Mya, respectively).
RAD sequences thus appear to hold promise for reconstructing phylogenetic relationships in younger clades in which sufficient numbers of orthologous restriction sites are retained across species.
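The second and third workflow steps described in the abstract (filtering clusters by taxonomic coverage, then concatenating them into a supermatrix) can be illustrated with a minimal Python sketch. This is not the authors' code: the data layout (one dict per cluster, mapping taxon name to an already-aligned sequence), the function names, and the use of "?" for missing data are all assumptions made for illustration, and the clustering and alignment steps are taken as already done.

```python
from typing import Dict, List

# Hypothetical layout: each cluster maps a taxon name to its aligned sequence.
Cluster = Dict[str, str]


def filter_by_coverage(clusters: List[Cluster], min_taxa: int) -> List[Cluster]:
    """Keep only clusters that contain sequences from at least `min_taxa` taxa."""
    return [c for c in clusters if len(c) >= min_taxa]


def concatenate(clusters: List[Cluster], taxa: List[str]) -> Dict[str, str]:
    """Join clusters into one supermatrix, padding absent taxa with '?'."""
    parts: Dict[str, List[str]] = {t: [] for t in taxa}
    for cluster in clusters:
        lengths = {len(seq) for seq in cluster.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in a cluster must be aligned to equal length")
        width = lengths.pop()
        for t in taxa:
            # A taxon with no sequence in this cluster contributes missing data.
            parts[t].append(cluster.get(t, "?" * width))
    return {t: "".join(p) for t, p in parts.items()}


def missing_fraction(matrix: Dict[str, str]) -> float:
    """Proportion of supermatrix cells that are missing ('?')."""
    total = sum(len(s) for s in matrix.values())
    missing = sum(s.count("?") for s in matrix.values())
    return missing / total if total else 0.0


# Toy example with three taxa and three clusters (made-up sequences).
taxa = ["A", "B", "C"]
clusters = [
    {"A": "ACGT", "B": "ACGA", "C": "ACGT"},
    {"A": "GG", "B": "GC"},
    {"C": "TTT"},
]
kept = filter_by_coverage(clusters, min_taxa=2)
supermatrix = concatenate(kept, taxa)
```

In this sketch, lowering `min_taxa` admits more clusters and therefore produces a larger supermatrix with more missing cells, which is the trade-off the abstract reports for the low end of taxonomic representation.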
Pages: 12
Related papers
50 in total
  • [32] Prokaryotic Phylogenies Inferred from Whole-Genome Sequence and Annotation Data
    Du, Wei
    Cao, Zhongbo
    Wang, Yan
    Sun, Ying
    Blanzieri, Enrico
    Liang, Yanchun
    BIOMED RESEARCH INTERNATIONAL, 2013, 2013
  • [33] Inferring Social Network Structure from Bacterial Sequence Data
    Plucinski, Mateusz M.
    Starfield, Richard
    Almeida, Rodrigo P. P.
    PLOS ONE, 2011, 6 (08):
  • [35] Inferring Tumor Phylogenies from Multi-region Sequencing
    Hu, Zheng
    Curtis, Christina
    CELL SYSTEMS, 2016, 3 (01) : 12 - 14
  • [36] Inferring phylogenies from pandemic-scale genome datasets
    De Maio, Nicola
    Goldman, Nick
    NATURE GENETICS, 2023, 55 (05) : 734 - 735
  • [38] Power and pitfalls of computational methods for inferring clone phylogenies and mutation orders from bulk sequencing data
    Miura, Sayaka
    Vu, Tracy
    Deng, Jiamin
    Buturla, Tiffany
    Oladeinde, Olumide
    Choi, Jiyeong
    Kumar, Sudhir
    SCIENTIFIC REPORTS, 2020, 10 (01)
  • [39] Inferring species phylogenies: A microarray approach
    Han, Xiaoxu
    COMPUTATIONAL INTELLIGENCE AND BIOINFORMATICS, PT 3, PROCEEDINGS, 2006, 4115 : 485 - 493
  • [40] On the desirability of models for inferring genome phylogenies
    McInerney, JO
    TRENDS IN MICROBIOLOGY, 2006, 14 (01) : 1 - 2