Inferring Phylogenies from RAD Sequence Data

Cited by: 251
Authors
Rubin, Benjamin E. R. [1 ,2 ]
Ree, Richard H. [3 ]
Moreau, Corrie S. [2 ]
Affiliations
[1] Univ Chicago, Comm Evolutionary Biol, Chicago, IL 60637 USA
[2] Field Museum Nat Hist, Dept Zool, Chicago, IL 60605 USA
[3] Field Museum Nat Hist, Dept Bot, Chicago, IL 60605 USA
Source
PLOS ONE | 2012 / Volume 7 / Issue 04
Funding
National Science Foundation (USA)
Keywords
MAXIMUM-LIKELIHOOD; DIVERGENCE TIMES; EVOLUTIONARY; IDENTIFICATION; MAMMALS; MARKERS; GENES; TREES; LOCI;
DOI
10.1371/journal.pone.0033394
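Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes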
07 ; 0710 ; 09 ;
Abstract
Reduced-representation genome sequencing represents a new source of data for systematics, and its potential utility in interspecific phylogeny reconstruction has not yet been explored. One approach that seems especially promising is the use of inexpensive short-read technologies (e.g., Illumina, SOLiD) to sequence restriction-site associated DNA (RAD), the regions of the genome that flank the recognition sites of restriction enzymes. In this study, we simulated the collection of RAD sequences from sequenced genomes of different taxa (Drosophila, mammals, and yeasts) and developed a proof-of-concept workflow to test whether informative data could be extracted and used to accurately reconstruct "known" phylogenies of species within each group. The workflow consists of three basic steps: first, sequences are clustered by similarity to estimate orthology; second, clusters are filtered by taxonomic coverage; and third, they are aligned and concatenated for "total evidence" phylogenetic analysis. We evaluated the performance of clustering and filtering parameters by comparing the resulting topologies with well-supported reference trees, and we were able to identify conditions under which the reference tree was inferred with high support. For Drosophila, whole genome alignments allowed us to directly evaluate which parameters most consistently recovered orthologous sequences. For the parameter ranges explored, we recovered the best results at the low ends of sequence similarity and taxonomic representation of loci; these generated the largest supermatrices with the highest proportion of missing data. Applications of the method to mammals and yeasts were less successful, which we suggest may be due partly to their much deeper evolutionary divergence times compared to Drosophila (crown ages of approximately 100 and 300 versus 60 Mya, respectively).
RAD sequences thus appear to hold promise for reconstructing phylogenetic relationships in younger clades in which sufficient numbers of orthologous restriction sites are retained across species.
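The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It shows only the simulated collection of RAD sequences from a genome string, the taxonomic-coverage filter, and the concatenation into a supermatrix with missing data. The function names, the read length, the `?` missing-data symbol, and the default recognition site (`CCTGCAGG`, the SbfI site, picked purely for illustration) are all assumptions. Similarity clustering and alignment are assumed to have been done already by dedicated tools, so each cluster is represented as a hypothetical mapping from taxon name to an aligned sequence.

```python
from typing import Dict, List


def rad_tags(genome: str, site: str = "CCTGCAGG", read_len: int = 10) -> List[str]:
    """Collect the sequences flanking each occurrence of `site`.

    Simplified assumption: forward strand only, and flanks shorter than
    `read_len` (near the ends of the genome) are discarded.
    """
    tags = []
    start = genome.find(site)
    while start != -1:
        end = start + len(site)
        upstream = genome[max(0, start - read_len):start]
        downstream = genome[end:end + read_len]
        if len(upstream) == read_len:
            tags.append(upstream)
        if len(downstream) == read_len:
            tags.append(downstream)
        start = genome.find(site, start + 1)
    return tags


def filter_by_coverage(clusters: List[Dict[str, str]], min_taxa: int) -> List[Dict[str, str]]:
    """Keep clusters (taxon -> aligned sequence) present in at least `min_taxa` taxa."""
    return [c for c in clusters if len(c) >= min_taxa]


def concatenate(clusters: List[Dict[str, str]], taxa: List[str], missing: str = "?") -> Dict[str, str]:
    """Concatenate aligned clusters into one supermatrix row per taxon.

    Taxa absent from a cluster are padded with the `missing` symbol.
    Assumes all sequences within a cluster have equal (aligned) length.
    """
    rows = {t: [] for t in taxa}
    for cluster in clusters:
        length = len(next(iter(cluster.values())))
        for t in taxa:
            rows[t].append(cluster.get(t, missing * length))
    return {t: "".join(parts) for t, parts in rows.items()}


if __name__ == "__main__":
    print(rad_tags("AAAAACCTGCAGGTTTTT", read_len=5))
    toy_clusters = [
        {"A": "ACGT", "B": "ACGA", "C": "ACGG"},
        {"A": "TT", "B": "TA"},
        {"C": "GG"},
    ]
    kept = filter_by_coverage(toy_clusters, min_taxa=2)
    print(concatenate(kept, ["A", "B", "C"]))
```

In this toy setting, lowering `min_taxa` admits more clusters and lengthens the supermatrix while raising the share of `?` cells. This is the trade-off the abstract describes for the low end of taxonomic representation.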
Pages: 12