Completing gene trees without species trees in sub-quadratic time

Times cited: 12
Authors
Mai, Uyen [1]
Mirarab, Siavash [2]
Affiliations
[1] Univ Calif San Diego, Dept Comp Sci & Engn, San Diego, CA 92093 USA
[2] Univ Calif San Diego, Dept Elect & Comp Engn, San Diego, CA 92093 USA
Funding
US National Science Foundation;
Keywords
MISSING DATA; IMPACT; INFERENCE; QUARTET;
DOI
10.1093/bioinformatics/btab875
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: As genome-wide reconstruction of phylogenetic trees becomes more widespread, limitations of available data are being appreciated more than ever before. One issue is that phylogenomic datasets are riddled with missing data, and gene trees, in particular, almost always lack representatives from some species otherwise available in the dataset. Since many downstream applications of gene trees require or can benefit from access to complete gene trees, it is beneficial to complete gene trees algorithmically. Also, gene trees are often unrooted, and rooting them is useful for downstream applications. While completing and rooting a gene tree with respect to a given species tree has been studied, these problems have not been studied in depth for the case where no such reference species tree is available. Results: We study completion of gene trees without a need for a reference species tree. We formulate an optimization problem to complete the gene trees while minimizing their quartet distance to the given set of gene trees. We extend a seminal algorithm by Brodal et al. to solve this problem in quasi-linear time. In simulation studies and on a large empirical dataset, we show that completion of gene trees using other gene trees is relatively accurate and, unlike the case where a species tree is available, is unbiased.
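The quartet distance named in the abstract counts the four-leaf subsets on which two trees induce different topologies. The following is a minimal brute-force sketch of that measure, restricted to the leaves the two trees share. It is not the quasi-linear algorithm the abstract refers to; the nested-tuple tree encoding and the function names `clusters`, `quartet_topology`, and `quartet_distance` are assumptions made for illustration only.

```python
from itertools import combinations

def clusters(tree):
    """Return (leaf set, leaf sets below each internal node) of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = clusters(child)
        leaves |= child_leaves
        found += child_found
    found.append(leaves)
    return leaves, found

def quartet_topology(cl, a, b, c, d):
    """Return the pairing induced on {a, b, c, d}, or None if unresolved."""
    for x, y, z, w in ((a, b, c, d), (a, c, b, d), (a, d, b, c)):
        for side in cl:
            # An edge supports xy|zw when it puts x, y on one side and z, w on the other.
            if (x in side) == (y in side) and (z in side) == (w in side) \
                    and (x in side) != (z in side):
                return frozenset([frozenset([x, y]), frozenset([z, w])])
    return None

def quartet_distance(t1, t2):
    """Count quartets of shared leaves whose induced topology differs (O(n^4) quartets)."""
    leaves1, cl1 = clusters(t1)
    leaves2, cl2 = clusters(t2)
    shared = sorted(leaves1 & leaves2)
    return sum(
        quartet_topology(cl1, *q) != quartet_topology(cl2, *q)
        for q in combinations(shared, 4)
    )

# Hypothetical example trees on five species, differing in the placement of B and C.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(quartet_distance(t1, t2))
```

Because the rooting of a nested tuple does not change which leaf pairs an edge separates, the sketch treats both trees as unrooted. Enumerating every quartet is only practical for small trees, which is why a sub-quadratic method matters at phylogenomic scale.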
Pages: 1532-1541
Page count: 10