Automated Removal of Noisy Data in Phylogenomic Analyses

Cited by: 59
Authors
Goremykin, Vadim V. [1 ]
Nikiforova, Svetlana V. [1 ]
Bininda-Emonds, Olaf R. P. [2 ]
Affiliations
[1] IASMA Res & Innovat Ctr, I-38010 San Michele All Adige, TN, Italy
[2] Carl von Ossietzky Univ Oldenburg, AG Systemat & Evolut Biol, IBU Fak 5, D-26111 Oldenburg, Germany
Keywords
Noise reduction; Saturation; Long-branch attraction; Model testing; Noisy data; Placental mammals; Rodentia; COMPLETE MITOCHONDRIAL GENOME; LONG-BRANCH ATTRACTION; GUINEA-PIG; MAXIMUM-LIKELIHOOD; EVOLUTIONARY TREES; MOLECULAR EVIDENCE; BASAL ANGIOSPERMS; PLACENTAL MAMMALS; EUTHERIAN TREE; NUCLEAR GENES
DOI
10.1007/s00239-010-9398-z
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Noisy data, especially in combination with misalignment and model misspecification, can have an adverse effect on phylogeny reconstruction; however, effective methods to identify such data are few. One particularly important class of noisy data is saturated positions. To avoid potential errors related to saturation in phylogenomic analyses, we present an automated procedure involving the step-wise removal of the most variable positions in a given data set coupled with a stopping criterion derived from correlation analyses of pairwise ML distances calculated from the deleted (saturated) and the remaining (conserved) subsets of the alignment. Through a comparison with existing methods, we demonstrate both the effectiveness of our proposed procedure for identifying noisy data and the effect of the removal of such data using a well-publicized case study involving placental mammals. At the least, our procedure will identify data sets requiring greater data exploration, and we recommend its use to investigate the effect on phylogenetic analyses of removing subsets of variable positions exhibiting weak or no correlation to the rest of the alignment. However, we would argue that this procedure, by identifying and removing noisy data, facilitates the construction of more accurate phylogenies by, for example, ameliorating potential long-branch attraction artefacts.
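The step-wise procedure outlined in the abstract can be illustrated with a rough Python sketch. This is not the authors' implementation and makes several assumptions that the abstract does not specify: site variability is approximated by the number of distinct states per column, simple p-distances stand in for the pairwise ML distances, the alignment is ungapped with equal-length sequences, and the names `split_noisy_sites`, `step`, and `r_threshold` (with its arbitrary default) are hypothetical.

```python
from itertools import combinations
from math import sqrt


def site_variability(column):
    """Crude variability proxy (assumption): number of distinct states in a column."""
    return len(set(column))


def p_distances(seqs, sites):
    """Pairwise p-distances over the given sites (a stand-in for ML distances)."""
    return [
        sum(1 for i in sites if a[i] != b[i]) / len(sites)
        for a, b in combinations(seqs, 2)
    ]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def split_noisy_sites(seqs, step=1, r_threshold=0.75):
    """Grow the deleted subset from the most variable sites downward.

    Stops before the first step at which distances from the deleted subset
    correlate with those from the remaining subset at r >= r_threshold
    (hypothetical stopping rule and threshold).
    """
    n_sites = len(seqs[0])
    # Most variable columns first.
    order = sorted(
        range(n_sites),
        key=lambda i: site_variability([s[i] for s in seqs]),
        reverse=True,
    )
    n_removed = 0
    while n_removed + step < n_sites:  # always leave at least one site
        candidate = n_removed + step
        removed, kept = order[:candidate], order[candidate:]
        r = pearson(p_distances(seqs, removed), p_distances(seqs, kept))
        if r >= r_threshold:
            break
        n_removed = candidate
    return sorted(order[:n_removed]), sorted(order[n_removed:])
```

Calling `split_noisy_sites(alignment)` on a list of aligned sequence strings returns two lists of column indices: the deleted (putatively saturated) sites and the retained (conserved) sites. In a real analysis the distance function would be replaced by model-based ML distances, and the stopping rule should follow the criterion defined in the paper itself.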
Pages: 319-331
Number of pages: 13
Related papers
50 in total
  • [1] Automated Removal of Noisy Data in Phylogenomic Analyses
    Vadim V. Goremykin
    Svetlana V. Nikiforova
    Olaf R. P. Bininda-Emonds
    [J]. Journal of Molecular Evolution, 2010, 71 : 319 - 331
  • [2] ALPHA: a toolkit for Automated Local PHylogenomic Analyses
    Elworth, R. A. Leo
    Allen, Chabrielle
    Benedict, Travis
    Dulworth, Peter
    Nakhleh, Luay
    [J]. BIOINFORMATICS, 2018, 34 (16) : 2848 - 2850
  • [3] Phylogenomic analyses data of the avian phylogenomics project
    Jarvis, Erich D.
    Mirarab, Siavash
    Aberer, Andre J.
    Li, Bo
    Houde, Peter
    Li, Cai
    Ho, Simon Y. W.
    Faircloth, Brant C.
    Nabholz, Benoit
    Howard, Jason T.
    Suh, Alexander
    Weber, Claudia C.
    da Fonseca, Rute R.
    Alfaro-Nunez, Alonzo
    Narula, Nitish
    Liu, Liang
    Burt, Dave
    Ellegren, Hans
    Edwards, Scott V.
    Stamatakis, Alexandros
    Mindell, David P.
    Cracraft, Joel
    Braun, Edward L.
    Warnow, Tandy
    Jun, Wang
    Gilbert, M. Thomas Pius
    Zhang, Guojie
    [J]. GIGASCIENCE, 2015, 4
  • [4] Data-driven guidelines for phylogenomic analyses using SNP data
    Suissa, Jacob S.
    de la Cerda, Gisel Y.
    Graber, Leland C.
    Jelley, Chloe
    Wickell, David
    Phillips, Heather R.
    Grinage, Ayress D.
    Moreau, Corrie S.
    Specht, Chelsea D.
    Doyle, Jeff J.
    Landis, Jacob B.
    [J]. APPLICATIONS IN PLANT SCIENCES, 2024,
  • [5] Phylogenomic analyses of bat subordinal relationships based on transcriptome data
    Lei, Ming
    Dong, Dong
    [J]. SCIENTIFIC REPORTS, 2016, 6
  • [6] Phylogenomic data analyses provide evidence that Xenarthra and Afrotheria are sister groups
    Hallstroem, Bjoern M.
    Kullberg, Morgan
    Nilsson, Maria A.
    Janke, Axel
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2007, 24 (09) : 2059 - 2068
  • [7] Dealing with incongruence in phylogenomic analyses
    Galtier, Nicolas
    Daubin, Vincent
    [J]. PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2008, 363 (1512) : 4023 - 4029
  • [8] The Removal of Noisy Bands for Hyperion Data using Extrema
    Han, Dong Yeob
    Kim, Dae Sung
    Kim, Yong Il
    [J]. KOREAN JOURNAL OF REMOTE SENSING, 2006, 22 (04) : 275 - 284
  • [9] Fast and accurate methods for phylogenomic analyses
    Yang, Jimmy
    Warnow, Tandy
    [J]. BMC BIOINFORMATICS, 2011, 12