Fast Fitch-parsimony algorithms for large data sets

Cited by: 23
Author
Ronquist, F [1]
Affiliation
[1] Univ Uppsala, Dept Zool, SE75236 Uppsala, Sweden
DOI
10.1111/j.1096-0031.1998.tb00346.x
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07; 0710; 09;
Abstract
The speed of analytical algorithms becomes increasingly important as systematists accumulate larger data sets. In this paper I discuss several time-saving modifications to published Fitch-parsimony tree search algorithms, including shortcuts that allow rapid evaluation of tree lengths and fast reoptimization of trees after clipping or joining of subtrees, as well as search strategies that allow one to successively increase the exhaustiveness of branch swapping. I also describe how Fitch-parsimony algorithms can be restructured to take full advantage of the computing power of modern microprocessors by horizontal or vertical packing of characters, allowing simultaneous processing of many characters, and by avoidance of conditional branches that disturb instruction flow. These new multicharacter algorithms are particularly useful for large data sets of characters with a small number of states, such as nucleotide characters. As an example, the multicharacter algorithms are estimated to be 3.6-10 times faster than single-character equivalents on a PowerPC 604. The speed gain is even larger on processors using MMX, AltiVec or similar technologies allowing single instructions to be performed on multiple data simultaneously. © 1998 The Willi Hennig Society.
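The idea of packing several characters into one machine word and avoiding conditional branches can be illustrated with a minimal sketch. This is not the paper's code: the 4-bit-per-character nucleotide encoding, the nested-tuple tree representation, and all function names below are assumptions made for illustration, and Python integers stand in for fixed-width machine words.

```python
# Illustrative sketch only (assumed encoding, not taken from the paper):
# each nucleotide character occupies one 4-bit field, one bit per state.
STATE_BITS = {"A": 1, "C": 2, "G": 4, "T": 8}


def pack(seq):
    """Pack a nucleotide string into one integer, 4 bits per character."""
    word = 0
    for k, base in enumerate(seq):
        word |= STATE_BITS[base] << (4 * k)
    return word


def fitch_single(a, b):
    """Classic Fitch step for one character, with a conditional branch."""
    inter = a & b
    if inter:
        return inter, 0
    return a | b, 1


def fitch_packed(x, y, n_chars):
    """Fitch step for n_chars packed characters, without per-character branches."""
    low = sum(1 << (4 * k) for k in range(n_chars))  # lowest bit of each field
    inter = x & y
    # Fold each field's four bits onto its lowest bit.
    t = inter | (inter >> 1)
    t |= t >> 2
    empty = (t & low) ^ low   # lowest bit set where the intersection is empty
    fill = empty * 0xF        # widen each flag to a full 4-bit mask
    parent = inter | ((x | y) & fill)
    return parent, bin(empty).count("1")  # popcount = added steps


def tree_length(tree, n_chars):
    """Return (root state sets, length); a tree is a packed leaf or a pair."""
    if isinstance(tree, int):
        return tree, 0
    left_set, left_len = tree_length(tree[0], n_chars)
    right_set, right_len = tree_length(tree[1], n_chars)
    parent, extra = fitch_packed(left_set, right_set, n_chars)
    return parent, left_len + right_len + extra


leaves = [pack(s) for s in ("ACGT", "ACGA", "TCGA", "TCTT")]
tree = ((leaves[0], leaves[1]), (leaves[2], leaves[3]))
root_sets, length = tree_length(tree, 4)
print(length)
```

In this sketch, one pass of `fitch_packed` handles every character in the word with a fixed sequence of bitwise operations, whereas `fitch_single` needs one branch per character. On real hardware the word width (for example 32, 64, or a SIMD register) would bound how many characters fit in one operation.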
Pages: 387-400
Page count: 14