Fast Fitch-parsimony algorithms for large data sets

Cited by: 23
Author
Ronquist, F [1]
Affiliation
[1] Univ Uppsala, Dept Zool, SE75236 Uppsala, Sweden
DOI
10.1111/j.1096-0031.1998.tb00346.x
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
The speed of analytical algorithms becomes increasingly important as systematists accumulate larger data sets. In this paper I discuss several time-saving modifications to published Fitch-parsimony tree search algorithms, including shortcuts that allow rapid evaluation of tree lengths and fast reoptimization of trees after clipping or joining of subtrees, as well as search strategies that allow one to successively increase the exhaustiveness of branch swapping. I also describe how Fitch-parsimony algorithms can be restructured to take full advantage of the computing power of modern microprocessors by horizontal or vertical packing of characters, allowing simultaneous processing of many characters, and by avoidance of conditional branches that disturb instruction flow. These new multicharacter algorithms are particularly useful for large data sets of characters with a small number of states, such as nucleotide characters. As an example, the multicharacter algorithms are estimated to be 3.6-10 times faster than single-character equivalents on a PowerPC 604. The speed gain is even larger on processors using MMX, AltiVec or similar technologies allowing single instructions to be performed on multiple data simultaneously. © 1998 The Willi Hennig Society.
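The abstract does not give the packing layout or the code used in the paper, so the following is only a minimal sketch of the general idea: several nucleotide characters are packed into one integer (4 bits per character, one bit per state) and the Fitch combination step for two child nodes is done with bitwise operations and no per-character conditional branch. The names `pack`, `unpack`, `fitch_single` and `fitch_packed`, and the 4-bit field layout, are assumptions made for illustration and are not taken from the paper.

```python
# State sets are bit masks: one bit per nucleotide.
A, C, G, T = 1, 2, 4, 8
BITS = 4  # assumed field width: one 4-bit field per character


def fitch_single(a, b):
    """Classic one-character Fitch step with a conditional branch."""
    common = a & b
    if common:
        return common, 0   # intersection is non-empty: no extra step
    return a | b, 1        # empty intersection: take the union, add one step


def pack(state_sets):
    """Pack a list of 4-bit state sets into one integer, character 0 lowest."""
    word = 0
    for i, s in enumerate(state_sets):
        word |= s << (BITS * i)
    return word


def unpack(word, n_chars):
    """Inverse of pack: recover the list of 4-bit state sets."""
    return [(word >> (BITS * i)) & 0xF for i in range(n_chars)]


def fitch_packed(a, b, n_chars):
    """Fitch step for n_chars packed characters, without per-character branches."""
    low = pack([1] * n_chars)          # lowest bit of every field set
    common = a & b                     # per-field intersections
    # Fold each field's four bits onto its lowest bit: 1 means non-empty.
    nonempty = (common | common >> 1 | common >> 2 | common >> 3) & low
    empty = nonempty ^ low             # 1 in the low bit of each empty field
    mask = empty * 0xF                 # widen each flag to a full 4-bit field
    states = common | ((a | b) & mask) # union only where intersection is empty
    steps = bin(empty).count("1")      # one added step per empty intersection
    return states, steps


left = [A, C, A | G, T]
right = [A, G, G, T]
states, steps = fitch_packed(pack(left), pack(right), len(left))
print(unpack(states, len(left)), steps)
```

Python integers are unbounded, so this sketch does not show the fixed word width of a real processor; in a compiled implementation the same operations would be applied to 32-bit, 64-bit or SIMD registers, which is where a speed gain of the kind the abstract reports would come from.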
Pages: 387-400
Page count: 14