Simulation of random dendrograms and comparison tests: Some comments

Cited by: 24
Author
Podani, J [1]
Affiliation
[1] Eotvos Lorand Univ, Dept Plant Taxon & Ecol, H-1083 Budapest, Hungary
Keywords
clustering methodology; dendrogram topology; matrix correlation; Monte Carlo studies
DOI
10.1007/s003570000007
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
It is shown that there is a simple, easily understood alternative to the double permutation algorithm for generating random, fully ranked dendrograms. The paper also examines the utility of five different dendrogram descriptors in statistical analyses of dendrogram similarity. They serve as a logical basis for comparisons under different simulation models: cophenetic difference is valid for weighted dendrograms, partition membership divergence for fully ranked dendrograms, whereas subtree membership divergence and cluster membership divergence are best suited to partially ranked dendrograms. The latter two descriptors possess the ultrametric property for all triples, but are called quasi-ultrametrics because they do not satisfy the identity axiom. The fifth descriptor considered is path difference, which is not recommended for comparisons except for unrooted trees. Correlations among dendrogram descriptors are evaluated through simulation experiments, and it is shown that the significance of dendrogram comparisons is greatly influenced by the choice of the descriptor. The paper emphasizes that the choice of the underlying tree distribution to be used as a reference in testing the significance of a dendrogram comparison measure should be consistent with the descriptor incorporated by that measure.
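The workflow the abstract describes (simulate random dendrograms, describe each by a pairwise matrix, correlate the matrices, and judge significance against a simulated reference distribution) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the random model simply merges two uniformly chosen clusters at each fusion level (it is not claimed to be the alternative algorithm the paper proposes), and the descriptor is the fusion rank at which two objects first join, used here as a stand-in for a rank-based cophenetic descriptor.

```python
import itertools
import random


def random_ranked_dendrogram(n, rng):
    """Return an n x n matrix of fusion ranks for a random fully ranked dendrogram.

    Assumption (not from the paper): at each of the n - 1 fusion levels,
    two of the current clusters are chosen uniformly at random and merged.
    """
    clusters = [[i] for i in range(n)]
    coph = [[0] * n for _ in range(n)]
    for level in range(1, n):
        a, b = rng.sample(range(len(clusters)), 2)
        # Every pair split across the two merging clusters first joins at this level.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = level
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return coph


def is_ultrametric(d):
    """Check d(i, k) <= max(d(i, j), d(j, k)) for all triples of distinct objects."""
    n = len(d)
    return all(
        d[i][k] <= max(d[i][j], d[j][k])
        for i, j, k in itertools.permutations(range(n), 3)
    )


def matrix_correlation(d1, d2):
    """Pearson correlation between the upper triangles of two descriptor matrices."""
    n = len(d1)
    x = [d1[i][j] for i in range(n) for j in range(i + 1, n)]
    y = [d2[i][j] for i in range(n) for j in range(i + 1, n)]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def monte_carlo_p(d1, d2, n_sim, rng):
    """One-sided Monte Carlo p-value for the observed matrix correlation.

    The reference distribution comes from pairs of dendrograms drawn from the
    same random model and described by the same descriptor as d1 and d2.
    """
    observed = matrix_correlation(d1, d2)
    n = len(d1)
    hits = sum(
        matrix_correlation(
            random_ranked_dendrogram(n, rng), random_ranked_dendrogram(n, rng)
        ) >= observed
        for _ in range(n_sim)
    )
    return (hits + 1) / (n_sim + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    t1 = random_ranked_dendrogram(10, rng)
    t2 = random_ranked_dendrogram(10, rng)
    print(is_ultrametric(t1), matrix_correlation(t1, t2), monte_carlo_p(t1, t2, 199, rng))
```

In this sketch the reference distribution is generated under the same model and with the same descriptor as the dendrograms being compared, which is one way to read the abstract's point about consistency between the tree distribution and the descriptor. Swapping in a different descriptor or a different random model would generally change the resulting p-value.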
Pages: 123-142
Number of pages: 20