Combinatorics of least-squares trees

Cited by: 10
Authors
Mihaescu, Radu [1]
Pachter, Lior [2]
Affiliations
[1] Univ Calif Berkeley, Dept Math, Berkeley, CA 94704 USA
[2] Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94704 USA
Funding
National Science Foundation (USA);
Keywords
phylogenetics; tree additivity; independence of irrelevant paths; minimum evolution; semimultiplicative maps;
DOI
10.1073/pnas.0802089105
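Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];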
Discipline classification codes
07; 0710; 09;
Abstract
A recurring theme in the least-squares approach to phylogenetics has been the discovery of elegant combinatorial formulas for the least-squares estimates of edge lengths. These formulas have proved useful for the development of efficient algorithms, and have also been important for understanding connections among popular phylogeny algorithms. For example, the selection criterion of the neighbor-joining algorithm is now understood in terms of the combinatorial formulas of Pauplin for estimating tree length. We highlight a phylogenetically desirable property that weighted least-squares methods should satisfy, and provide a complete characterization of methods that satisfy the property. The necessary and sufficient condition is a multiplicative four-point condition that the variance matrix needs to satisfy. The proof is based on the observation that the Lagrange multipliers in the proof of the Gauss-Markov theorem are tree-additive. Our results generalize and complete previous work on ordinary least squares, balanced minimum evolution, and the taxon-weighted variance model. They also provide a time-optimal algorithm for computation.
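As a point of reference for the weighted least-squares estimates of edge lengths mentioned in the abstract, the following is a minimal sketch of the generic normal-equations solution on a four-taxon tree ((A,B),(C,D)) with a diagonal variance matrix. It is not the combinatorial formula or the time-optimal algorithm of the paper; the names `path_row`, `solve`, and `wls_edge_lengths` and the quartet setup are assumptions made for illustration only.

```python
from fractions import Fraction

# Hypothetical setup: quartet tree ((A,B),(C,D)) with four pendant edges
# and one internal edge. Exact rational arithmetic avoids rounding issues.
EDGES = ["A", "B", "C", "D", "internal"]
PAIRS = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]


def path_row(i, j):
    """Indicator row: 1 for each edge on the path between leaves i and j."""
    crosses = (i in "AB") != (j in "AB")  # path uses the internal edge
    return [Fraction(int(e in (i, j) or (e == "internal" and crosses)))
            for e in EDGES]


def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination (M square, nonsingular)."""
    n = len(M)
    aug = [row[:] + [b[k]] for k, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[k][n] for k in range(n)]


def wls_edge_lengths(distances, variances):
    """Solve (S^T W S) x = S^T W d, where W = diag(1 / variance)."""
    S = [path_row(i, j) for i, j in PAIRS]
    w = [1 / Fraction(variances[p]) for p in PAIRS]
    d = [Fraction(distances[p]) for p in PAIRS]
    m, n_pairs = len(EDGES), len(PAIRS)
    normal = [[sum(w[k] * S[k][a] * S[k][b] for k in range(n_pairs))
               for b in range(m)] for a in range(m)]
    rhs = [sum(w[k] * S[k][a] * d[k] for k in range(n_pairs))
           for a in range(m)]
    return dict(zip(EDGES, solve(normal, rhs)))
```

When the input distances are exactly tree-additive on this topology, the estimate returns the true edge lengths for any choice of positive variances, which makes a convenient sanity check for the sketch.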
Pages: 13206-13211
Number of pages: 6