From typical sequences to typical genotypes
Cited by: 1
Authors:
Tal, Omri [1]
Tran, Tat Dat [1]
Portegies, Jacobus [1]
Affiliations:
[1] Max Planck Inst Math Sci, Inselstr 22, D-04103 Leipzig, Germany
Keywords:
Typical sequences;
Typical genotypes;
Population entropy rate;
Population cross entropy rate;
Classification;
INFORMATION-THEORY;
GENETIC-MARKERS;
DOI:
10.1016/j.jtbi.2017.02.010
Chinese Library Classification (CLC):
Q [Biological Sciences];
Discipline classification codes:
07 ;
0710 ;
09 ;
Abstract:
We demonstrate an application of a core notion of information theory, typical sequences and their related properties, to analysis of population genetic data. Based on the asymptotic equipartition property (AEP) for nonstationary discrete-time sources producing independent symbols, we introduce the concepts of typical genotypes and population entropy and cross entropy rate. We analyze three perspectives on typical genotypes: a set perspective on the interplay of typical sets of genotypes from two populations, a geometric perspective on their structure in high dimensional space, and a statistical learning perspective on the prospects of constructing typical-set based classifiers. In particular, we show that such classifiers have a surprising resilience to noise originating from small population samples, and highlight the potential for further links between inference and communication.
Pages: 159-183
Page count: 25