ALU SEQUENCES;
REPETITIVE ELEMENTS;
MOLECULAR EVOLUTION;
MACHINE DISCOVERY;
DATA COMPRESSION;
DOI:
10.1023/A:1022871401069
Chinese Library Classification number:
TP18 [Artificial intelligence theory];
Subject classification codes:
081104;
0812;
0835;
1405;
Abstract:
We apply the Minimal Length Encoding Principle to formalize inference about the evolution of macromolecular sequences. The Principle is shown to imply a combination of Weighted Parsimony and Compatibility methods that have long been used by biologists because of their good practical performance. The background assumptions are expressed as an encoding scheme for the observed data and as heuristic rules for selection of diagnostic positions in the sequences. The Principle was applied to discover new subfamilies of Alu sequences, the most numerous family of repetitive DNA sequences in the human genome.