Generalized k-medians clustering for strings
Cited by: 0
Authors:
Martínez-Hinarejos, CD [1]
Juan, A [1]
Casacuberta, F [1]
Affiliations:
[1] Univ Politecn Valencia, Inst Tecnol Informat, Dept Sistemes Informat & Computacio, Valencia 46022, Spain
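Source:
Keywords:
DOI:
Not available
Chinese Library Classification number:
TP18 [Artificial intelligence theory];
Discipline classification codes:
081104; 0812; 0835; 1405
Abstract: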
Clustering methods are used in pattern recognition to obtain natural groups from a data set in the framework of unsupervised learning, as well as to obtain clusters of data from a known class. For sets of strings, the concept of the set median string can be extended to the (set) k-medians problem. The solution of the k-medians problem can be viewed as a clustering method in which each cluster is generated by one of the k strings of that solution. A concept related to the set median string is the (generalized) median string, whose computation is an NP-hard problem. However, different algorithms have been proposed to find approximations to the (generalized) median string. We propose extending the (generalized) median string problem to k strings, resulting in the generalized k-medians problem, which can also be viewed as a clustering technique. This new technique is applied to a corpus of chromosomes represented by strings and compared to the conventional k-medians technique.
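To make the conventional (set) k-medians idea from the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a unit-cost Levenshtein distance, a deterministic initialisation from the first k distinct strings, and a simple alternate-until-stable loop; the abstract specifies none of these. The names `edit_distance`, `set_median`, and `set_k_medians` are hypothetical. The generalized variant would replace `set_median` with an approximate generalized median search, which may return a string outside the cluster; that step is not sketched here.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (an assumed choice of string distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def set_median(strings: Sequence[str]) -> str:
    """Member of the set with the smallest summed distance to all members."""
    return min(strings, key=lambda s: sum(edit_distance(s, t) for t in strings))


def set_k_medians(strings: Sequence[str], k: int,
                  max_iter: int = 20) -> Tuple[List[str], List[List[str]]]:
    """Alternate nearest-median assignment and set-median updates."""
    # Illustrative initialisation: the first k distinct strings.
    medians = list(dict.fromkeys(strings))[:k]
    clusters: List[List[str]] = []
    for _ in range(max_iter):
        clusters = [[] for _ in medians]
        for s in strings:
            nearest = min(range(len(medians)),
                          key=lambda i: edit_distance(s, medians[i]))
            clusters[nearest].append(s)
        # An empty cluster keeps its previous median.
        new_medians = [set_median(c) if c else m
                       for c, m in zip(clusters, medians)]
        if new_medians == medians:
            break
        medians = new_medians
    return medians, clusters


if __name__ == "__main__":
    data = ["aaaa", "aaab", "aaba", "cccc", "cccd", "ccdc"]
    print(set_k_medians(data, k=2))
```

Each returned cluster is the set of strings closest to one of the k medians, which is the sense in which the abstract says each cluster is generated by one string of the solution.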
Pages: 502-509
Page count: 8