A genetic algorithm-based clustering approach for database partitioning

Cited by: 46
Authors
Cheng, CH [1]
Lee, WK
Wong, KF
Affiliations
[1] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China
[2] Bank China, Hong Kong, Hong Kong, Peoples R China
Keywords
clustering; data partitioning; genetic algorithm (GA); optimization
DOI
10.1109/TSMCC.2002.804444
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In a typical distributed/parallel database system, a request mostly accesses a subset of the entire database. It is therefore natural to organize commonly accessed data together and to place them on nearby, preferably the same, machine(s)/site(s). For this reason, data partitioning and data allocation are performance-critical issues in distributed database application design. This paper deals with data partitioning, which requires the use of clustering. Although many clustering algorithms have been proposed, their performance has not been extensively studied. Moreover, the special problem structure in clustering is rarely exploited. We explore the use of a genetic search-based clustering algorithm for data partitioning to achieve high database retrieval performance. By formulating the underlying problem as a traveling salesman problem (TSP), we can take advantage of this particular structure. Three new operators for genetic algorithms (GAs) are also proposed, and experimental results indicate that they outperform other operators in solving the TSP. The proposed GA is applied to solve the data-partitioning problem. Our computational study shows that our GA performs well for this application.
Pages: 215-230
Number of pages: 16