Co-clustering;
Biclustering;
Contingency table;
Information theory;
62-07;
EM ALGORITHM;
OPTIMIZATION;
SPARSE;
DOI:
10.1007/s11634-016-0274-6
Chinese Library Classification:
O21 [Probability theory and mathematical statistics];
C8 [Statistics];
Discipline classification codes:
020208;
070103;
0714;
Abstract:
Many of the datasets encountered in statistics are two-dimensional in nature and can be represented by a matrix. Classical clustering procedures seek to construct separately an optimal partition of rows or, sometimes, of columns. In contrast, co-clustering methods cluster the rows and the columns simultaneously and organize the data into homogeneous blocks (after suitable permutations). Methods of this kind have practical importance in a wide variety of applications such as document clustering, where data are typically organized in two-way contingency tables. Our goal is to offer coherent frameworks for understanding some existing criteria and algorithms for co-clustering contingency tables, and to propose new ones. We look at two different frameworks for the problem of co-clustering. The first involves minimizing an objective function based on measures of association and in particular on phi-squared and mutual information. The second uses a model-based co-clustering approach, and we consider two models: the block model and the latent block model. We establish connections between different approaches, criteria and algorithms, and we highlight a number of implicit assumptions in some commonly used algorithms. Our contribution is illustrated by numerical experiments on simulated and real-case datasets that show the relevance of the presented methods in the document clustering field.
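The two association measures named in the abstract, phi-squared and mutual information, can be illustrated on a small contingency table. The sketch below is not the authors' algorithm: it only evaluates both measures on a table and on its block-aggregated version for a given pair of row and column partitions. The function names (`aggregate`, `mutual_information`, `phi_squared`) and the toy data are assumptions made for illustration.

```python
import numpy as np

def aggregate(table, row_labels, col_labels, g, m):
    """Sum the counts of `table` into a g x m block table (hypothetical helper)."""
    R = np.eye(g)[np.asarray(row_labels)]   # n x g row-cluster indicator matrix
    C = np.eye(m)[np.asarray(col_labels)]   # d x m column-cluster indicator matrix
    return R.T @ table @ C

def mutual_information(table):
    """Mutual information (in nats) between the row and column variables."""
    p = table / table.sum()
    indep = p.sum(axis=1, keepdims=True) @ p.sum(axis=0, keepdims=True)
    mask = p > 0                             # empty cells contribute zero
    return float(np.sum(p[mask] * np.log(p[mask] / indep[mask])))

def phi_squared(table):
    """Phi-squared: the chi-square statistic divided by the total count."""
    p = table / table.sum()
    indep = p.sum(axis=1, keepdims=True) @ p.sum(axis=0, keepdims=True)
    mask = indep > 0
    return float(np.sum((p[mask] - indep[mask]) ** 2 / indep[mask]))

# Toy table with two perfectly separated blocks (illustrative data only).
table = np.array([[5, 5, 0, 0],
                  [5, 5, 0, 0],
                  [0, 0, 5, 5],
                  [0, 0, 5, 5]], dtype=float)

good = aggregate(table, [0, 0, 1, 1], [0, 0, 1, 1], g=2, m=2)
bad = aggregate(table, [0, 1, 0, 1], [0, 1, 0, 1], g=2, m=2)

print(mutual_information(table), mutual_information(good), mutual_information(bad))
print(phi_squared(table), phi_squared(good), phi_squared(bad))
```

On this toy table, the partition that matches the block structure keeps all of the association of the original table under both measures, while the mismatched partition reduces both to zero. Criteria of this kind reward partitions whose aggregated table retains as much association as possible.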