Constrained Co-Clustering for Textual Documents

Cited by: 0
Authors
Song, Yangqiu [1 ]
Pan, Shimei [2 ]
Liu, Shixia [1 ]
Wei, Furu [1 ]
Zhou, Michelle X. [3 ]
Qian, Weihong [1 ]
Affiliations
[1] IBM Res China, Beijing, Peoples R China
[2] IBM Res TJ Watson Ctr, Hawthorne, NY USA
[3] IBM Res Almaden Ctr, San Jose, CA USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present a constrained co-clustering approach for clustering textual documents. Our approach combines the benefits of information-theoretic co-clustering and constrained clustering. We use a two-sided hidden Markov random field (HMRF) to model both the document and word constraints. We also develop an alternating expectation maximization (EM) algorithm to optimize the constrained co-clustering model. We have conducted two sets of experiments on a benchmark data set: (1) using human-provided category labels to derive document and word constraints for semi-supervised document clustering, and (2) using automatically extracted named entities to derive document constraints for unsupervised document clustering. Compared to several representative constrained clustering and co-clustering approaches, our approach is shown to be more effective for high-dimensional, sparse text data.
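The abstract does not say how category labels are turned into constraints. As a minimal sketch, assuming the common pairwise scheme (two labeled items with the same label form a must-link pair, two with different labels form a cannot-link pair), the derivation might look like the following. The function name `pairwise_constraints` and the toy labels are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_constraints(labels):
    """Derive (must_link, cannot_link) sets of index pairs from a label list.

    Assumption (not from the paper): same label -> must-link,
    different labels -> cannot-link. Items labeled None are unlabeled
    and produce no constraints.
    """
    must_link, cannot_link = set(), set()
    labeled = [(i, y) for i, y in enumerate(labels) if y is not None]
    for (i, yi), (j, yj) in combinations(labeled, 2):
        if yi == yj:
            must_link.add((i, j))
        else:
            cannot_link.add((i, j))
    return must_link, cannot_link


# Toy document labels; the third document is unlabeled.
doc_labels = ["sports", "sports", None, "politics"]
must, cannot = pairwise_constraints(doc_labels)
print(sorted(must))    # [(0, 1)]
print(sorted(cannot))  # [(0, 3), (1, 3)]
```

The same function could be applied to word-level labels to obtain word constraints. In practice only a sample of the labeled pairs would typically be used, since the number of pairs grows quadratically with the number of labeled items.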
Pages: 581-586
Page count: 6