Active Learning of Constraints for Semi-supervised Text Clustering

Cited by: 0
Authors
Huang, Ruizhang [1]
Lam, Wai [1]
Zhang, Zhigang [1]
Affiliation
[1] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates active learning of constraints for semi-supervised document clustering. We use intermediate clustering results to guide the selection of document pairs for which user judgments are obtained to generate constraints. A gain function is designed to choose the most informative document pairs given the current cluster assignments; it measures how much can be learned by revealing the judgment of a document pair. Two methods are investigated: the independent gain model and the dependent gain model. The independent gain model assumes that the information learned by revealing the judgment of one document pair is independent of revealing the judgments of other document pairs. The dependent gain model also considers previously chosen documents, to avoid redundant selection and to maximize the gain collectively for a set of document pairs. Constrained semi-supervised clustering and gain-directed document pair selection are conducted iteratively. We have conducted extensive experiments on several real-world corpora. The results demonstrate that the intermediate cluster assignments and the interactions among a set of document pairs are useful for improving clustering performance. Our approach is also superior to a recent existing method for this problem.
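The difference between the two selection strategies named in the abstract can be illustrated with a small sketch. The abstract does not give the gain function, so everything below is an assumption made for illustration and not the authors' method: the gain of a pair is taken to be the binary entropy of the probability that its two documents share a cluster, computed from a hypothetical soft membership matrix `P`, and the dependent variant is a simple greedy rule that discounts pairs reusing an already chosen document by a hypothetical `penalty` factor.

```python
import numpy as np

def pair_gains(P):
    """Assumed gain: binary entropy of the same-cluster probability.

    P[i, c] is the soft membership of document i in cluster c
    (each row sums to 1). This is an illustrative stand-in, not
    the gain function from the paper.
    """
    p_same = np.clip(P @ P.T, 1e-12, 1 - 1e-12)
    return -(p_same * np.log2(p_same) + (1 - p_same) * np.log2(1 - p_same))

def select_independent(P, m):
    """Pick the m pairs with the highest stand-alone gain."""
    i_idx, j_idx = np.triu_indices(P.shape[0], k=1)  # all pairs with i < j
    scores = pair_gains(P)[i_idx, j_idx]
    order = np.argsort(-scores, kind="stable")[:m]
    return [(int(i_idx[t]), int(j_idx[t])) for t in order]

def select_dependent(P, m, penalty=0.5):
    """Greedy pick that discounts pairs reusing already chosen documents."""
    i_idx, j_idx = np.triu_indices(P.shape[0], k=1)
    scores = pair_gains(P)[i_idx, j_idx].copy()
    chosen = []
    for _ in range(min(m, len(scores))):
        t = int(np.argmax(scores))
        i, j = int(i_idx[t]), int(j_idx[t])
        chosen.append((i, j))
        scores[t] = -np.inf  # never pick the same pair twice
        reuse = (i_idx == i) | (i_idx == j) | (j_idx == i) | (j_idx == j)
        scores[reuse] *= penalty  # assumed redundancy discount
    return chosen

# Toy cluster assignments: document 0 is maximally ambiguous.
P = np.array([[0.5, 0.5], [0.9, 0.1], [0.8, 0.2], [0.7, 0.3]])
independent_pairs = select_independent(P, 2)
dependent_pairs = select_dependent(P, 2)
```

On this toy input the independent rule spends both queries on pairs involving the ambiguous document 0, while the greedy dependent rule spreads its two queries over four distinct documents. In an iterative setup of the kind the abstract describes, the user judgments on the chosen pairs would become must-link or cannot-link constraints for the next clustering round, which in turn yields a new `P`.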
Pages: 113-124
Page count: 12