Semi-supervised concept factorization for document clustering

被引:45
|
作者
Lu, Mei [1 ,2 ]
Zhao, Xiang-Jun [2 ]
Zhang, Li [1 ]
Li, Fan-Zhang [1 ]
机构
[1] Suzhou Univ, Coll Comp Sci & Technol, Suzhou 215006, Jiangsu, Peoples R China
[2] Jiangsu Normal Univ, Coll Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China
基金
中国国家自然科学基金;
关键词
Concept factorization; Locally consistent concept factorization; Semi-supervised document clustering; NONNEGATIVE MATRIX FACTORIZATION;
D O I
10.1016/j.ins.2015.10.038
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Nonnegative Matrix Factorization (NMF) and Concept Factorization (CF) are two popular methods for finding the low-rank approximation of nonnegative matrix. Different from NMF, CF can be applied not only to the matrix containing negative values but also to the kernel space. Based on NMF and CF, many methods, such as Graph regularized Nonnegative Matrix Factorization (GNMF) and Locally Consistent Clustering Factorization (LCCF) can significandy improve the performance of clustering. Unfortunately, these are unsupervised learning methods. In order to enhance the clustering performance with the supervisory information, a Semi-Supervised Concept Factorization (SSCF) is proposed in this paper by incorporating the pairwise constraints into CF as the reward and penalty terms, which can guarantee that the data points belonging to a cluster in the original space are still in the same cluster in the transformed space. By comparing with the state-of-the-arts algorithms (KM, NMF, CF, LCCF, GNMF, PCCF), experimental results on document clustering show that the proposed algorithm has better performance in terms of accuracy and mutual information. (C) 2015 Elsevier Inc. All rights reserved.
引用
收藏
页码:86 / 98
页数:13
相关论文
共 50 条
  • [21] SEMI-SUPERVISED SPECTRAL CLUSTERING
    Mai, Xiaoyi
    Couillet, Romain
    [J]. 2018 CONFERENCE RECORD OF 52ND ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2018, : 2012 - 2016
  • [22] A review on semi-supervised clustering
    Cai, Jianghui
    Hao, Jing
    Yang, Haifeng
    Zhao, Xujun
    Yang, Yuqing
    [J]. INFORMATION SCIENCES, 2023, 632 : 164 - 200
  • [23] Semi-supervised clustering methods
    Bair, Eric
    [J]. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2013, 5 (05): : 349 - 361
  • [24] Solving consensus and semi-supervised clustering problems using nonnegative matrix factorization
    Li, Tao
    Ding, Chris
    Jordan, Michael I.
    [J]. ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007, : 577 - +
  • [25] Semi-supervised fuzzy co-clustering algorithm for document categorization
    Yan, Yang
    Chen, Lihui
    Tjhi, William-Chandra
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2013, 34 (01) : 55 - 74
  • [26] Semi-supervised model-based document clustering: A comparative study
    Zhong, Shi
    [J]. MACHINE LEARNING, 2006, 65 (01) : 3 - 29
  • [27] An active learning framework for semi-supervised document clustering with language modeling
    Huang, Ruizhang
    Lam, Wai
    [J]. DATA & KNOWLEDGE ENGINEERING, 2009, 68 (01) : 49 - 67
  • [28] Hypergraph based semi-supervised symmetric nonnegative matrix factorization for image clustering
    Yin, Jingxing
    Peng, Siyuan
    Yang, Zhijing
    Chen, Badong
    Lin, Zhiping
    [J]. PATTERN RECOGNITION, 2023, 137
  • [29] Semi-supervised Nonnegative Matrix Factorization for Microblog Clustering Based on Term Correlation
    Ma, Huifang
    Jia, Meihuizi
    Shi, Yakai
    Hao, Zhanjun
    [J]. WEB TECHNOLOGIES AND APPLICATIONS, APWEB 2014, 2014, 8709 : 511 - 516
  • [30] Discriminative semi-supervised non-negative matrix factorization for data clustering
    Xing, Zhiwei
    Wen, Meng
    Peng, Jigen
    Feng, Jinqian
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2021, 103