Semi-supervised concept factorization for document clustering

Cited by: 45
Authors
Lu, Mei [1, 2]
Zhao, Xiang-Jun [2]
Zhang, Li [1]
Li, Fan-Zhang [1]
Affiliations
[1] Suzhou Univ, Coll Comp Sci & Technol, Suzhou 215006, Jiangsu, Peoples R China
[2] Jiangsu Normal Univ, Coll Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Concept factorization; Locally consistent concept factorization; Semi-supervised document clustering; Nonnegative matrix factorization
DOI
10.1016/j.ins.2015.10.038
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Nonnegative Matrix Factorization (NMF) and Concept Factorization (CF) are two popular methods for finding a low-rank approximation of a nonnegative matrix. Unlike NMF, CF can be applied not only to matrices containing negative values but also in kernel space. Many methods built on NMF and CF, such as Graph regularized Nonnegative Matrix Factorization (GNMF) and Locally Consistent Concept Factorization (LCCF), can significantly improve clustering performance. Unfortunately, these are unsupervised learning methods. To enhance clustering performance with supervisory information, this paper proposes Semi-Supervised Concept Factorization (SSCF), which incorporates pairwise constraints into CF as reward and penalty terms. These terms can guarantee that data points belonging to the same cluster in the original space remain in the same cluster in the transformed space. In comparisons with state-of-the-art algorithms (KM, NMF, CF, LCCF, GNMF, PCCF), experimental results on document clustering show that the proposed algorithm performs better in terms of accuracy and mutual information. (C) 2015 Elsevier Inc. All rights reserved.
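This record does not give the SSCF objective, so the following is only a minimal sketch of the plain CF baseline that the abstract builds on, not the paper's method. It assumes a nonnegative data matrix X (features x documents), a linear kernel K = X^T X, and the standard multiplicative updates for the approximation X ≈ X W V^T. The function names and parameters (`concept_factorization`, `cf_objective`, `n_iter`, `eps`) are hypothetical, and the pairwise reward and penalty terms are deliberately left out.

```python
import numpy as np

def concept_factorization(X, k, n_iter=200, eps=1e-10, seed=0):
    """Plain CF sketch: approximate X by X @ W @ V.T with W, V >= 0.

    X is (features x samples) and assumed nonnegative, so the linear
    kernel K = X.T @ X is nonnegative and the updates keep W, V >= 0.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    K = X.T @ X                      # linear kernel between samples
    W = rng.random((n, k))           # samples -> concept weights
    V = rng.random((n, k))           # samples -> cluster memberships
    for _ in range(n_iter):
        # Standard multiplicative updates; eps guards against division by zero.
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W) / (V @ (W.T @ K @ W) + eps)
    return W, V

def cf_objective(X, W, V):
    """Squared Frobenius reconstruction error ||X - X W V^T||^2."""
    return np.linalg.norm(X - X @ W @ V.T) ** 2

# Usage: cluster 30 synthetic "documents" into 3 groups.
X = np.random.default_rng(1).random((20, 30))
W, V = concept_factorization(X, k=3)
labels = V.argmax(axis=1)            # hard cluster assignment per document
```

A semi-supervised variant would add terms to this objective that depend on must-link and cannot-link pairs; their exact form should be taken from the paper itself.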
Pages: 86-98
Number of pages: 13