Semi-supervised concept factorization for document clustering

Cited by: 45
Authors
Lu, Mei [1, 2]
Zhao, Xiang-Jun [2]
Zhang, Li [1]
Li, Fan-Zhang [1]
Affiliations
[1] Suzhou Univ, Coll Comp Sci & Technol, Suzhou 215006, Jiangsu, Peoples R China
[2] Jiangsu Normal Univ, Coll Comp Sci & Technol, Xuzhou 221116, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Concept factorization; Locally consistent concept factorization; Semi-supervised document clustering; Nonnegative matrix factorization
DOI
10.1016/j.ins.2015.10.038
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Nonnegative Matrix Factorization (NMF) and Concept Factorization (CF) are two popular methods for finding a low-rank approximation of a nonnegative matrix. Unlike NMF, CF can be applied not only to matrices containing negative values but also in kernel space. Many methods based on NMF and CF, such as Graph regularized Nonnegative Matrix Factorization (GNMF) and Locally Consistent Concept Factorization (LCCF), can significantly improve clustering performance. Unfortunately, these are unsupervised learning methods. To enhance clustering performance with supervisory information, this paper proposes Semi-Supervised Concept Factorization (SSCF), which incorporates pairwise constraints into CF as reward and penalty terms. These terms guarantee that data points belonging to the same cluster in the original space remain in the same cluster in the transformed space. Compared with state-of-the-art algorithms (KM, NMF, CF, LCCF, GNMF, PCCF), experimental results on document clustering show that the proposed algorithm performs better in terms of accuracy and mutual information. © 2015 Elsevier Inc. All rights reserved.
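For orientation, the unsupervised CF baseline that the abstract builds on can be sketched as follows. This is a minimal illustration, not the SSCF algorithm of the paper: it assumes a nonnegative data matrix (for example tf-idf features) so that the simple multiplicative updates apply, it omits the pairwise reward and penalty terms entirely, and the names `concept_factorization` and `reconstruction_error` are hypothetical.

```python
import numpy as np


def concept_factorization(X, k, n_iter=200, eps=1e-10, seed=0):
    """Plain (unsupervised) CF sketch: X ~= X @ W @ V.T with W, V >= 0.

    X has shape (n_features, n_samples) and is ASSUMED nonnegative here,
    so the Gram matrix K = X.T @ X is nonnegative as well.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[1]
    K = X.T @ X                       # inner products between samples
    W = rng.random((n_samples, k))    # samples -> concept (cluster center) weights
    V = rng.random((n_samples, k))    # samples -> cluster membership weights
    for _ in range(n_iter):
        # Alternating multiplicative updates; eps guards against division by zero.
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W) / (V @ (W.T @ K @ W) + eps)
    return W, V


def reconstruction_error(X, W, V):
    """Frobenius norm of the residual X - X W V^T."""
    return np.linalg.norm(X - X @ W @ V.T)
```

A cluster label for each document can then be read off as `V.argmax(axis=1)`. A semi-supervised variant along the lines described in the abstract would add terms to the objective that reward must-link pairs and penalize cannot-link pairs; their exact form is not given here and is not reproduced in this sketch.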
Pages: 86-98
Number of pages: 13
Related papers
50 in total
  • [31] Solving consensus and semi-supervised clustering problems using nonnegative matrix factorization
    Li, Tao
    Ding, Chris
    Jordan, Michael I.
    [J]. ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007, : 577 - +
  • [32] Semi-supervised Nonnegative Matrix Factorization for Microblog Clustering Based on Term Correlation
    Ma, Huifang
    Jia, Meihuizi
    Shi, Yakai
    Hao, Zhanjun
    [J]. WEB TECHNOLOGIES AND APPLICATIONS, APWEB 2014, 2014, 8709 : 511 - 516
  • [33] Discriminative semi-supervised non-negative matrix factorization for data clustering
    Xing, Zhiwei
    Wen, Meng
    Peng, Jigen
    Feng, Jinqian
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2021, 103
  • [34] Hypergraph based semi-supervised symmetric nonnegative matrix factorization for image clustering
    Yin, Jingxing
    Peng, Siyuan
    Yang, Zhijing
    Chen, Badong
    Lin, Zhiping
    [J]. PATTERN RECOGNITION, 2023, 137
  • [35] Local regularization concept factorization and its semi-supervised extension for image representation
    Shu, Zhenqiu
    Zhao, Chunxia
    Huang, Pu
    [J]. NEUROCOMPUTING, 2015, 158 : 1 - 12
  • [36] Semi-supervised ranking for document retrieval
    Duh, Kevin
    Kirchhoff, Katrin
    [J]. COMPUTER SPEECH AND LANGUAGE, 2011, 25 (02) : 261 - 281
  • [37] Active Learning of Instance-level Constraints for Semi-supervised Document Clustering
    Zhao, Weizhong
    He, Qing
    Ma, Huifang
    Shi, Zhongzhi
    [J]. 2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 1, 2009, : 264 - 268
  • [38] User-interest-based document filtering via semi-supervised clustering
    Tang, N
    Vemuri, VR
    [J]. FOUNDATIONS OF INTELLIGENT SYSTEMS, PROCEEDINGS, 2005, 3488 : 573 - 582
  • [39] Hierarchical Semi-Supervised Factorization for Learning the Semantics
    Shen, Bin
    Makhambetov, Olzhas
    [J]. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2014, 18 (03) : 366 - 374
  • [40] Continuous Semi-Supervised Nonnegative Matrix Factorization
    Lindstrom, Michael R. R.
    Ding, Xiaofu
    Liu, Feng
    Somayajula, Anand
    Needell, Deanna
    [J]. ALGORITHMS, 2023, 16 (04)