Intention-guided deep semi-supervised document clustering via metric learning

Cited by: 2
Authors
Li, Jingnan [1, 2]
Lin, Chuan [1, 2, 3]
Huang, Ruizhang [1, 2]
Qin, Yongbin [1, 2]
Chen, Yanping [1, 2]
Affiliations
[1] Guizhou Univ, State Key Lab Publ Big Data, Guiyang 550025, Peoples R China
[2] Guizhou Univ, Coll Comp Sci & Technol, Guiyang 550025, Peoples R China
[3] Guizhou Univ, Guiyang 550025, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Intention; Semi-supervised; Clustering; Metric learning; Networks
DOI
10.1016/j.jksuci.2022.12.010
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
An intention expresses a user's preference for how a document collection should be structurally divided. Intention-guided document structure division is an important task in text mining, and deep semi-supervised document clustering offers a promising approach to this kind of personalized clustering. However, traditional deep semi-supervised clustering models are limited by the small number of available constraints, which is insufficient for intention-guided document clustering. Moreover, document representations normally carry different emphases that reflect different structural opinions. In this paper, we propose an intention-guided deep semi-supervised document clustering model, IGSC, which divides the document structure based on a small amount of user-provided supervision. IGSC uses a deep metric learning network to address these problems. The deep metric learner explores the user's global intention and outputs an intention matrix. The intention is learned from a small number of user-provided pairwise constraints and is used to guide representation learning. IGSC also uses the intention matrix to guide the clustering process, so that the clustering results best match the user's intention. We compare IGSC with a number of document clustering models on four real-world text datasets: Reu-10k, BBC, ACM, and Abstract. The results show that IGSC clearly improves clustering performance, outperforming the best benchmark result by 7% on average. The comparison with other models and the visualization results demonstrate that IGSC is effective.
© 2022 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
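The abstract does not give the network architecture or loss used by IGSC. As a rough illustration of the general idea it describes (learning a metric from a few pairwise constraints and using the resulting matrix to measure document similarity), the sketch below learns a linear map with a simple contrastive loss. Everything here is an assumption made for illustration: the names `learn_metric` and `metric_distance`, the linear map in place of a deep network, the contrastive loss, the hyperparameters, and the toy data. It is not the authors' method.

```python
import numpy as np

def learn_metric(X, must_link, cannot_link, margin=1.0, lr=0.05, epochs=200):
    """Learn a linear map L from pairwise constraints and return M = L.T @ L.

    Illustrative contrastive objective (an assumption, not the paper's loss):
      must-link pairs:   minimise d^2
      cannot-link pairs: minimise max(0, margin - d)^2
    where d = ||L (x_i - x_j)||.
    """
    dim = X.shape[1]
    L = np.eye(dim)
    n_pairs = max(1, len(must_link) + len(cannot_link))
    for _ in range(epochs):
        grad = np.zeros_like(L)
        for i, j in must_link:
            diff = X[i] - X[j]
            # Gradient of ||L diff||^2 with respect to L.
            grad += 2.0 * np.outer(L @ diff, diff)
        for i, j in cannot_link:
            diff = X[i] - X[j]
            z = L @ diff
            d = np.linalg.norm(z)
            if 1e-12 < d < margin:
                # Gradient of (margin - d)^2 with respect to L.
                grad += -2.0 * (margin - d) / d * np.outer(z, diff)
        L -= lr * grad / n_pairs
    return L.T @ L

def metric_distance(M, a, b):
    """Distance between vectors a and b under the learned matrix M."""
    diff = a - b
    return float(np.sqrt(diff @ M @ diff))

# Toy data (hypothetical): the user cares about feature 0 and ignores feature 1.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
must_link = [(0, 1), (2, 3)]    # same value on feature 0
cannot_link = [(0, 2), (1, 3)]  # different value on feature 0
M = learn_metric(X, must_link, cannot_link, margin=2.0)
```

In this toy setting the learned matrix weights feature 0 more heavily than feature 1, so must-link pairs end up closer than cannot-link pairs under `metric_distance`. A clustering step could then use these distances in place of plain Euclidean ones.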
Pages: 416-425
Number of pages: 10
Related papers
50 in total
  • [1] Semi-supervised Clustering with Deep Metric Learning
    Li, Xiaocui
    Yin, Hongzhi
    Zhou, Ke
    Chen, Hongxu
    Sadiq, Shazia
    Zhou, Xiaofang
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2019, 11448 : 383 - 386
  • [2] Semi-supervised clustering with deep metric learning and graph embedding
    Xiaocui Li
    Hongzhi Yin
    Ke Zhou
    Xiaofang Zhou
    [J]. World Wide Web, 2020, 23 : 781 - 798
  • [3] Semi-supervised clustering with deep metric learning and graph embedding
    Li, Xiaocui
    Yin, Hongzhi
    Zhou, Ke
    Zhou, Xiaofang
    [J]. WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2020, 23 (02) : 781 - 798
  • [4] Distance metric learning guided adaptive subspace semi-supervised clustering
    Yin, Xuesong
    Hu, Enliang
    [J]. FRONTIERS OF COMPUTER SCIENCE IN CHINA, 2011, 5 (01) : 100 - 108
  • [5] Distance metric learning guided adaptive subspace semi-supervised clustering
    Xuesong Yin
    Enliang Hu
    [J]. Frontiers of Computer Science in China, 2011, 5 : 100 - 108
  • [6] Semi-supervised document clustering via active learning with pairwise constraints
    Huang, Ruizhang
    Lam, Wai
    [J]. ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007 : 517 - 522
  • [7] Semi-Supervised Metric Learning: A Deep Resurrection
    Dutta, Ujjal Kr
    Harandi, Mehrtash
    Sekhar, Chellu Chandra
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 7279 - 7287
  • [8] DEEP SEMI-SUPERVISED METRIC LEARNING VIA IDENTIFICATION OF MANIFOLD MEMBERSHIPS
    Zhuang, Furen
    Moulin, Pierre
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021 : 1755 - 1759
  • [9] Semi-supervised metric learning via topology preserving multiple semi-supervised assumptions
    Wang, Qianying
    Yuen, Pong C.
    Feng, Guocan
    [J]. PATTERN RECOGNITION, 2013, 46 (09) : 2576 - 2587
  • [10] Metric learning by similarity network for deep semi-supervised learning
    Wu, Sanyou
    Feng, Xingdong
    Zhou, Fan
    [J]. DEVELOPMENTS OF ARTIFICIAL INTELLIGENCE TECHNOLOGIES IN COMPUTATION AND ROBOTICS, 2020, 12 : 995 - 1002