Topic-Constrained Hierarchical Clustering for Document Datasets

Authors
Zhao, Ying [1]
Affiliation
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
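Keywords
Constrained hierarchical clustering; Semi-supervised learning; Criterion functions
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract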
In this paper, we propose topic-constrained hierarchical clustering, which organizes document datasets into hierarchical trees consistent with a given set of topics. The proposed algorithm is based on a constrained agglomerative clustering framework and a semi-supervised criterion function that simultaneously emphasizes the relationship between documents and topics and the relationships among the documents themselves. The experimental evaluation shows that our algorithm outperformed the traditional agglomerative algorithm by 7.8% to 11.4%.
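The abstract does not give the criterion function itself, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes documents and topics are both available as term-weight vectors, and it uses a hypothetical score (`merge_score`, with a hypothetical mixing weight `alpha`) that averages a document-to-document cohesion term with a document-to-topic alignment term. The merge loop is plain greedy agglomeration; the paper's constrained framework may differ substantially.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def centroid(docs, members):
    """Mean vector of the documents whose indices are listed in `members`."""
    dim = len(docs[0])
    return [sum(docs[i][k] for i in members) / len(members) for k in range(dim)]


def merge_score(docs, topics, a, b, alpha=0.5):
    """Hypothetical semi-supervised score for merging clusters `a` and `b`.

    doc_term:   average similarity of member documents to the merged centroid
                (relationship among documents).
    topic_term: similarity of the merged centroid to its closest topic vector
                (relationship between documents and topics).
    `alpha` is an assumed weight, not a parameter taken from the paper.
    """
    merged = a + b
    c = centroid(docs, merged)
    doc_term = sum(cosine(docs[i], c) for i in merged) / len(merged)
    topic_term = max(cosine(c, t) for t in topics)
    return alpha * doc_term + (1 - alpha) * topic_term


def agglomerate(docs, topics, alpha=0.5):
    """Greedy agglomeration: repeatedly merge the best-scoring pair of clusters.

    Returns the merge history as a list of (cluster_a, cluster_b) pairs,
    which together describe a binary tree over the documents.
    """
    clusters = [[i] for i in range(len(docs))]
    merges = []
    while len(clusters) > 1:
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: merge_score(docs, topics, pair[0], pair[1], alpha),
        )
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(a + b)
        merges.append((a, b))
    return merges


# Toy data (made up for illustration): two documents near each of two topics.
toy_docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_topics = [[1.0, 0.0], [0.0, 1.0]]
toy_merges = agglomerate(toy_docs, toy_topics)
print(toy_merges)
```

With the toy data, documents that share a topic are merged before the two topic groups are joined, which is the kind of topic-consistent tree the abstract describes. Setting `alpha` to 1.0 drops the topic term and reduces the score to a purely unsupervised cohesion measure.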
Pages: 181-192
Number of pages: 12