Topic-Constrained Hierarchical Clustering for Document Datasets

Cited by: 0
Authors
Zhao, Ying [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Keywords
Constrained hierarchical clustering; Semi-supervised learning; Criterion functions;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose topic-constrained hierarchical clustering, which organizes document datasets into hierarchical trees consistent with a given set of topics. The proposed algorithm is based on a constrained agglomerative clustering framework and a semi-supervised criterion function that simultaneously emphasizes the relationship between documents and topics and the relationships among the documents themselves. The experimental evaluation shows that our algorithm outperforms the traditional agglomerative algorithm by 7.8% to 11.4%.
Pages: 181 - 192
Number of pages: 12
Related papers
50 in total
  • [31] An Efficient Hybrid Hierarchical Document Clustering Method
    Zhu, Yehang
    Fung, Benjamin C. M.
    Mu, Dejun
    Li, Yanling
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2008, : 395 - +
  • [32] Hierarchical model-based clustering for large datasets
    Posse, C
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2001, 10 (03) : 464 - 486
  • [33] Hierarchical document clustering using frequent itemsets
    Fung, BCM
    Wang, K
    Ester, M
    PROCEEDINGS OF THE THIRD SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2003, : 59 - 70
  • [34] Hierarchical document clustering using local patterns
    Hassan H. Malik
    John R. Kender
    Dmitriy Fradkin
    Fabian Moerchen
    Data Mining and Knowledge Discovery, 2010, 21 : 153 - 185
  • [35] A hierarchical consensus architecture for robust document clustering
    Sevillano, Xavier
    Cobo, German
    Alias, Francese
    Socoro, Joan Claudi
    ADVANCES IN INFORMATION RETRIEVAL, 2007, 4425 : 741 - +
  • [36] Effective data summarization for hierarchical clustering in large datasets
    Patra, Bidyut Kr.
    Nandi, Sukumar
    KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 42 (01) : 1 - 20
  • [37] Efficient Hierarchical Clustering of Large High Dimensional Datasets
    Gilpin, Sean
    Qian, Buyue
    Davidson, Ian
    PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013, : 1371 - 1380
  • [38] Hierarchical Aggregation Approach for Distributed clustering of spatial datasets
    Bendechache, Malika
    Le-Khac, Nhien-An
    Kechadi, M-Tahar
    2016 IEEE 16TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2016, : 1098 - 1103
  • [39] DHC: A Distributed Hierarchical Clustering Algorithm for Large Datasets
    Zhang, Wei
    Zhang, Gongxuan
    Chen, Xiaohui
    Liu, Yueqi
    Zhou, Xiumin
    Zhou, Junlong
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2019, 28 (04)
  • [40] A hierarchical topic modelling approach for short text clustering
    Pradhan R.
    Sharma D.K.
    International Journal of Information and Communication Technology, 2022, 20 (04) : 463 - 481