Topic-Constrained Hierarchical Clustering for Document Datasets

Authors
Zhao, Ying [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Keywords
Constrained hierarchical clustering; Semi-supervised learning; Criterion functions
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose topic-constrained hierarchical clustering, which organizes document datasets into hierarchical trees consistent with a given set of topics. The proposed algorithm is based on a constrained agglomerative clustering framework and a semi-supervised criterion function that simultaneously emphasizes the relationship between documents and topics and the relationships among the documents themselves. The experimental evaluation shows that our algorithm outperformed the traditional agglomerative algorithm by 7.8% to 11.4%.
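The abstract does not give the criterion function or the constraint mechanism, so the following is only a minimal illustrative sketch of the general idea, not the paper's algorithm. It assumes documents and topics are dense vectors, uses cosine similarity, and scores each candidate merge with a hypothetical weighted sum (weight `alpha`) of document-to-centroid cohesion and the merged centroid's similarity to its closest topic. All function names and the weighting scheme are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def merge_score(a, b, docs, topics, alpha):
    """Hypothetical semi-supervised score for merging clusters a and b.

    Assumption: blend (1) average document-to-centroid similarity with
    (2) similarity of the merged centroid to its best-matching topic.
    """
    merged = [docs[i] for i in a + b]
    c = centroid(merged)
    cohesion = sum(cosine(d, c) for d in merged) / len(merged)
    topic_fit = max(cosine(c, t) for t in topics)
    return alpha * cohesion + (1 - alpha) * topic_fit

def topic_guided_agglomerative(docs, topics, alpha=0.5):
    """Greedy agglomerative clustering; returns the merge sequence (the tree)."""
    clusters = [[i] for i in range(len(docs))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Exhaustively score every pair of current clusters.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = merge_score(clusters[x], clusters[y], docs, topics, alpha)
                if best is None or s > best[0]:
                    best = (s, x, y)
        _, x, y = best
        merges.append((tuple(clusters[x]), tuple(clusters[y])))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return merges

# Toy data: two documents near each of two topic directions.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
topics = [[1.0, 0.0], [0.0, 1.0]]
merges = topic_guided_agglomerative(docs, topics)
```

In this toy run, the documents close to the same topic are merged before the two topic groups are joined at the root. The pairwise search is cubic in the number of documents overall, which is acceptable for a sketch but not for large collections.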
Pages: 181-192
Number of pages: 12