Mining fuzzy frequent itemsets for hierarchical document clustering

Cited by: 34
Authors
Chen, Chun-Ling [1]
Tseng, Frank S. C. [2]
Liang, Tyne [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu 300, Taiwan
[2] Natl Kaohsiung First Univ Sci & Technol, Dept Informat Management, Yenchao 824, Kaohsiung, Taiwan
Keywords
Fuzzy association rule mining; Text mining; Hierarchical document clustering; Frequent itemsets
DOI
10.1016/j.ipm.2009.09.009
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As the number of text documents on the Internet grows explosively, hierarchical document clustering has proven useful for grouping similar documents in a wide range of applications. However, most document clustering methods still struggle with high dimensionality, scalability, accuracy, and meaningful cluster labels. In this paper, we present an effective Fuzzy Frequent Itemset-Based Hierarchical Clustering (F²IHC) approach, which uses a fuzzy association rule mining algorithm to improve the clustering accuracy of the Frequent Itemset-Based Hierarchical Clustering (FIHC) method. In our approach, key terms are extracted from the document set, and each document is pre-processed into the designated representation for the subsequent mining process. Then, a fuzzy association rule mining algorithm for text is employed to discover a set of highly related fuzzy frequent itemsets, which contain key terms to be regarded as the labels of the candidate clusters. Finally, the documents are clustered into a hierarchical cluster tree by referring to these candidate clusters. We have conducted experiments to evaluate performance on the Classic4, Hitech, Re0, Reuters, and Wap datasets. The experimental results show that our approach not only retains the merits of FIHC but also improves its clustering accuracy. Crown Copyright (C) 2009 Published by Elsevier Ltd. All rights reserved.
Pages: 193-211
Number of pages: 19
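
The mining step outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the three fuzzy regions (Low, Mid, High), the breakpoints 1, 3 and 5, the min-based fuzzy support, and all function names are assumptions chosen for illustration only.

```python
from itertools import combinations

def memberships(freq, low=1.0, mid=3.0, high=5.0):
    """Map a positive term frequency to degrees in three hypothetical fuzzy regions."""
    def tri(x, a, b, c):
        # Triangular membership peaking at b, zero outside (a, c).
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return {
        "Low": 1.0 if freq <= low else max(0.0, (mid - freq) / (mid - low)),
        "Mid": tri(freq, low, mid, high),
        "High": 1.0 if freq >= high else max(0.0, (freq - mid) / (high - mid)),
    }

def fuzzify(doc):
    """Turn {term: frequency} into {(term, region): degree}, dropping zero degrees."""
    return {(term, region): degree
            for term, freq in doc.items()
            for region, degree in memberships(freq).items()
            if degree > 0}

def fuzzy_support(itemset, fuzzy_docs):
    """Average over documents of the minimum membership degree in the itemset."""
    total = sum(min(fd.get(item, 0.0) for item in itemset) for fd in fuzzy_docs)
    return total / len(fuzzy_docs)

def frequent_fuzzy_itemsets(docs, min_support, max_size=2):
    """Return {itemset: support} for itemsets whose fuzzy support meets min_support."""
    fuzzy_docs = [fuzzify(d) for d in docs]
    items = sorted({item for fd in fuzzy_docs for item in fd})
    # Only frequent single items can appear in a larger frequent itemset,
    # because taking the minimum can never raise the support.
    frequent_items = [i for i in items
                      if fuzzy_support((i,), fuzzy_docs) >= min_support]
    result = {(i,): fuzzy_support((i,), fuzzy_docs) for i in frequent_items}
    for size in range(2, max_size + 1):
        for candidate in combinations(frequent_items, size):
            # Skip itemsets that use the same term in two different regions.
            if len({term for term, _ in candidate}) < size:
                continue
            support = fuzzy_support(candidate, fuzzy_docs)
            if support >= min_support:
                result[candidate] = support
    return result
```

In this sketch, each returned itemset (for example, a pair of key terms that are both in the High region) would play the role of a candidate cluster label; assigning documents to candidate clusters and building the cluster tree are not covered here.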