Adherence clustering: an efficient method for mining market-basket clusters

Cited by: 7
Authors
Yun, CH [1]
Chuang, KT [1]
Chen, MS [1]
Affiliations
[1] Natl Taiwan Univ, Dept Elect Engn, Taipei, Taiwan
Keywords
data mining; clustering market-basket data; category-based adherence; k-todes
DOI
10.1016/j.is.2004.11.008
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We explore in this paper the efficient clustering of market-basket data. Unlike traditional data, market-basket data are known to be high-dimensional and sparse. Because they do not explicitly consider the taxonomy, most prior efforts on clustering market-basket data can be viewed as dealing only with items at the leaf level of the taxonomy tree. Clustering transactions across different levels of the taxonomy is of great importance for marketing strategies as well as for representing the results of clustering techniques for market-basket data. In view of these features, we devise a novel measurement, called the category-based adherence, and use it to perform the clustering. With this measurement, we develop an efficient clustering algorithm for market-basket data, called algorithm k-todes, whose objective is to minimize the category-based adherence. The distance of an item to a given cluster is defined as the number of links between this item and its nearest tode. The category-based adherence of a transaction to a cluster is then defined as the average distance of the items in this transaction to that cluster. A validation model based on information gain is also devised to assess the quality of clustering for market-basket data. Experimental results on both real and synthetic datasets show that, with the taxonomy information, algorithm k-todes significantly outperforms prior works in both execution efficiency and clustering quality as measured by information gain, indicating the usefulness of category-based adherence in market-basket data clustering. (c) 2004 Elsevier B.V. All rights reserved.
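The two definitions in the abstract (item-to-cluster distance and transaction-to-cluster adherence) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a cluster is represented by a set of taxonomy nodes (its todes) and that "number of links" means the number of edges on the taxonomy-tree path between an item and a tode. The toy taxonomy `PARENT` and the helper names `links_between`, `item_distance`, `adherence`, and `assign` are hypothetical and do not come from the paper.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical toy taxonomy, stored as child -> parent (the root maps to None).
PARENT: Dict[str, Optional[str]] = {
    "all": None,
    "drinks": "all",
    "snacks": "all",
    "cola": "drinks",
    "juice": "drinks",
    "chips": "snacks",
}


def path_to_root(node: str) -> List[str]:
    """Return the list [node, parent, ..., root]."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def links_between(a: str, b: str) -> int:
    """Number of taxonomy edges on the tree path between nodes a and b."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a))}
    for j, n in enumerate(path_to_root(b)):
        if n in steps_from_a:  # first common ancestor on b's way up
            return steps_from_a[n] + j
    raise ValueError("nodes are not in the same taxonomy tree")


def item_distance(item: str, todes: Set[str]) -> int:
    """Distance of an item to a cluster: links to its nearest tode (assumed reading)."""
    return min(links_between(item, t) for t in todes)


def adherence(transaction: Iterable[str], todes: Set[str]) -> float:
    """Average item distance over the items of a transaction."""
    items = list(transaction)
    return sum(item_distance(i, todes) for i in items) / len(items)


def assign(transaction: Iterable[str], clusters: List[Set[str]]) -> int:
    """Index of the cluster with the smallest adherence for this transaction."""
    items = list(transaction)
    return min(range(len(clusters)), key=lambda c: adherence(items, clusters[c]))
```

With the single tode `{"drinks"}`, the transaction `["cola", "chips"]` has item distances 1 and 3 in this toy tree, so its adherence is 2.0. A smaller value means the transaction fits the cluster more closely, which is why the sketch assigns each transaction to the cluster that minimizes it.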
Pages: 170-186
Number of pages: 17