Adherence clustering: an efficient method for mining market-basket clusters

Cited by: 7
Authors
Yun, CH [1 ]
Chuang, KT [1 ]
Chen, MS [1 ]
Affiliation
[1] Natl Taiwan Univ, Dept Elect Engn, Taipei, Taiwan
Keywords
data mining; clustering market-basket data; category-based adherence; k-todes;
DOI
10.1016/j.is.2004.11.008
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812
Abstract
In this paper we explore the efficient clustering of market-basket data. Unlike traditional data, market-basket data are known to be high-dimensional and sparse. Because they do not explicitly consider the taxonomy, most prior efforts on clustering market-basket data can be viewed as dealing only with items at the leaf level of the taxonomy tree. Clustering transactions across different levels of the taxonomy is of great importance both for marketing strategies and for presenting the results of clustering techniques for market-basket data. In view of these features, we devise a novel measurement, called the category-based adherence, and use it to perform the clustering. With this measurement we develop an efficient clustering algorithm, called algorithm k-todes, whose objective is to minimize the category-based adherence. The distance of an item to a given cluster is defined as the number of links between this item and its nearest tode. The category-based adherence of a transaction to a cluster is then defined as the average distance of the items in this transaction to that cluster. A validation model based on information gain is also devised to assess the quality of clustering for market-basket data. Experimental results on both real and synthetic datasets show that, with the taxonomy information, algorithm k-todes significantly outperforms prior work in both execution efficiency and clustering quality as measured by information gain, indicating the usefulness of category-based adherence in market-basket data clustering. (c) 2004 Elsevier B.V. All rights reserved.
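The two definitions in the abstract (item-to-cluster distance and transaction-to-cluster adherence) can be illustrated with a minimal sketch. Several things here are assumptions rather than details taken from the paper: the toy taxonomy in `PARENT`, the representation of a cluster as a set of taxonomy nodes called `todes`, and the reading of "number of links" as the number of edges on the tree path between the item and a tode. The function names are hypothetical, and the sketch does not reproduce algorithm k-todes itself.

```python
# Hypothetical toy taxonomy, stored as child -> parent (the root maps to None).
PARENT = {
    "root": None,
    "drinks": "root",
    "food": "root",
    "beer": "drinks",
    "cola": "drinks",
    "bread": "food",
    "cheese": "food",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def links_between(a, b):
    """Number of edges on the tree path between nodes a and b (assumed reading of 'links')."""
    depth_from_a = {n: d for d, n in enumerate(path_to_root(a))}
    for depth_from_b, n in enumerate(path_to_root(b)):
        if n in depth_from_a:  # first common ancestor on b's way up
            return depth_from_a[n] + depth_from_b
    raise ValueError("nodes are not in the same tree")


def item_distance(item, todes):
    """Distance of an item to a cluster: links to its nearest tode."""
    return min(links_between(item, t) for t in todes)


def adherence(transaction, todes):
    """Category-based adherence: average item distance over the transaction."""
    return sum(item_distance(i, todes) for i in transaction) / len(transaction)


def assign(transaction, clusters):
    """Pick the cluster (by name) with the smallest adherence for this transaction."""
    return min(clusters, key=lambda name: adherence(transaction, clusters[name]))
```

Under these assumptions, the transaction `["beer", "cola", "bread"]` has adherence 5/3 to a cluster whose only tode is `"drinks"` (distances 1, 1, and 3), and `assign` would send `["bread", "cheese"]` to a cluster whose tode is `"food"` rather than one whose tode is `"drinks"`.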
Pages: 170-186
Number of pages: 17