New algorithms for finding approximate frequent item sets

Cited by: 7
Authors
Borgelt, Christian [1]
Braune, Christian [1, 2]
Koetter, Tobias [3]
Gruen, Sonja [4, 5]
Affiliations
[1] European Ctr Soft Comp, Mieres 33600, Asturias, Spain
[2] Otto Von Guericke Univ, Dept Comp Sci, D-39106 Magdeburg, Germany
[3] Univ Konstanz, Dept Comp Sci, D-78457 Constance, Germany
[4] RIKEN, Brain Sci Inst, Wako, Saitama 3510198, Japan
[5] Res Ctr Julich, Inst Neurosci & Med INM 6, Julich, Germany
Keywords
ASSOCIATION; NOISE;
DOI
10.1007/s00500-011-0776-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In standard frequent item set mining a transaction supports an item set only if all items in the set are present. However, in many cases this is too strict a requirement that can render it impossible to find certain relevant groups of items. By relaxing the support definition, allowing for some items of a given set to be missing from a transaction, this drawback can be amended. The resulting item sets have been called approximate, fault-tolerant or fuzzy item sets. In this paper we present two new algorithms to find such item sets: the first is an extension of item set mining based on cover similarities and computes and evaluates the subset size occurrence distribution with a scheme that is related to the Eclat algorithm. The second employs a clustering-like approach, in which the distances are derived from the item covers with distance measures for sets or binary vectors and which is initialized with a one-dimensional Sammon projection of the distance matrix. We demonstrate the benefits of our algorithms by applying them to a concept detection task on the 2008/2009 Wikipedia Selection for schools and to the neurobiological task of detecting neuron ensembles in (simulated) parallel spike trains.
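The relaxed support definition described in the abstract can be illustrated with a minimal sketch. It is a naive full scan over the transactions, not the Eclat-related scheme or the clustering-like approach of the paper. The function names (`subset_size_distribution`, `approximate_support`), the `max_missing` parameter, and the toy transactions are assumptions made for illustration only.

```python
from typing import Iterable, List, Set


def subset_size_distribution(transactions: Iterable[Set[str]],
                             item_set: Set[str]) -> List[int]:
    """Return a list d where d[k] is the number of transactions that
    contain exactly k items of item_set (illustrative, naive scan)."""
    dist = [0] * (len(item_set) + 1)
    for t in transactions:
        dist[len(item_set & t)] += 1
    return dist


def approximate_support(transactions: Iterable[Set[str]],
                        item_set: Set[str],
                        max_missing: int) -> int:
    """Count transactions that lack at most max_missing items of item_set.
    With max_missing = 0 this is the standard (exact) support."""
    dist = subset_size_distribution(transactions, item_set)
    min_present = max(len(item_set) - max_missing, 0)
    return sum(dist[min_present:])


# Hypothetical toy data, not taken from the paper.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c", "d"},
    {"b"},
    {"d"},
]
items = {"a", "b", "c"}

exact = approximate_support(transactions, items, max_missing=0)
relaxed = approximate_support(transactions, items, max_missing=1)
```

In this toy data only one transaction contains all three items, while allowing one missing item raises the count to three, which is the effect the relaxed definition is meant to capture.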
Pages: 903-917
Number of pages: 15
Related papers
50 in total
  • [31] FrogCOL and FrogMIS: new decentralized algorithms for finding large independent sets in graphs
    Christian Blum
    Borja Calvo
    Maria J. Blesa
    Swarm Intelligence, 2015, 9 : 205 - 227
  • [32] FrogCOL and FrogMIS: new decentralized algorithms for finding large independent sets in graphs
    Blum, Christian
    Calvo, Borja
    Blesa, Maria J.
    SWARM INTELLIGENCE, 2015, 9 (2-3) : 205 - 227
  • [33] New FPT Algorithms for Finding the Temporal Hybridization Number for Sets of Phylogenetic Trees
    Borst, Sander
    van Iersel, Leo
    Jones, Mark
    Kelk, Steven
    ALGORITHMICA, 2022, 84 (07) : 2050 - 2087
  • [34] DWMiner: A tool for mining frequent item sets efficiently in data warehouses
    Almentero, Bruno Kinder
    Evsukoff, Alexandre Goncalves
    Mattoso, Marta
    HIGH PERFORMANCE COMPUTING FOR COMPUTATIONAL SCIENCE - VECPAR 2006, 2007, 4395 : 212 - +
  • [35] An Efficient Approach for Mining Frequent Item sets with Transaction Deletion Operation
    Bay Vo
    Thien-Phuong Le
    Tzung-Pei Hong
    Bac Le
    Jung, Jason
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2016, 13 (05) : 595 - 602
  • [36] arules - A computational environment for mining association rules and frequent item sets
    Hahsler, M
    Grün, B
    Hornik, K
    JOURNAL OF STATISTICAL SOFTWARE, 2005, 14 (15):
  • [37] Understanding spatial concentrations of road accidents using frequent item sets
    Geurts, K
    Thomas, I
    Wets, G
    ACCIDENT ANALYSIS AND PREVENTION, 2005, 37 (04) : 787 - 799
  • [38] Finding locally and periodically frequent sets and periodic association rules
    Mahanta, AK
    Mazarbhuiya, FA
    Baruah, HK
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PROCEEDINGS, 2005, 3776 : 576 - 582
  • [39] Frequent Pattern Mining Algorithms for Finding Associated Frequent Patterns for Data Streams: A Survey
    Nasreen, Shamila
    Azam, Muhammad Awais
    Shehzad, Khurram
    Naeem, Usman
    Ghazanfar, Mustansar Ali
    5TH INTERNATIONAL CONFERENCE ON EMERGING UBIQUITOUS SYSTEMS AND PERVASIVE NETWORKS / THE 4TH INTERNATIONAL CONFERENCE ON CURRENT AND FUTURE TRENDS OF INFORMATION AND COMMUNICATION TECHNOLOGIES IN HEALTHCARE / AFFILIATED WORKSHOPS, 2014, 37 : 109 - +
  • [40] Efficient algorithms for association finding and frequent association pattern mining
    Cheng, Gong (gcheng@nju.edu.cn), 1600, Springer Verlag (9981 LNCS):