New algorithms for finding approximate frequent item sets

Cited by: 7

Authors:

Borgelt, Christian [1]

Braune, Christian [1,2]

Koetter, Tobias [3]

Gruen, Sonja [4,5]

Affiliations:

[1] European Ctr Soft Comp, Mieres 33600, Asturias, Spain

[2] Otto Von Guericke Univ, Dept Comp Sci, D-39106 Magdeburg, Germany

[3] Univ Konstanz, Dept Comp Sci, D-78457 Constance, Germany

[4] RIKEN, Brain Sci Inst, Wako, Saitama 3510198, Japan

[5] Res Ctr Julich, Inst Neurosci & Med INM 6, Julich, Germany

Source:

SOFT COMPUTING | 2012, Vol. 16, No. 5

Keywords:

ASSOCIATION; NOISE;

DOI:

10.1007/s00500-011-0776-2

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In standard frequent item set mining a transaction supports an item set only if all items in the set are present. However, in many cases this is too strict a requirement that can render it impossible to find certain relevant groups of items. By relaxing the support definition, allowing for some items of a given set to be missing from a transaction, this drawback can be amended. The resulting item sets have been called approximate, fault-tolerant or fuzzy item sets. In this paper we present two new algorithms to find such item sets: the first is an extension of item set mining based on cover similarities and computes and evaluates the subset size occurrence distribution with a scheme that is related to the Eclat algorithm. The second employs a clustering-like approach, in which the distances are derived from the item covers with distance measures for sets or binary vectors and which is initialized with a one-dimensional Sammon projection of the distance matrix. We demonstrate the benefits of our algorithms by applying them to a concept detection task on the 2008/2009 Wikipedia Selection for schools and to the neurobiological task of detecting neuron ensembles in (simulated) parallel spike trains.
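
The relaxed support definition described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it is a brute-force illustration under stated assumptions. The function names, the `max_missing` parameter, and the toy transactions are hypothetical, and the Jaccard distance is only one possible choice of a distance measure for sets derived from item covers.

```python
def strict_support(itemset, transactions):
    """Standard support: count transactions containing every item of the set."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def tolerant_support(itemset, transactions, max_missing=1):
    """Relaxed support (assumed form): a transaction counts if it lacks
    at most `max_missing` items of the set."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if len(items - t) <= max_missing)


def cover(item, transactions):
    """Cover of an item: indices of the transactions that contain it."""
    return {i for i, t in enumerate(transactions) if item in t}


def jaccard_distance(item_a, item_b, transactions):
    """Jaccard distance between two item covers (assumed distance choice)."""
    cover_a = cover(item_a, transactions)
    cover_b = cover(item_b, transactions)
    union = cover_a | cover_b
    if not union:
        return 0.0  # neither item occurs anywhere; treat as identical
    return 1.0 - len(cover_a & cover_b) / len(union)


# Hypothetical toy data: each transaction is a set of single-letter items.
transactions = [
    frozenset("abc"),
    frozenset("abd"),
    frozenset("acd"),
    frozenset("bcd"),
    frozenset("ab"),
]

print(strict_support("abc", transactions))
print(tolerant_support("abc", transactions, max_missing=1))
print(jaccard_distance("a", "b", transactions))
```

On this toy data only one transaction contains all of `a`, `b`, and `c`, whereas every transaction misses at most one of them, which shows how the strict requirement can hide a group of items that the relaxed definition recovers.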


Pages: 903-917

Number of pages: 15
