Frequent Item Mining When Obtaining Support Is Costly

Authors
Lin, Joe Wing-Ho [1]
Wong, Raymond Chi-Wing [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Kowloon, Hong Kong, Peoples R China
Source
BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, DAWAK 2019 | 2019 / Vol. 11708
Keywords
Frequent item mining; Random sampling
DOI
10.1007/978-3-030-27520-4_4
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Suppose there are n users and m items, and the preference of each user for the items is revealed only upon probing, which takes time and is therefore costly. How can we quickly discover all the frequent items that are favored individually by at least a given number of users? This new problem not only has strong connections with several well-known problems, such as the frequent item mining problem, but also finds applications in fields such as sponsored search and marketing surveys. Unlike traditional frequent item mining, however, our problem assumes no prior knowledge of users' preferences, and thus obtaining the support of an item becomes costly. Although our problem can be settled naively by probing the preferences of all n users, the number of users is typically enormous, and each probe can itself incur a prohibitive cost. We present a sampling algorithm that drastically reduces the number of users that need to be probed to O(log m), regardless of the number of users, as long as slight inaccuracy in the output is permitted. For reasonably sized input, our algorithm needs to probe only 0.5% of the users, whereas the naive approach needs to probe all of them.
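The abstract does not spell out the algorithm, so the following is only a generic illustration of why a sample of size O(log m) can suffice, not the authors' method. It assumes a textbook Hoeffding bound combined with a union bound over the m items. All names (`probe`, `theta`, `eps`, `delta`, `sample_size`, `frequent_items_by_sampling`) are hypothetical, and the support threshold is expressed here as a fraction of users rather than an absolute count.

```python
import math
import random


def sample_size(m, eps, delta):
    """Users to probe so that, with probability >= 1 - delta, every one of
    the m estimated support fractions is within eps of its true value.
    Derived from Hoeffding's inequality plus a union bound (an assumption
    of this sketch); note the result does not depend on n."""
    return math.ceil(math.log(2 * m / delta) / (2 * eps ** 2))


def frequent_items_by_sampling(probe, n, m, theta, eps, delta, rng=None):
    """Approximate frequent-item discovery by probing a random sample.

    probe(u) is a hypothetical costly call returning the item indices
    (0..m-1) that user u favors. theta is the minimum support as a
    fraction of users; eps and delta are the tolerated error and failure
    probability.
    """
    rng = rng or random.Random()
    k = min(n, sample_size(m, eps, delta))
    counts = [0] * m
    # Probe k distinct users chosen uniformly at random.
    for u in rng.sample(range(n), k):
        for item in probe(u):
            counts[item] += 1
    # Relax the threshold by eps so that truly frequent items are kept
    # whenever all estimates are within eps of the truth.
    return {i for i in range(m) if counts[i] / k >= theta - eps}


if __name__ == "__main__":
    # Toy preferences: item 2 is liked by everyone, item 0 by even-numbered
    # users, item 1 by every tenth user.
    def toy_probe(u):
        liked = [2]
        if u % 2 == 0:
            liked.append(0)
        if u % 10 == 0:
            liked.append(1)
        return liked

    print(frequent_items_by_sampling(toy_probe, n=100_000, m=5,
                                     theta=0.4, eps=0.05, delta=0.01))
```

Under these assumptions, every item whose true support fraction is at least `theta` is reported with probability at least `1 - delta`, and no item below `theta - 2 * eps` is reported. That is one way to read the "slight inaccuracy" the abstract mentions.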
Pages: 37-56
Number of pages: 20