Beyond Itemsets: Mining Frequent Featuresets over Structured Items

Cited by: 1
Authors
Thirumuruganathan, Saravanan [1]
Rahman, Habibur [1]
Abbar, Sofiane [2]
Das, Gautam [1]
Affiliations
[1] Univ Texas Arlington, Arlington, TX 76102 USA
[2] Qatar Comp Res Inst, Ar Rayyan, Qatar
Source
Proceedings of the VLDB Endowment | 2014 / Vol. 8 / No. 3
DOI
10.14778/2735508.2735515
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We assume a dataset of transactions generated by a set of users over structured items, where each item can be described by a set of features. In this paper, we are interested in identifying frequent featuresets (sets of features) by mining item transactions. For example, on a news website, items correspond to news articles, the features are the named entities/topics in the articles, and an item transaction is the set of news articles read by a user within the same session. We show that mining frequent featuresets over structured item transactions is a novel problem and that straightforward extensions of existing frequent itemset mining techniques provide unsatisfactory results. This is because, while users are drawn to each item in a transaction by a subset of its features, the transaction by itself provides no information about these underlying preferred features. To overcome this hurdle, we propose a featureset uncertainty model in which each item transaction could have been generated by various featuresets with different probabilities. We describe a novel approach that transforms item transactions into uncertain transactions over featuresets and estimates their probabilities using a constrained least squares approach. We propose diverse algorithms to mine frequent featuresets. Our experimental evaluation provides a comparative analysis of the different approaches proposed.
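To make the problem setting concrete, the following minimal Python sketch shows one plausible reading of a "straightforward extension" of frequent itemset mining: each item transaction is flattened into the union of its items' features, and featureset support is then counted as in ordinary itemset mining. This is an illustrative assumption, not the method proposed in the paper; the toy data, the function name `naive_frequent_featuresets`, and the `max_size` cap are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each item (news article) maps to its features
# (named entities/topics). Not taken from the paper.
ITEM_FEATURES = {
    "a1": {"election", "economy"},
    "a2": {"election", "sports"},
    "a3": {"economy", "sports"},
}

# Hypothetical item transactions: articles read within one session.
TRANSACTIONS = [
    ["a1", "a2"],
    ["a1", "a3"],
    ["a2"],
]


def naive_frequent_featuresets(transactions, item_features, min_support, max_size=2):
    """Count featureset support after flattening each item transaction
    into the union of the features of its items (assumed naive baseline)."""
    counts = Counter()
    for items in transactions:
        # Every feature of every item is treated as equally relevant.
        features = set().union(*(item_features[i] for i in items))
        for k in range(1, max_size + 1):
            for featureset in combinations(sorted(features), k):
                counts[frozenset(featureset)] += 1
    return {fs: c for fs, c in counts.items() if c >= min_support}


frequent = naive_frequent_featuresets(TRANSACTIONS, ITEM_FEATURES, min_support=3)
for featureset, support in frequent.items():
    print(sorted(featureset), support)
```

Under this flattening, a feature counts toward support whenever it appears on any item in the session, whether or not it is what drew the user to that item. That missing information about preferred features is the gap the abstract's featureset uncertainty model is meant to address.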
Pages: 257-268
Page count: 12