A More Accurate Space Saving Algorithm for Finding the Frequent Items

Cited by: 0
Authors
Zhou Jun [1]
Chen Ming [1]
Xiong Huan [2]
Affiliations
[1] PLAUST, Inst Command Automat, Dept Comp Sci, Nanjing, Peoples R China
[2] China Elect Syst Engn Res Inst, Dept Comp Network, Beijing, Peoples R China
Keywords
component; data stream; frequent items; LRU; NetFlow; anomaly detection;
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The frequent items problem is to process a stream of items and find all items that occur more than a given fraction of the time. It is one of the most heavily studied problems in data stream mining, dating back to the 1980s. To address the relatively high false-positive rate of the Space-Saving algorithm, an improved LRU-based (Least Recently Used) algorithm that pre-eliminates low-frequency items is proposed. The accuracy, stability, and adaptability of the improved algorithm are clearly enhanced. Experimental results indicate that the algorithm can be used not only to find the frequent items but also to estimate their frequencies precisely. The improved algorithm can be used for online processing of both high-speed network packet streams and backbone NetFlow streams.
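
For context, the baseline Space-Saving algorithm that the abstract sets out to improve can be sketched as follows. This is a minimal illustration of the standard algorithm only, not the authors' LRU-based variant, whose details the abstract does not give. The names `space_saving`, `frequent_items`, `k`, and `phi` are assumptions chosen for this example.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def space_saving(stream: Iterable[Hashable], k: int) -> Dict[Hashable, Tuple[int, int]]:
    """Baseline Space-Saving summary using at most k counters.

    Returns {item: (count, error)}; count overestimates the true
    frequency of a monitored item by at most error.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    counters: Dict[Hashable, List[int]] = {}  # item -> [count, error]
    for item in stream:
        if item in counters:
            counters[item][0] += 1
        elif len(counters) < k:
            counters[item] = [1, 0]
        else:
            # Table is full: evict the item with the smallest count.
            # The newcomer inherits that count, recorded as its error.
            # (Linear scan for simplicity; this is O(k) per eviction.)
            victim = min(counters, key=lambda x: counters[x][0])
            min_count = counters.pop(victim)[0]
            counters[item] = [min_count + 1, min_count]
    return {item: (c, e) for item, (c, e) in counters.items()}


def frequent_items(stream: Iterable[Hashable], k: int, phi: float) -> Dict[Hashable, Tuple[int, int]]:
    """Report monitored items whose estimated count exceeds phi * stream length."""
    items = list(stream)  # materialized here only to know the length
    threshold = phi * len(items)
    summary = space_saving(items, k)
    return {item: ce for item, ce in summary.items() if ce[0] > threshold}


if __name__ == "__main__":
    stream = ["a"] * 6 + ["b"] * 3 + ["c"]
    print(space_saving(stream, k=2))
    print(frequent_items(stream, k=2, phi=0.3))
```

In this toy stream with `k=2`, the single late arrival `c` evicts `b` and inherits its count, so `c` ends with an estimated count of 4 although it occurred once. With `phi=0.3` it is reported alongside `a`, which illustrates the kind of false positive the abstract refers to.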
Pages: 5