A More Accurate Space Saving Algorithm for Finding the Frequent Items

Cited by: 0
Authors
Zhou Jun [1]
Chen Ming [1]
Xiong Huan [2]
Affiliations
[1] PLAUST, Inst Command Automat, Dept Comp Sci, Nanjing, Peoples R China
[2] China Elect Syst Engn Res Inst, Dept Comp Network, Beijing, Peoples R China
Keywords
component; data stream; frequent items; LRU; NetFlow; anomaly detection;
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The frequent items problem is to process a stream of items and find all items that occur more than a given fraction of the time. It is one of the most heavily studied problems in data stream mining, dating back to the 1980s. To address the high false positive rate of the Space-Saving algorithm, an improved algorithm based on LRU (Least Recently Used) that pre-eliminates low-frequency items is proposed. The accuracy, stability, and adaptability of the improved algorithm are markedly enhanced. Experimental results indicate that the algorithm can be used not only to find the frequent items but also to estimate their frequencies precisely. The improved algorithm can be used for online processing of both high-speed network packet streams and backbone NetFlow streams.
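For orientation, the baseline that the abstract sets out to improve can be sketched as follows. This is a minimal sketch of the standard Space-Saving algorithm only, not the LRU-based variant proposed in the paper, whose details the abstract does not give. The names `space_saving` and `frequent_items`, the counter budget `k`, and the threshold fraction `phi` are assumptions chosen for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def space_saving(stream: Iterable[Hashable], k: int) -> Dict[Hashable, Tuple[int, int]]:
    """Baseline Space-Saving summary using at most k counters.

    Returns a mapping item -> (estimated_count, max_overestimation).
    This is the standard algorithm, not the paper's LRU-based variant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    counters: Dict[Hashable, List[int]] = {}  # item -> [count, error]
    for item in stream:
        if item in counters:
            # Monitored item: just increment its counter.
            counters[item][0] += 1
        elif len(counters) < k:
            # Free counter available: start monitoring the item.
            counters[item] = [1, 0]
        else:
            # No free counter: evict the item with the smallest count.
            # The newcomer inherits that count, which may overestimate it.
            # (A linear scan keeps the sketch short; it costs O(k) per miss.)
            victim = min(counters, key=lambda x: counters[x][0])
            min_count = counters.pop(victim)[0]
            counters[item] = [min_count + 1, min_count]
    return {item: (count, err) for item, (count, err) in counters.items()}


def frequent_items(stream: Iterable[Hashable], k: int, phi: float) -> Dict[Hashable, int]:
    """Report items whose estimated count exceeds phi * (stream length)."""
    items = list(stream)  # materialised only to know the length in this sketch
    threshold = phi * len(items)
    summary = space_saving(items, k)
    return {item: est for item, (est, _err) in summary.items() if est > threshold}


if __name__ == "__main__":
    demo = ["a"] * 6 + ["b"] * 3 + ["c"]
    print(space_saving(demo, 2))         # {'a': (6, 0), 'c': (4, 3)}
    print(frequent_items(demo, 2, 0.3))  # {'a': 6, 'c': 4}
```

In the demo, `c` occurs only once but inherits the evicted counter of `b`, so its estimate is 4 and it is reported at `phi = 0.3`. This is the kind of false positive that the abstract attributes to Space-Saving.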
Pages: 5