Optimal Matrix Sketching over Sliding Windows

Cited by: 0
Authors
Yin, Hanyan [1]
Wen, Dongxie [1]
Li, Jiajun [1]
Wei, Zhewei [1]
Zhang, Xiao [1]
Huang, Zengfeng [2]
Li, Feifei [3]
Affiliations
[1] Renmin Univ China, Beijing, Peoples R China
[2] Fudan Univ, Shanghai, Peoples R China
[3] Alibaba Grp, Hangzhou, Peoples R China
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2024 / Volume 17 / Issue 09
Funding
National Natural Science Foundation of China;
Keywords
FREQUENT DIRECTIONS; PCA;
DOI
10.14778/3665844.3665847
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Matrix sketching aims to approximate a matrix $A \in \mathbb{R}^{N \times d}$, consisting of a vector stream of length $N$, with a smaller sketching matrix $B \in \mathbb{R}^{\ell \times d}$, $\ell \ll N$, with covariance error $\varepsilon = \|A^\top A - B^\top B\|_2 / \|A\|_F^2$. The matrix sketching problem becomes particularly interesting in the context of sliding windows, where the goal is to approximate the matrix $A_W$, formed by input vectors over the most recent $N$ time units. However, despite recent efforts, whether achieving the optimal $O(d/\varepsilon)$ space bound on sliding windows is possible has remained an open question. In this paper, we introduce the DS-FD algorithm, which achieves the optimal $O(d/\varepsilon)$ space bound for matrix sketching over row-normalized, sequence-based sliding windows. We also present matching upper and lower space bounds for time-based and unnormalized sliding windows, demonstrating the generality and optimality of DS-FD across various sliding window models. This conclusively answers the open question regarding the optimal space bound for matrix sketching over sliding windows. We conduct extensive experiments with both synthetic and real-world datasets, validating our theoretical claims and thus confirming the correctness and effectiveness of our algorithm, both theoretically and empirically.
Pages: 2149 - 2161
Page count: 13