Optimal Matrix Sketching over Sliding Windows

Cited by: 0
Authors
Yin, Hanyan [1]
Wen, Dongxie [1]
Li, Jiajun [1]
Wei, Zhewei [1]
Zhang, Xiao [1]
Huang, Zengfeng [2]
Li, Feifei [3]
Affiliations
[1] Renmin Univ China, Beijing, Peoples R China
[2] Fudan Univ, Shanghai, Peoples R China
[3] Alibaba Grp, Hangzhou, Peoples R China
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2024 / Vol. 17 / No. 09
Funding
National Natural Science Foundation of China;
Keywords
FREQUENT DIRECTIONS; PCA;
DOI
10.14778/3665844.3665847
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Matrix sketching aims to approximate a matrix $A \in \mathbb{R}^{N \times d}$, consisting of a vector stream of length $N$, with a smaller sketching matrix $B \in \mathbb{R}^{\ell \times d}$, $\ell \ll N$, with quality measured by the covariance error $\varepsilon = \|A^\top A - B^\top B\|_2 / \|A\|_F^2$. The matrix sketching problem becomes particularly interesting in the context of sliding windows, where the goal is to approximate the matrix $A_W$, formed by input vectors over the most recent $N$ time units. However, despite recent efforts, whether the optimal $O(d/\varepsilon)$ space bound can be achieved on sliding windows has remained an open question. In this paper, we introduce the DS-FD algorithm, which achieves the optimal $O(d/\varepsilon)$ space bound for matrix sketching over row-normalized, sequence-based sliding windows. We also present matching upper and lower space bounds for time-based and unnormalized sliding windows, demonstrating the generality and optimality of DS-FD across various sliding window models. This conclusively answers the open question regarding the optimal space bound for matrix sketching over sliding windows. We conduct extensive experiments on both synthetic and real-world datasets, which validate our theoretical claims and confirm the correctness and effectiveness of our algorithm.
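For orientation, the error measure commonly used for matrix sketching, $\|A^\top A - B^\top B\|_2 / \|A\|_F^2$, can be illustrated with the classic Frequent Directions sketch over a full stream (the method named in the keywords). This is not DS-FD and does not handle sliding windows. It is a minimal sketch assuming NumPy; the names `frequent_directions`, `covariance_error`, and `_shrink`, and the `2 * ell` buffer size, are illustrative choices rather than anything taken from the paper.

```python
import numpy as np


def _shrink(B, ell):
    """Subtract the ell-th largest squared singular value from all of them."""
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    delta = s[ell - 1] ** 2                      # ell-th largest, 0-based index
    s_shrunk = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
    out = np.zeros_like(B)
    out[: len(s)] = s_shrunk[:, None] * Vt       # rows ell-1 and below are zero
    return out


def frequent_directions(rows, ell):
    """Sketch an (n x d) stream of rows into an (ell x d) matrix B."""
    rows = np.asarray(rows, dtype=float)
    d = rows.shape[1]
    assert 1 <= ell <= d, "this sketch assumes 1 <= ell <= d"
    B = np.zeros((2 * ell, d))                   # working buffer (assumed size)
    next_free = 0
    for a in rows:
        if next_free == 2 * ell:                 # buffer full: shrink first
            B = _shrink(B, ell)
            next_free = ell - 1                  # first zero row after a shrink
        B[next_free] = a
        next_free += 1
    return _shrink(B, ell)[:ell]                 # final shrink, keep ell rows


def covariance_error(A, B):
    """Relative covariance error ||A^T A - B^T B||_2 / ||A||_F^2."""
    A = np.asarray(A, dtype=float)
    return np.linalg.norm(A.T @ A - B.T @ B, 2) / np.linalg.norm(A, "fro") ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((500, 20))
    B = frequent_directions(A, ell=8)
    print(B.shape, covariance_error(A, B))
```

With this shrinking rule, the standard Frequent Directions analysis bounds the relative covariance error by `1 / ell`, so a target error $\varepsilon$ needs roughly $1/\varepsilon$ rows of dimension $d$, which is where an $O(d/\varepsilon)$ space figure comes from in the full-stream setting.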
Pages: 2149-2161
Number of pages: 13
Related papers
  • [1] Matrix Sketching Over Sliding Windows
    Wei, Zhewei
    Liu, Xuancheng
    Li, Feifei
    Shang, Shuo
    Du, Xiaoyong
    Wen, Ji-Rong
    [J]. SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2016, : 1465 - 1480
  • [2] Sketching asynchronous data streams over sliding windows
    Xu, Bojian
    Tirthapura, Srikanta
    Busch, Costas
    [J]. DISTRIBUTED COMPUTING, 2008, 20 (05) : 359 - 374
  • [3] Sketching asynchronous data streams over sliding windows
    Bojian Xu
    Srikanta Tirthapura
    Costas Busch
    [J]. Distributed Computing, 2008, 20 : 359 - 374
  • [4] Tracking Matrix Approximation over Distributed Sliding Windows
    Zhang, Haida
    Huang, Zengfeng
    Wei, Zhewei
    Zhang, Wenjie
    Lin, Xuemin
    [J]. 2017 IEEE 33RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2017), 2017, : 833 - 844
  • [5] Optimal sampling from sliding windows
    Braverman, Vladimir
    Ostrovsky, Rafail
    Zaniolo, Carlo
    [J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2012, 78 (01) : 260 - 272
  • [6] Optimal Sampling from Sliding Windows
    Braverman, Vladimir
    Ostrovsky, Rafail
    Zaniolo, Carlo
    [J]. PODS'09: PROCEEDINGS OF THE TWENTY-EIGHTH ACM SIGMOD-SIGACT-SIGART SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS, 2009, : 147 - 156
  • [7] Efficient Matrix Sketching over Distributed Data
    Huang, Zengfeng
    Lin, Xuemin
    Zhang, Wenjie
    Zhang, Ying
    [J]. PODS'17: PROCEEDINGS OF THE 36TH ACM SIGMOD-SIGACT-SIGAI SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS, 2017, : 347 - 359
  • [8] Frequency estimation over sliding windows
    Zhang, Linfeng
    Guan, Yong
    [J]. 2008 IEEE 24TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2008, : 1385 - +
  • [9] Succinct Summing over Sliding Windows
    Ben Basat, Ran
    Einziger, Gil
    Friedman, Roy
    Kassner, Yaron
    [J]. ALGORITHMICA, 2019, 81 (05) : 2072 - 2091
  • [10] Submodular Optimization Over Sliding Windows
    Epasto, Alessandro
    Lattanzi, Silvio
    Vassilvitskii, Sergei
    Zadimoghaddam, Morteza
    [J]. PROCEEDINGS OF THE 26TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'17), 2017, : 421 - 430