Lower Bounds for Processing Data with Few Random Accesses to External Memory

Cited by: 7
Authors
Grohe, Martin [1]
Hernich, Andre [2]
Schweikardt, Nicole [2]
Affiliations
[1] Humboldt Univ, Inst Informat, D-10099 Berlin, Germany
[2] Goethe Univ Frankfurt, Inst Informat, D-60054 Frankfurt, Germany
Keywords
Theory; Languages; Complexity; data streams; real-time data; query processing; query optimization; semi-structured data; XML; COMPLEXITY;
DOI
10.1145/1516512.1516514
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
We consider a scenario where we want to query a large dataset that is stored in external memory and does not fit into main memory. The most constrained resources in such a situation are the size of the main memory and the number of random accesses to external memory. We note that sequentially streaming data from external memory through main memory is much less prohibitive. We propose an abstract model of this scenario in which we restrict the size of the main memory and the number of random accesses to external memory, but admit arbitrary sequential access. A distinguishing feature of our model is that it allows the usage of unlimited external memory for storing intermediate results, such as several hard disks that can be accessed in parallel. In this model, we prove lower bounds for the problem of sorting a sequence of strings (or numbers), the problem of deciding whether two given sets of strings are equal, and two closely related decision problems. Intuitively, our results say that there is no algorithm for the problems that uses internal memory space bounded by $N^{1-\epsilon}$ and at most $o(\log N)$ random accesses to external memory, but unlimited "streaming access", both for writing to and reading from external memory. (Here, $N$ denotes the size of the input and $\epsilon$ is an arbitrary constant greater than 0.) We even permit randomized algorithms with one-sided bounded error. We also consider the problem of evaluating database queries and prove similar lower bounds for evaluating relational algebra queries against relational databases and XQuery and XPath queries against XML-databases.
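For intuition about the two problems named in the abstract (sorting and set equality), the following toy sketch may help. It is not taken from the paper: it is a textbook bottom-up merge sort in which Python lists stand in for external storage, and the number of merge passes is used as a loose, assumed stand-in for a resource that grows logarithmically with the input. The function names (`merge_pass`, `sort_with_passes`, `dedupe_sorted`, `sets_equal`) are hypothetical, and the simulation does not model tape heads or the paper's formal machine.

```python
from typing import List, Tuple


def merge_pass(tape: List[str], run: int) -> List[str]:
    """One pass over the data: merge adjacent sorted runs of length `run`.

    Lists and slices are only a stand-in for external storage here.
    """
    out: List[str] = []
    for start in range(0, len(tape), 2 * run):
        left = tape[start:start + run]
        right = tape[start + run:start + 2 * run]
        i = j = 0
        # Standard two-way merge of two sorted runs.
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i])
                i += 1
            else:
                out.append(right[j])
                j += 1
        out.extend(left[i:])
        out.extend(right[j:])
    return out


def sort_with_passes(items: List[str]) -> Tuple[List[str], int]:
    """Bottom-up merge sort; returns the sorted list and the pass count."""
    tape = list(items)
    run, passes = 1, 0
    while run < len(tape):
        tape = merge_pass(tape, run)
        run *= 2
        passes += 1
    return tape, passes


def dedupe_sorted(tape: List[str]) -> List[str]:
    """Drop adjacent duplicates in a single scan of a sorted list."""
    out: List[str] = []
    for s in tape:
        if not out or out[-1] != s:
            out.append(s)
    return out


def sets_equal(a: List[str], b: List[str]) -> Tuple[bool, int]:
    """Decide set equality by sorting both inputs and comparing them."""
    sorted_a, passes_a = sort_with_passes(a)
    sorted_b, passes_b = sort_with_passes(b)
    equal = dedupe_sorted(sorted_a) == dedupe_sorted(sorted_b)
    return equal, passes_a + passes_b
```

In this sketch the pass count doubles the run length each time, so it grows like the base-2 logarithm of the number of items. That is the kind of logarithmic cost the abstract's $o(\log N)$ bound is contrasted with, under the assumptions stated above.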
Pages: 58