Parallel Processing of Very Large Databases Using Distributed Column Indexes

Authors
Ivanova, E. V. [1]
Sokolinsky, L. B. [1]
Affiliation
[1] South Ural State Univ, Chelyabinsk 454080, Russia
DOI
10.1134/S0361768817030069
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
This paper discusses the development and investigation of efficient methods for parallel processing of very large databases using a columnar data representation designed for computer clusters. It proposes an approach that combines the advantages of relational and column-oriented DBMSs, and introduces a new type of distributed column index fragmented according to the domain-interval principle. Column indexes are auxiliary structures kept permanently in the distributed main memory of the cluster. Surrogate keys match the elements of a column index to the tuples of the original relation. Resource-intensive relational operations are performed on the corresponding column indexes rather than on the original relations of the database, yielding a precomputation table from which the DBMS reconstructs the resulting relation. For the basic relational operations on column indexes, the paper proposes parallel decomposition methods that do not require massive data exchanges between processor nodes. The approach is reported to improve the performance of OLAP-class queries by a factor of hundreds.
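The pipeline outlined in the abstract (column index with surrogate keys, domain-interval fragmentation, local operations, precomputation table, reconstruction) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so every name here (`build_column_index`, `fragment_by_domain_intervals`, `local_join`, `precompute_join`, `reconstruct`), the choice of an equi-join as the example operation, and the sample data are assumptions made for illustration. Cluster nodes are simulated by a plain loop over fragments.

```python
from bisect import bisect_right


def build_column_index(relation, attr):
    """Return (surrogate_key, value) pairs for one attribute, sorted by value.

    `relation` is assumed to be a dict mapping surrogate key -> tuple (as a dict).
    """
    return sorted(((key, row[attr]) for key, row in relation.items()),
                  key=lambda entry: entry[1])


def fragment_by_domain_intervals(column_index, boundaries):
    """Split a column index into len(boundaries) + 1 fragments.

    Fragment i holds the entries whose value falls into the i-th interval
    of the attribute domain, so equal values always share a fragment.
    """
    fragments = [[] for _ in range(len(boundaries) + 1)]
    for key, value in column_index:
        fragments[bisect_right(boundaries, value)].append((key, value))
    return fragments


def local_join(frag_r, frag_s):
    """Equi-join two co-located fragments; emit pairs of surrogate keys."""
    keys_by_value = {}
    for key, value in frag_s:
        keys_by_value.setdefault(value, []).append(key)
    return [(r_key, s_key)
            for r_key, value in frag_r
            for s_key in keys_by_value.get(value, [])]


def precompute_join(relation_r, relation_s, attr, boundaries):
    """Build the precomputation table (surrogate key pairs) for R join S."""
    frags_r = fragment_by_domain_intervals(build_column_index(relation_r, attr), boundaries)
    frags_s = fragment_by_domain_intervals(build_column_index(relation_s, attr), boundaries)
    table = []
    # Each (frag_r, frag_s) pair would live on one node; because both sides
    # use the same intervals, no entries need to move between nodes.
    for frag_r, frag_s in zip(frags_r, frags_s):
        table.extend(local_join(frag_r, frag_s))
    return table


def reconstruct(table, relation_r, relation_s):
    """Materialize the resulting relation from the precomputation table."""
    return [{**relation_r[r_key], **relation_s[s_key]} for r_key, s_key in table]


# Hypothetical sample data: surrogate key -> tuple.
orders = {1: {"cust_id": 10, "amount": 5},
          2: {"cust_id": 25, "amount": 7},
          3: {"cust_id": 10, "amount": 9}}
customers = {100: {"cust_id": 10, "name": "A"},
             101: {"cust_id": 25, "name": "B"},
             102: {"cust_id": 40, "name": "C"}}

table = precompute_join(orders, customers, "cust_id", boundaries=[20])
result = reconstruct(table, orders, customers)
```

In this sketch the single boundary `20` splits the `cust_id` domain into two intervals, standing in for two nodes. Only surrogate keys flow into the precomputation table; the full tuples are touched once, during reconstruction.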
Pages: 131-144
Page count: 14
Source
Ivanova, E. V., Sokolinsky, L. B. Parallel processing of very large databases using distributed column indexes [J]. Programming and Computer Software, 2017, 43: 131-144