PARALLELISM FOR HIGH-PERFORMANCE QUERY-PROCESSING

Cited by: 0
Authors
WINTERS, VG
Affiliation
Keywords
SIGNATURES; SEARCHING; RETRIEVAL; PARALLEL ALGORITHMS; COMPLEXITY
DOI
Not available
Chinese Library Classification
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
We present a new method for a type of processing required in data base management systems. The method efficiently determines the relevance of a given query value to each of many (target) sets of data. By using a new type of data structure, the method allows complete parallelism both for operations on different target sets and for those within each target set. The method never generates a false drop (i.e., indicates that an irrelevant target set is relevant to the query) and always identifies all relevant target sets. This eliminates the overhead of reading each selected target set to ensure that the selection was not a false drop. A good deterministic bound on the system's performance is established. With O(ln N(V) + ln ln M) processors, the relevance of any target set can be completely determined in O(1) time against a query consisting of a subset of N(V) vocabulary items. The space complexity is O(N(i) (ln N(V) + ln ln N(V))) bits, where N(i) is the number of items relevant to target set i. As a concrete example, for a database using 64-byte keys, having a 100,000-word vocabulary (potentially valid keys) and in which a target set can have up to 64 distinct relevant elements, the relevance of a target set can be determined in 2 parallel operations using 6 processors. In other words, with 64K processors a database of one million target sets can be processed in 184 parallel operations. No probability distribution assumptions are necessary.
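The figure of 184 parallel operations can be reproduced with simple arithmetic. The sketch below is one reading of the abstract, not the paper's own derivation. It assumes that 64K means 65,536 processors, that they are split into disjoint groups of 6 (one group per target set), and that the groups work through the target sets in successive rounds of 2 operations each. The function name and its parameters are hypothetical.

```python
import math


def parallel_operations(total_processors, target_sets, processors_per_set, ops_per_set):
    """Estimate the parallel operations needed to test every target set.

    Assumption (not stated in the abstract): processors are divided into
    disjoint groups, each group handles one target set per round, and the
    rounds run one after another.
    """
    # How many target sets can be evaluated at the same time.
    sets_per_round = total_processors // processors_per_set
    # How many rounds are needed to cover every target set.
    rounds = math.ceil(target_sets / sets_per_round)
    return rounds * ops_per_set


# Figures taken from the abstract: 64K processors, one million target sets,
# 6 processors and 2 parallel operations per target set.
ops = parallel_operations(64 * 1024, 1_000_000, 6, 2)
print(ops)  # 184
```

Under these assumptions, 65,536 processors form 10,922 groups, one million target sets need 92 rounds, and 92 rounds of 2 operations give 184, matching the abstract.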
Pages: 344-356
Page count: 13