GRACE: Relational Algebra Machine Based on Hash and Sort -Its Design Concepts-

Description

This paper describes the design considerations of GRACE, a relational algebra machine. GRACE adopts a novel relational algebra processing algorithm based on hash and sort, and can execute heavy-load operations such as join, projection (duplicate elimination), and set operations much more efficiently. These operations have been a heavy burden for most of the database machines proposed so far. The basic processing strategy is to decompose a relation into disjoint buckets by using the clustering feature of hashing, and then to process the buckets in parallel by activating many processors. Each processor uses its O(n) hardware sorter to process a bucket. Buckets stored over multiple banks are processed in pipeline fashion. The design problems in implementing this method on a parallel machine are discussed in detail. The abstract architecture is presented, which consists of three major components: the Data Stream Processor (DSP), the Data Stream Generator (DSG), and the Secondary Data Manager (SDM). A data stream is manipulated while it is transferred from the source DSG to the destination DSG. The operator-level pipeline effect is also explained: the hashing phase is overlapped with relational algebra processing, so GRACE can execute a complex query that includes many heavy-load operations efficiently, without the time overhead of hashing.
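The hash-then-sort strategy described above can be illustrated with a minimal software sketch. It is not the GRACE hardware design: the function names (`partition`, `sort_merge_join`, `grace_join`, `grace_distinct`) are hypothetical, relations are assumed to be lists of tuples, the buckets are processed sequentially rather than by parallel processors, and Python's comparison sort stands in for the O(n) hardware sorter. It only shows why hashing on the key yields disjoint buckets that can each be sorted and processed independently.

```python
from collections import defaultdict
from operator import itemgetter


def partition(relation, key_of, num_buckets):
    """Split a relation into disjoint buckets by hashing each row's key."""
    buckets = defaultdict(list)
    for row in relation:
        buckets[hash(key_of(row)) % num_buckets].append(row)
    return buckets


def sort_merge_join(left, right, left_key, right_key):
    """Join two buckets: sort each on its key, then merge equal-key runs."""
    left = sorted(left, key=itemgetter(left_key))
    right = sorted(right, key=itemgetter(right_key))
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][left_key], right[j][right_key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the end of the run of equal keys on both sides.
            i_end = i
            while i_end < len(left) and left[i_end][left_key] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][right_key] == lk:
                j_end += 1
            # Emit every pairing of the two runs.
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    out.append(l_row + r_row)
            i, j = i_end, j_end
    return out


def grace_join(r, s, r_key, s_key, num_buckets=4):
    """Equi-join: hash both relations on the join key, then join bucket pairs.

    Rows with equal keys always land in the same bucket number, so bucket b
    of r only ever needs to be compared with bucket b of s.
    """
    r_buckets = partition(r, itemgetter(r_key), num_buckets)
    s_buckets = partition(s, itemgetter(s_key), num_buckets)
    result = []
    for b in range(num_buckets):  # each iteration could run on its own processor
        result.extend(
            sort_merge_join(r_buckets.get(b, []), s_buckets.get(b, []), r_key, s_key)
        )
    return result


def grace_distinct(relation, num_buckets=4):
    """Duplicate elimination: hash on the whole row, sort each bucket,
    and keep a row only when it differs from its predecessor."""
    buckets = partition(relation, lambda row: row, num_buckets)
    result = []
    for b in range(num_buckets):
        rows = sorted(buckets.get(b, []))
        for k, row in enumerate(rows):
            if k == 0 or row != rows[k - 1]:
                result.append(row)
    return result
```

In this sketch the partitioning pass and the per-bucket pass run one after the other; the operator-level pipelining mentioned in the abstract, where hashing overlaps with the preceding operation, is not modeled.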
