An implementation of OpenMP compiler for PC clusters based on array section descriptor

Description

In this paper, we propose an implementation of an OpenMP compiler for distributed memory environments. While OpenMP provides the notion of a shared address space, a distributed memory environment has no physical shared memory. One approach to implementing OpenMP in such an environment is communication code generation, in which a producer sends the appropriate data to the consumer. Our compiler finds accesses to shared data and represents them using the quad, our proposed array section descriptor. To identify the data to be sent, an intersection operation is performed between the quads representing written and read data. Since a quad can concisely represent strided accesses to an array section, our compiler can generate efficient code when an OpenMP directive divides a for-loop in a block-cyclic manner. As a preliminary evaluation, we parallelized a matrix-multiply program by inserting an OpenMP directive and executed it on a PC cluster. As a result, we achieved a speedup of 7.82 with 8 processors.
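
The intersection step described above can be illustrated with a minimal sketch. The abstract does not define the quad, so the code below is not the authors' descriptor: it assumes a simpler, hypothetical (lower, upper, stride) triple for a one-dimensional array section, and the names `Section`, `indices`, and `intersect` are invented for illustration. It shows how the overlap of a written section and a read section can itself be expressed as a strided section, without enumerating elements.

```python
from math import gcd
from typing import NamedTuple, Optional


class Section(NamedTuple):
    """Hypothetical strided array section: lower, lower+stride, ... <= upper."""
    lower: int
    upper: int   # inclusive bound
    stride: int  # assumed positive


def indices(s: Section) -> list:
    """Enumerate the indices of a section (used only for checking)."""
    return list(range(s.lower, s.upper + 1, s.stride))


def intersect(a: Section, b: Section) -> Optional[Section]:
    """Return the section of indices present in both a and b, or None."""
    g = gcd(a.stride, b.stride)
    # The two index sequences can only meet if their offsets agree modulo g.
    if (b.lower - a.lower) % g != 0:
        return None
    lcm = a.stride // g * b.stride

    # Find one index on a's lattice that also lies on b's lattice.
    # A solution exists among the first b.stride // g candidates.
    first = None
    for k in range(b.stride // g):
        x = a.lower + k * a.stride
        if (x - b.lower) % b.stride == 0:
            first = x
            break
    if first is None:
        return None

    # Move up to the first common index that is inside both ranges.
    start = max(a.lower, b.lower)
    if first < start:
        first += (start - first + lcm - 1) // lcm * lcm
    upper = min(a.upper, b.upper)
    if first > upper:
        return None

    # Tighten the upper bound to the last common index.
    last = first + (upper - first) // lcm * lcm
    return Section(first, last, lcm)


# Example: one process wrote every 4th element, another reads every 6th.
written = Section(0, 99, 4)
read = Section(2, 60, 6)
to_send = intersect(written, read)  # elements 8, 20, 32, 44, 56
```

In a communication-generating compiler, a result such as `to_send` would describe the elements the producer has to transmit to the consumer. A block-cyclic distribution also carries a block length, which this three-field sketch does not model.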
