Automatic Optimization of Thread Mapping for a GPGPU Programming Framework

Ohno Kazuhiko, Kamiya Tomoharu, Maruyama Takanori, Matsumoto Masaki

doi:10.1109/candar.2014.104

Although general-purpose computation on graphics processing units (GPGPU) is widely used for high-performance computing, standard programming frameworks such as CUDA and OpenCL are still difficult to use. They require low-level specifications, and hand-optimization is a large burden. We are therefore developing an easier framework named MESI-CUDA. Based on a virtual shared memory model, MESI-CUDA hides low-level memory management and data transfer from the user. The compiler generates the low-level code and also optimizes memory accesses by applying conventional hand-optimization techniques. However, GPU threads are created in the same way as in CUDA: the user specifies the thread mapping, i.e., the thread indexing and the size of the thread blocks run on each streaming multiprocessor (SM). The mapping largely affects execution performance and may obstruct the automatic optimization performed by the MESI-CUDA compiler. The user must therefore find an optimal specification that takes physical parameters into account. In this paper, we propose a new thread mapping scheme. We introduce a new thread creation syntax that specifies a hardware-independent logical mapping, which is converted into an optimized physical mapping at compile time. By statically analyzing array index expressions, we obtain groups of threads that access the same or neighboring array elements. By mapping such threads into the same thread block and assigning them consecutive thread indices, the physical mapping is determined so as to maximize the effect of memory access optimization. In our evaluation, the scheme found optimal mapping strategies for five benchmark programs. Memory access transactions were reduced to approximately 1/4, and a speedup of 1.4 to 76 times was achieved compared with the worst mapping.
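
The abstract's point that thread indexing changes the number of memory transactions can be illustrated with a toy model. The sketch below is not the MESI-CUDA compiler's analysis. It is a minimal simulation under stated assumptions: a warp of 32 threads, 128-byte memory segments, 4-byte elements, and a row-major 2D array in which logical thread (i, j) reads a[i][j]. The function names (`count_transactions`, `row_major_mapping`, `column_major_mapping`) are hypothetical and chosen only for this illustration.

```python
WARP_SIZE = 32        # threads per warp (assumption, typical of NVIDIA GPUs)
SEGMENT_BYTES = 128   # assumed size of one global-memory transaction
ELEM_BYTES = 4        # 4-byte elements, e.g. float


def count_transactions(height, width, mapping):
    """Sum, over all warps, the number of distinct segments each warp touches.

    Logical thread (i, j) reads a[i][j] of a row-major height x width array.
    mapping(tid) returns the logical (i, j) assigned to physical thread tid.
    """
    total = 0
    n = height * width
    for warp_start in range(0, n, WARP_SIZE):
        segments = set()
        for tid in range(warp_start, min(warp_start + WARP_SIZE, n)):
            i, j = mapping(tid)
            byte_addr = (i * width + j) * ELEM_BYTES
            segments.add(byte_addr // SEGMENT_BYTES)
        total += len(segments)
    return total


def row_major_mapping(height, width):
    # Consecutive thread indices walk along a row: neighboring elements.
    return lambda tid: (tid // width, tid % width)


def column_major_mapping(height, width):
    # Consecutive thread indices walk down a column: strided elements.
    return lambda tid: (tid % height, tid // height)


H, W = 64, 64
good = count_transactions(H, W, row_major_mapping(H, W))
bad = count_transactions(H, W, column_major_mapping(H, W))
print("row-major mapping:   ", good, "transactions")
print("column-major mapping:", bad, "transactions")
```

In this model, assigning consecutive thread indices to threads that read neighboring elements lets each warp be served by a single segment, while the strided assignment makes every thread in a warp touch a different segment. The ratio produced by this toy setup is specific to its assumptions and is not comparable with the measurements reported in the abstract.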
