LFWS: Long-Operation First Warp Scheduling Algorithm to Effectively Hide the Latency for GPUs

  • LIU Song
    School of Computer Science and Technology, Xi'an Jiaotong University
  • MA Jie
    School of Computer Science and Technology, Xi'an Jiaotong University
  • ZHAO Chenyu
    School of Computer Science and Technology, Xi'an Jiaotong University
  • WAN Xinhe
    School of Computer Science and Technology, Xi'an Jiaotong University
  • WU Weiguo
    School of Computer Science and Technology, Xi'an Jiaotong University

Abstract

<p>GPUs have become the dominant computing units for high-performance workloads across many computational fields. However, long operation latency leaves on-chip computing resources underutilized, which degrades performance when parallel tasks run on GPUs. A good warp scheduling strategy is an effective way to hide latency and improve resource utilization, yet most current GPU warp scheduling algorithms ignore the ability of long operations to hide latency. In this paper, we propose a long-operation-first warp scheduling algorithm, LFWS, for GPU platforms. LFWS filters warps in the ready state into a ready queue and updates the queue promptly as warp status changes. It divides the warps in the ready queue into long-operation and short-operation groups based on the type of operations in their instruction buffers, and gives higher priority to the long-operation warps. This lets long operations hide part of each other's latency and enhances the system's overall ability to hide latency. To verify its effectiveness, we implement LFWS on the GPGPU-Sim simulation platform. Experiments over various CUDA applications evaluate LFWS against five other warp scheduling algorithms. The results show that LFWS achieves average performance improvements of 8.01% and 5.09% over three traditional and two novel warp scheduling algorithms, respectively, effectively improving computational resource utilization on GPUs.</p>
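
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' GPGPU-Sim implementation. The names `Warp`, `LONG_OPS`, `build_ready_queue`, and `pick_next_warp` are hypothetical, the set of opcodes treated as long operations is an assumption (the abstract does not define it), and ties within a group are broken by queue order, which is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: the abstract does not say which opcodes count as "long".
# Global memory accesses and special-function ops are used as placeholders.
LONG_OPS = {"ld.global", "st.global", "sfu"}


@dataclass
class Warp:
    warp_id: int
    ready: bool   # True if the warp can issue this cycle
    next_op: str  # opcode at the head of the warp's instruction buffer


def is_long(warp: Warp) -> bool:
    """Classify a warp by the type of its next buffered operation."""
    return warp.next_op in LONG_OPS


def build_ready_queue(warps: List[Warp]) -> List[Warp]:
    """Filter warps in the ready state into a ready queue."""
    return [w for w in warps if w.ready]


def pick_next_warp(warps: List[Warp]) -> Optional[Warp]:
    """Return the warp to issue next, preferring long-operation warps."""
    ready = build_ready_queue(warps)
    long_group = [w for w in ready if is_long(w)]
    short_group = [w for w in ready if not is_long(w)]
    # Long-operation warps get priority; queue order breaks ties (assumption).
    ordered = long_group + short_group
    return ordered[0] if ordered else None
```

Rebuilding the queue on every call stands in for the in-time queue updates the abstract mentions; a real scheduler would more likely update the queue incrementally as warp states change.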
