Improving Resource Utilization in Data Centers using an LSTM-based Prediction Model

Description

Data centers are centralized facilities where computing and networking hardware is aggregated to handle large amounts of data and computation. In a data center, computing resources such as CPU and memory are usually managed by a resource manager. The resource manager accepts resource requests from users and allocates resources to their applications. A well-known problem in resource management is that users often request more resources than their applications actually use, which degrades overall resource utilization in the data center. This paper aims to improve resource utilization in data centers by predicting the resources required by each application. We designed and implemented a neural network model based on Long Short-Term Memory (LSTM) that predicts a more efficient resource allocation for a job from historical data. Our model has two LSTM layers, which learn the relationships between (1) allocation and usage and (2) CPU and memory, respectively. To train the network, we used Google's cluster-usage trace, which records the resource allocation and usage of each job executed in a Google data center. We evaluated the proposed method with Google's cluster scheduler simulator. The simulation indicated that the proposed method improved CPU utilization and memory utilization by 10.71% and 47.36%, respectively, compared to a conventional resource manager. Moreover, we found that increasing the memory cell size of our LSTM model improves prediction accuracy at the cost of longer training time.
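To make the idea of a two-layer LSTM predictor more concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. The input layout (normalized CPU allocation, memory allocation, CPU usage, and memory usage per time step), the way the two layers are stacked, the output projection, and the names `LSTMLayer` and `predict_allocation` are all assumptions made for illustration. The weights are random and untrained, so the output only demonstrates the data flow, not a meaningful prediction.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class LSTMLayer:
    """A single LSTM layer with random, untrained weights (illustrative only)."""

    def __init__(self, input_size, cell_size, rng):
        # One matrix holds the input, forget, output, and candidate gate weights.
        self.W = rng.normal(0.0, 0.1, size=(4 * cell_size, input_size + cell_size))
        self.b = np.zeros(4 * cell_size)
        self.cell_size = cell_size

    def forward(self, sequence):
        """Run the layer over a (time, features) array; return all hidden states."""
        h = np.zeros(self.cell_size)  # hidden state
        c = np.zeros(self.cell_size)  # memory cell state
        outputs = []
        for x in sequence:
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, o, g = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
            outputs.append(h)
        return np.stack(outputs)


def predict_allocation(history, cell_size=16, seed=0):
    """Hypothetical helper: map a job's history to a (cpu, memory) allocation.

    history: array of shape (time, 4) with normalized values, assumed to be
             [cpu_allocated, mem_allocated, cpu_used, mem_used] per time step.
    Returns an array of shape (2,) with values in (0, 1).
    """
    rng = np.random.default_rng(seed)
    layer1 = LSTMLayer(history.shape[1], cell_size, rng)
    layer2 = LSTMLayer(cell_size, cell_size, rng)  # stacked on layer1 (assumption)
    W_out = rng.normal(0.0, 0.1, size=(2, cell_size))
    hidden = layer2.forward(layer1.forward(history))
    # Project the last hidden state to normalized CPU and memory allocations.
    return sigmoid(W_out @ hidden[-1])
```

Calling `predict_allocation(history)` with a `(time, 4)` array returns two values between 0 and 1. In this sketch the `cell_size` parameter plays the role of the memory cell size mentioned above: a larger value enlarges every weight matrix, which is consistent with the reported trade-off between accuracy and training time.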
