Hanoi: Performance Analysis of Hadoop Using Multi-Layer Trace Logs

清水 裕亮, 櫻井 孝平, 山根 智

Large-scale distributed processing systems use many computers interconnected over a network to process terabyte- and petabyte-scale data sets in parallel. Apache Hadoop is a framework for building such systems. With Apache Hadoop, users can write and run distributed programs easily, because the framework transparently handles the non-functional requirements that a large-scale distributed system must satisfy, such as hiding the non-determinism of network communication and providing fault tolerance by managing the failure of cluster nodes. However, the processes running on the interconnected machines behave non-deterministically, so the behavior of such a system is difficult to reproduce. Failures also span multiple layers: besides bugs in the MapReduce application itself, they include failures caused by hardware, the network, and client behavior, and this variety is a further reason why debugging is hard. To understand and debug a distributed system, we consider it effective to monitor the behavior of every node in the cluster across multiple layers in real time, and to collect and analyze the resulting logs in real time. In this study, we target clusters built with Hadoop and analyze system performance, demonstrating the detection of performance problems, by collecting and analyzing logs from multiple layers. We collect computing-resource utilization, logs of communication with HDFS/MapReduce clients, and Java method traces. Compared with existing research, our logs are finer-grained, in particular because they include Java method traces, and therefore enable a more detailed analysis of system state.
