Performance Analysis of a Data Diffusion Machine with High Fanout and Split Directories

Bibliographic Information

Other Title
  • Shared Memory Architecture

Description

The Data Diffusion Machine (DDM) is a virtual shared memory architecture whose advantage is that data migrates from node to node as needed. Compared with other shared memory architectures such as CC-NUMA, however, it suffers from higher miss penalties, caused by its hierarchical interconnection structure, and from contention among transactions at the higher-level directories. One way to alleviate these disadvantages is to increase the fanout and split the directories. We analyze the performance improvement that these two schemes bring to the DDM by extending the experimental results obtained from the DDM emulator. From the emulation results of mp3d running on a 3 x 3 configuration, we estimated the performance of a DDM with a flat 9-node configuration: it executes 1.3 times faster than the 3 x 3 configuration. To assess the accuracy of our estimation method, we compared the actual and estimated execution times for a flat 4-node DDM, which can be configured on the current DDM emulator with minimal modifications. The relative error was 3%. We also discuss possible sources of error in our method.
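
The description quotes a speedup and a relative error without stating how they are defined. A minimal sketch in Python, assuming the usual definitions (speedup as baseline time divided by new time, relative error as the absolute difference divided by the actual value); the function names and all timing values are hypothetical placeholders chosen for illustration, not measurements from the paper:

```python
def speedup(baseline_time: float, new_time: float) -> float:
    """Ratio of baseline to new execution time (> 1 means the new one is faster)."""
    if baseline_time <= 0 or new_time <= 0:
        raise ValueError("execution times must be positive")
    return baseline_time / new_time


def relative_error(estimated: float, actual: float) -> float:
    """Absolute difference between estimate and measurement, relative to the measurement."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return abs(estimated - actual) / abs(actual)


# Placeholder timings in arbitrary units (assumed values, NOT data from the paper).
hierarchical_3x3_time = 130.0   # emulated 3 x 3 configuration
flat_9_estimated_time = 100.0   # estimated flat 9-node configuration
print(f"speedup: {speedup(hierarchical_3x3_time, flat_9_estimated_time):.2f}")

flat_4_actual_time = 100.0      # measured flat 4-node run
flat_4_estimated_time = 103.0   # estimate for the same configuration
print(f"relative error: {relative_error(flat_4_estimated_time, flat_4_actual_time):.1%}")
```

The placeholder values are picked only so that the two ratios have the same magnitude as the figures in the description; the paper's actual timings are not given here.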
