Direct MPI Library for Intel Xeon Phi Co-Processors
Description
DCFA-MPI is an MPI library implementation for Intel Xeon Phi co-processor clusters, where each compute node consists of an Intel Xeon Phi co-processor card connected to the host via PCI Express, with InfiniBand. DCFA-MPI enables direct data transfer between Intel Xeon Phi co-processors without assistance from the host. Since DCFA, a direct communication facility for many-core based accelerators, provides direct InfiniBand communication functionality to Xeon Phi co-processor user-space programs with the same interface as on the host processor, direct InfiniBand communication between Xeon Phi co-processors could easily be developed. Using DCFA, an MPI library able to perform direct inter-node communication between Xeon Phi co-processors has been designed and implemented. The implementation is based on the Mellanox InfiniBand HCA and the pre-production version of the Intel Xeon Phi co-processor. DCFA-MPI delivers 3 times greater bandwidth than the 'Intel MPI on Xeon Phi co-processors' mode, and a 2 to 12 times speed-up compared to the 'Intel MPI on Xeon where it offloads computation to Xeon Phi co-processors' mode, in communication with 2 MPI processes. It also shows a 2 to 4 times speed-up over the 'Intel MPI on Xeon Phi co-processors' and 'Intel MPI on Xeon where it offloads computation to Xeon Phi co-processors' modes in a five-point stencil computation with an 8 processes × 56 threads parallelization by MPI + OpenMP.
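For readers unfamiliar with the benchmark style referenced above, the following is a minimal sketch (not the authors' code and not tied to DCFA-MPI's internals) of a hybrid MPI + OpenMP five-point stencil with 1-D row decomposition and halo exchange, the kind of workload the evaluation describes running with 8 MPI processes and many OpenMP threads per Xeon Phi card. The grid size N and iteration count ITERS are arbitrary placeholders, and N is assumed divisible by the number of ranks.

```c
/* Illustrative hybrid MPI + OpenMP five-point stencil sketch.
 * Assumptions: N divisible by the number of ranks; boundary halos
 * of edge ranks stay zero (MPI_PROC_NULL neighbours). */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N     1024   /* global grid is N x N (placeholder)   */
#define ITERS 100    /* number of stencil sweeps (placeholder) */

int main(int argc, char **argv)
{
    int provided, rank, size;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int rows = N / size;  /* rows owned by this rank */
    /* local grid with one halo row above and one below */
    double *cur  = calloc((size_t)(rows + 2) * N, sizeof(double));
    double *next = calloc((size_t)(rows + 2) * N, sizeof(double));

    int up   = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    int down = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

    for (int it = 0; it < ITERS; it++) {
        /* exchange halo rows with neighbouring ranks */
        MPI_Sendrecv(&cur[1 * N],          N, MPI_DOUBLE, up,   0,
                     &cur[(rows + 1) * N], N, MPI_DOUBLE, down, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(&cur[rows * N],       N, MPI_DOUBLE, down, 1,
                     &cur[0],              N, MPI_DOUBLE, up,   1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* five-point update over interior columns, split across OpenMP threads */
        #pragma omp parallel for
        for (int i = 1; i <= rows; i++)
            for (int j = 1; j < N - 1; j++)
                next[i * N + j] = 0.25 * (cur[(i - 1) * N + j] + cur[(i + 1) * N + j]
                                        + cur[i * N + j - 1]   + cur[i * N + j + 1]);

        double *tmp = cur; cur = next; next = tmp;  /* swap buffers */
    }

    if (rank == 0) printf("done after %d iterations\n", ITERS);
    free(cur); free(next);
    MPI_Finalize();
    return 0;
}
```

In such a hybrid setup, communication happens only between MPI ranks (here via MPI_Sendrecv outside the OpenMP region, matching MPI_THREAD_FUNNELED), so a library like DCFA-MPI that moves the inter-node transfers directly between co-processors affects the halo-exchange cost while the per-rank OpenMP threads handle the compute-bound stencil update.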
Journal
- 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum, 816-824, 2013-05-01
- IEEE