Single Root I/O Virtualization (SR-IOV) technology has been steadily gaining
momentum for high-performance interconnects. SR-IOV can deliver near-native
performance but lacks locality-aware communication support. This talk presents
an efficient approach to building HPC clouds based on MVAPICH2 over OpenStack
with SR-IOV-enabled virtualized heterogeneous clusters. We discuss
high-performance designs of the VM- and container-aware MVAPICH2 library over
OpenStack-based HPC Clouds with SR-IOV-enabled InfiniBand, KNL, and GPGPU. The
talk will present a high-performance VM migration framework for MPI
applications on SR-IOV-enabled InfiniBand clouds. A comprehensive performance
evaluation with micro-benchmarks and HPC applications on the NSF-supported
Chameleon Cloud, which is built on OpenStack, shows that our design can
deliver near bare-metal performance. The MVAPICH2 over OpenStack software
package presented in this talk is publicly available from
http://mvapich.cse.ohio-state.edu.
a. What are the performance benefits of SR-IOV, and what are its limitations for locality-aware communication between VMs on the same physical node?
b. How can a high-performance MPI library be designed to take efficient advantage of novel features such as SR-IOV and IVShmem provided in HPC clouds?
c. How can an HPC cloud be built with virtual machines and containers to deliver near-native performance for MPI applications over SR-IOV-enabled InfiniBand clusters?
d. How much performance improvement can be achieved by our proposed design on MPI point-to-point operations, collective operations and applications in HPC clouds?
e. Can a high-performance virtual machine migration framework be designed for MPI applications on SR-IOV-enabled InfiniBand clouds? How fast is it?
f. How can KNL and GPUs be used efficiently with MVAPICH2 in HPC clouds?
