In recent times, there has been growing interest in evaluating the feasibility of running High Performance Computing (HPC) applications in cloud computing environments. The flexibility, scalability, and dynamic provisioning capabilities of a cloud infrastructure make it an attractive platform for running HPC applications. One important requirement for running today's data-intensive HPC applications is the availability of a parallel, high-performance storage system, since scalability and high bandwidth from the storage system are critical for these applications. Researchers have studied the I/O performance obtained from traditional cloud storage options such as persistent and ephemeral block storage. However, there have not been any comprehensive studies of how a parallel file system (PFS) can be effectively integrated and provided as a service to cloud users running HPC applications, or of the performance and security implications of such systems.
Our work on integrating a PFS into a cloud environment has two objectives: first, to determine the most efficient way to access a PFS from virtual-machine instances in the cloud, and second, to design a framework that provides the PFS as a service through the OpenStack platform.
Using Lustre, we have identified and implemented three ways in which a high-performance file system can be used in an HPC-cloud environment:
A. PFS as a back-end to persistent volume storage
B. Accessing the PFS directly through file-system clients running inside the virtual machine instances
C. Accessing the PFS mounted on the virtual machine hosts through file-system pass-through from the instances
We are in the process of evaluating in detail the performance characteristics and security implications of each of these methods.
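To make the difference between the access methods concrete, the following minimal Python sketch builds the commands and device definition each one might involve. It assumes that pass-through is done with libvirt's 9p-based `<filesystem>` device, which the text does not state; the function names, the MGS address, the file-system name, and the mount paths are all hypothetical. The Lustre mount applies to the guest in method B and to the host in methods A and C.

```python
import xml.etree.ElementTree as ET


def lustre_mount_command(mgs_nid, fsname, mountpoint):
    """Command a Lustre client would run to mount the file system.

    In method B the client is the guest; in methods A and C it is the host.
    """
    return ["mount", "-t", "lustre", f"{mgs_nid}:/{fsname}", mountpoint]


def passthrough_device_xml(host_dir, tag):
    """libvirt <filesystem> element exporting a host directory to a guest.

    Assumption: method C uses libvirt's 9p pass-through device.
    """
    fs = ET.Element("filesystem", {"type": "mount", "accessmode": "passthrough"})
    ET.SubElement(fs, "source", {"dir": host_dir})
    ET.SubElement(fs, "target", {"dir": tag})  # 'dir' here is the mount tag
    return ET.tostring(fs, encoding="unicode")


def passthrough_guest_mount_command(tag, mountpoint):
    """Command the guest would run to mount the share exported under `tag`."""
    return ["mount", "-t", "9p", "-o", "trans=virtio,version=9p2000.L",
            tag, mountpoint]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(" ".join(lustre_mount_command("10.0.0.1@tcp", "scratch", "/mnt/lustre")))
    print(passthrough_device_xml("/mnt/lustre", "lustre_share"))
    print(" ".join(passthrough_guest_mount_command("lustre_share", "/mnt/pfs")))
```

In method B the instance needs the Lustre client and network access to the Lustre servers, whereas in method C only the host does. This is one reason the security implications of the methods differ.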
We are also looking at incorporating a file-system service inside the OpenStack framework that would allow users to provision and configure PFS storage dynamically and securely, in much the same way they can provision instances and volumes. The challenges involve ensuring security and isolation in allocating storage to the cloud users, and determining the file-system configuration options that should be exposed to the users through the file-system service APIs.
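As a rough illustration of what such a service might expose, the sketch below models a provisioning request with two Lustre striping options and maps it to a per-tenant directory. It is not the design of the service described here: the `ShareRequest` type, its fields, the naming rule, and the directory layout are hypothetical. Only the `lfs setstripe -c` and `-S` options are existing Lustre commands.

```python
from dataclasses import dataclass
import posixpath

STRIPE_UNIT = 64 * 1024  # Lustre stripe sizes must be multiples of 64 KiB


@dataclass(frozen=True)
class ShareRequest:
    """Hypothetical request a user might send to the file-system service."""
    tenant_id: str
    share_name: str
    stripe_count: int = 1            # -1 means "stripe over all OSTs"
    stripe_size: int = 1024 * 1024   # bytes

    def validate(self):
        # Restrict names so a request cannot escape its tenant directory
        # (a deliberately strict, illustrative rule).
        for label in (self.tenant_id, self.share_name):
            if not label.isalnum():
                raise ValueError(f"invalid name: {label!r}")
        if self.stripe_count < 1 and self.stripe_count != -1:
            raise ValueError("stripe_count must be >= 1, or -1 for all OSTs")
        if self.stripe_size <= 0 or self.stripe_size % STRIPE_UNIT:
            raise ValueError("stripe_size must be a positive multiple of 64 KiB")


def share_directory(root, req):
    """Per-tenant directory under the PFS root that isolates one share."""
    req.validate()
    return posixpath.join(root, req.tenant_id, req.share_name)


def setstripe_command(root, req):
    """`lfs setstripe` invocation applying the requested default layout."""
    return ["lfs", "setstripe", "-c", str(req.stripe_count),
            "-S", str(req.stripe_size), share_directory(root, req)]


if __name__ == "__main__":
    req = ShareRequest("tenant42", "scratch", stripe_count=4,
                       stripe_size=4 * 1024 * 1024)
    print(" ".join(setstripe_command("/mnt/lustre", req)))
```

Validating names before building any path is one simple way to keep tenants inside their own directories. A real service would also need access control and quotas, which this sketch leaves out.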