Data volumes are growing rapidly across all forms of computing, and high-performance computing (HPC) is no exception. HPC systems usually have enough compute power to analyze the data; it is the storage systems that often prove to be the bottleneck, because they cannot deliver data to the compute resources fast enough or in parallel.
One solution to this problem is the Multi-Path File System (MPFS), a multipath network file system technology developed by EMC. It is designed specifically for HPC systems, in which compute nodes require concurrent access to data sets at high transfer rates. The following best practices help ensure better read/write performance when implementing MPFS:
- MPFS can be deployed over either an iSCSI SAN or an FC SAN. Use the iSCSI option when concurrent access to data is required and iSCSI data transfer rates satisfy the requirements of the HPC system. Use the FC SAN option where a high data transfer rate is the prime requirement and iSCSI cannot achieve the desired rates.
- For an iSCSI MPFS deployment, install two NICs on each node: one for NFS and one for iSCSI.
- Use GigE NICs, with Gigabit Ethernet as the minimum link speed between the nodes and the iSCSI storage.
- A stripe size of 256 KB is recommended for optimal performance of Linux, Windows and Unix MPFS clients.
- Set the number of MPFS threads to 1 or higher and tune it for the environment. If node performance is slow, increase the number of threads allotted to the specified Data Mover and tune accordingly.
- For sequential workloads, RAID 5 on FC drives is the recommended storage configuration for optimum transfer rates. If SATA drives are used, configure them as RAID 3.
- For the storage system cache, a split of 20% read cache and 80% write cache yields higher performance.
- The recommended cache page size for the storage system is 8 KB.
- Use Meta LUNs for MPFS storage. When creating Meta LUNs, ensure that the component LUNs come from different RAID groups; this increases the number of disk spindles behind each Meta LUN and improves performance.
- Avoid SATA disks where maximum performance is required; use 4 Gbps disks instead.
- Configure each RAID group with disks from the same disk array enclosure (DAE).
- Create Meta LUNs from component LUNs on different DAEs and, if possible, on different back-end buses.
- If the workload is random, RAID 1/0 is preferred. In other cases, RAID 5 is the preferred choice.
- Use single-initiator zoning for FC SAN deployments.
- Use jumbo frames with an MTU of 9000 for iSCSI MPFS deployments.
- Configure a separate VLAN for MPFS clients to segregate MPFS traffic and achieve high performance. If possible, configure a separate LAN.
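The recommendations above can be captured as a simple pre-deployment checklist. The Python sketch below is only an illustration: `MpfsDeployment`, `recommended_raid` and `check` are hypothetical names invented for this example, not part of any EMC tool or API, and the field values would have to be filled in by hand from your own planning data. It also assumes that RAID 1/0 applies to random workloads regardless of disk type, which the recommendations do not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MpfsDeployment:
    """Hypothetical description of a planned MPFS deployment (not an EMC API)."""
    transport: str        # "iscsi" or "fc"
    nic_count: int        # NICs installed per node
    link_gbps: float      # link speed between nodes and storage
    mtu: int              # Ethernet MTU on the iSCSI network
    stripe_kb: int        # stripe size in KB
    mpfs_threads: int     # MPFS threads configured
    read_cache_pct: int   # share of storage cache used for reads
    write_cache_pct: int  # share of storage cache used for writes
    cache_page_kb: int    # storage cache page size in KB
    disk_type: str        # "fc" or "sata"
    workload: str         # "sequential" or "random"
    raid_level: str       # "5", "3" or "1/0"


def recommended_raid(workload: str, disk_type: str) -> str:
    """RAID level suggested by the checklist for a workload and disk type."""
    if workload == "random":
        return "1/0"      # assumption: applies to FC and SATA alike
    if disk_type == "sata":
        return "3"
    return "5"


def check(d: MpfsDeployment) -> List[str]:
    """Return the deviations from the recommended settings, if any."""
    issues = []
    if d.transport == "iscsi":
        if d.nic_count < 2:
            issues.append("iSCSI: install two NICs per node (NFS + iSCSI)")
        if d.link_gbps < 1:
            issues.append("iSCSI: Gigabit Ethernet is the minimum link speed")
        if d.mtu != 9000:
            issues.append("iSCSI: enable jumbo frames (MTU 9000)")
    if d.stripe_kb != 256:
        issues.append("stripe size: 256 KB is recommended")
    if d.mpfs_threads < 1:
        issues.append("MPFS threads: configure 1 or higher")
    if (d.read_cache_pct, d.write_cache_pct) != (20, 80):
        issues.append("cache split: 20% read / 80% write is recommended")
    if d.cache_page_kb != 8:
        issues.append("cache page size: 8 KB is recommended")
    wanted = recommended_raid(d.workload, d.disk_type)
    if d.raid_level != wanted:
        issues.append(f"RAID level: {wanted} is recommended for this workload")
    return issues


# Example plan with a single NIC and a standard MTU on an iSCSI deployment.
plan = MpfsDeployment(
    transport="iscsi", nic_count=1, link_gbps=1.0, mtu=1500,
    stripe_kb=256, mpfs_threads=4, read_cache_pct=20, write_cache_pct=80,
    cache_page_kb=8, disk_type="fc", workload="sequential", raid_level="5",
)
for issue in check(plan):
    print(issue)
```

Items that cannot be reduced to a single number, such as Meta LUN layout across RAID groups and DAEs, zoning, and VLAN separation, are deliberately left out of the sketch and still need a manual review.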
This was first published in March 2012