Mistake 1: Deliberations on the hardware front
While deploying SAN boot in a NetApp environment, keep an eye on redundancy, resiliency and performance at the hardware level. Resiliency is important because access to the boot LUN (logical unit number) must have no single point of failure. A server that loses access to its boot LUN goes down, which means downtime for any application running on that server.
From a hardware perspective, the standard requirement for SAN boot is to have redundancy and resiliency in place. In lab or test environments, organizations often opt for a single mapping to the boot LUN, which eventually ends in a blue screen when the path through that one host bus adapter (HBA) fails. To avoid such outcomes, use multipath I/O (MPIO) to ensure resiliency.
However, Windows does not support MPIO during installation and may produce strange results when installed to a LUN that is visible via multiple paths. During installation, Windows boots from boot.wim, which runs the Windows Preinstallation Environment (WinPE), and later reboots into the full Windows installation to complete setup.
It is best to have only one path to the LUN during installation. Once Windows is running, you can add paths and enable Windows MPIO for SAN boot.
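The one-path-during-install, multiple-paths-afterwards rule can be expressed as a simple check. The following is a minimal sketch, assuming a hypothetical inventory that lists the (initiator, target) pairs through which each boot LUN is visible; the function name, the phase labels and the data layout are illustrative and not part of any NetApp or Windows tool.

```python
from typing import Dict, List, Tuple

# Hypothetical inventory: boot LUN name -> list of (initiator port, target port) paths.
Paths = Dict[str, List[Tuple[str, str]]]


def check_paths(paths: Paths, phase: str) -> Dict[str, str]:
    """Return a per-LUN verdict for the given phase ('install' or 'production')."""
    if phase not in ("install", "production"):
        raise ValueError(f"unknown phase: {phase}")
    verdicts = {}
    for lun, lun_paths in paths.items():
        count = len(set(lun_paths))  # ignore duplicate entries
        if phase == "install":
            ok = count == 1          # exactly one path while Windows setup runs
        else:
            ok = count >= 2          # at least two paths once MPIO is enabled
        verdicts[lun] = "ok" if ok else f"unexpected path count: {count}"
    return verdicts


if __name__ == "__main__":
    inventory = {
        "boot_a": [("hba0", "target0")],
        "boot_b": [("hba0", "target0"), ("hba1", "target1")],
    }
    print(check_paths(inventory, "install"))
    print(check_paths(inventory, "production"))
```

Running the same check with the "production" phase after installation flags any boot LUN that was left on a single path.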
Also make sure that the servers are available at all times. To eliminate single points of failure and build in redundancy, enable MPIO. There should also be enough headroom to manage workloads, so carry out sizing exercises to meet the specific requirements of your environment.
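A sizing exercise can start with back-of-the-envelope arithmetic like the sketch below. It assumes you have measured a sustainable IOPS figure for the storage controller and a peak IOPS figure per SAN-booted server; both inputs, the function name and the default 30% headroom are illustrative assumptions, not NetApp guidance.

```python
import math


def max_servers(controller_iops: float,
                per_server_peak_iops: float,
                headroom: float = 0.3) -> int:
    """Servers that fit while keeping `headroom` (a fraction) of capacity in reserve."""
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    if per_server_peak_iops <= 0:
        raise ValueError("per_server_peak_iops must be positive")
    usable = controller_iops * (1 - headroom)  # capacity left after the reserve
    return math.floor(usable / per_server_peak_iops)


if __name__ == "__main__":
    # Illustrative numbers only; substitute measurements from your environment.
    print(max_servers(controller_iops=20000, per_server_peak_iops=150))
```

A real exercise would also account for boot storms, where many servers start at once, and for the application workload that shares the array.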
Mistake 2: Master image considerations
When preparing the NetApp SAN array with the master image, the first thing to ensure is simple and flexible management of your environment. For instance, you should be able to rapidly repurpose the master image, with a proper layer of bare-metal provisioning in place. You should be able to repurpose the image for physical as well as virtual environments. Data protection should also be given priority if the boot LUN needs to be backed up.
The next dimension of SAN boot is to keep performance in mind while achieving storage efficiency. It all boils down to the storage layout of the gold/master image. Plan your layout by leveraging the FlexClone capabilities of NetApp to achieve a satisfactory level of storage efficiency.
For example, in a production environment running applications such as Microsoft Exchange or SharePoint, isolate the applications based on their required storage levels. This takes care of the first level of data protection and availability requirements. Storage efficiency comes from isolating the applications into separate aggregates, upgrade groups or volumes as required, from using de-duplication, and from applying repurposing techniques such as FlexClone.
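To see why cloning the master image helps, compare full copies of the boot LUN with clones that share the master image's blocks and store only each host's changes. This is a rough sketch of the arithmetic; the image size, host count and per-host change size are assumed inputs, and the function names are illustrative.

```python
def full_copy_gb(image_gb: float, hosts: int) -> float:
    """Space needed if every host gets its own full copy of the master image."""
    return image_gb * hosts


def cloned_gb(image_gb: float, hosts: int, delta_gb_per_host: float) -> float:
    """Space needed with one master image plus per-host changed blocks."""
    return image_gb + hosts * delta_gb_per_host


def savings_ratio(image_gb: float, hosts: int, delta_gb_per_host: float) -> float:
    """Fraction of space saved by cloning instead of copying."""
    return 1 - cloned_gb(image_gb, hosts, delta_gb_per_host) / full_copy_gb(image_gb, hosts)


if __name__ == "__main__":
    # Illustrative numbers only: a 40 GB image, 50 hosts, about 2 GB of changes per host.
    print(full_copy_gb(40, 50), cloned_gb(40, 50, 2), savings_ratio(40, 50, 2))
```

The savings shrink as each host diverges from the master image, which is one reason to keep application data off the boot LUN.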
Mistake 3: Server-level issues
One of the key challenges faced during SAN boot is the availability of drivers. From a physical server perspective, you have to make sure that the HBA drivers are available and installed. Ensure that all the necessary drivers, primarily Fibre Channel (FC) drivers, are downloaded and kept at a known location so they can be installed easily during the move to SAN boot.
Since SAN boot is enabled in the BIOS of the physical server and the boot order points to the preferred LUN, you need to make sure the server can see the NetApp LUN provisioned for SAN boot. The challenge is that there are several scenarios in which the server keeps waiting for the LUN to become available. The solution we successfully adopted in our lab is to have the server boot from the first available LUN.
Mistake 4: Zoning consideration
Zoning is a critical issue with SAN boot over Fibre Channel. Fibre Channel zoning is a complicated process and is prone to errors during deployment. In our labs too, we have experienced several zoning errors. Monitor zoning constantly to confirm that the right configurations are in place and that no error is preventing your servers from accessing the NetApp storage array. Constant re-evaluation is the only way to ensure no zoning glitches occur during SAN boot.
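Part of that re-evaluation can be automated by comparing the zone configuration against the list of host ports that are supposed to reach the array. The sketch below assumes the zone set has already been exported into a plain mapping of zone name to member ports; the port labels are placeholders rather than real WWPNs, and the function is illustrative, not a switch vendor API.

```python
from typing import Dict, List, Set


def unreachable_initiators(zones: Dict[str, Set[str]],
                           initiators: List[str],
                           targets: Set[str]) -> List[str]:
    """Return initiators that share no zone with any storage target port."""
    missing = []
    for port in initiators:
        zoned = any(
            port in members and bool(members & targets)
            for members in zones.values()
        )
        if not zoned:
            missing.append(port)
    return missing


if __name__ == "__main__":
    # Placeholder labels standing in for WWPNs.
    zones = {
        "zone_host1": {"host1_hba0", "array_port_a"},
        "zone_host2": {"host2_hba0"},  # target port forgotten in this zone
    }
    print(unreachable_initiators(zones, ["host1_hba0", "host2_hba0"], {"array_port_a"}))
```

Running such a check after every zoning change catches a host that has silently lost its route to the boot LUN before the next reboot does.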
About the author: Amarnath Rampratap heads the Microsoft solutions engineering team at NetApp India’s Microsoft Business Unit, focusing on building solutions and reference architectures around Microsoft enterprise applications. He has been with NetApp for almost six years and specializes in core networking to data center solutions.
(As told to Mitchelle R Jansen.)
This was first published in November 2011