Move VMs between Hyper-V hosts with no downtime
Live migration is probably the most important technology that Microsoft has added to Hyper-V in Windows Server 2008 R2. It enables virtual machines (VMs) to be moved between Hyper-V hosts with no downtime. Using live migration, you can migrate all VMs off a Hyper-V host that needs maintenance, then migrate them back when the maintenance is done. Live migration also lets you respond to periods of high resource utilization by moving VMs to hosts with greater capacity, so that VMs can continue to deliver high performance to end users even during busy periods.
Live migrations can be manually initiated, or if you have System Center Virtual Machine Manager 2008 R2 and System Center Operations Manager 2007, you can run automated live migrations in response to workload. You need to complete quite a few steps to set up two systems for live migration, and I’ll guide you through the process. First, I’ll explain how live migration works. Then I’ll cover some of the hardware and software prerequisites that must be in place. Finally, I’ll walk you through the important points of the Hyper-V and Failover Clustering configuration that must be performed to enable live migration.
How Live Migration Works
Live migration takes place between two Hyper-V hosts. Essentially, the VM memory is copied between Hyper-V hosts. After the memory is copied, the VM on the new host can access its virtual hard disk (VHD) files and continue to run. Both hosts access shared storage where the VM’s VHD files are stored. When you initiate a live migration, which Figure 1 shows, the following steps occur:
1. A new VM configuration file is created on the target server.
2. The source VM’s initial memory state is copied to the target.
3. Changed memory pages on the source VM are tagged and copied to the target.
4. This process continues until the number of changed pages is small.
5. The VM is paused on the source node.
6. The final memory state is copied from the source VM to the target.
7. The VM is resumed on the target.
8. An Address Resolution Protocol (ARP) message is issued so that network switches and clients update their tables with the VM's new location and direct traffic to the target host.
Requirements for Live Migration
On the hardware side, you need two x64 systems with compatible processors. It’s best if the host processors are identical, though it’s not required. However, they do need to be from the same processor manufacturer and family—you can’t perform a live migration when one host has an AMD processor and the other host has an Intel processor. Learn more about Hyper-V processor compatibility in the Microsoft white paper “Virtual Machine Processor Compatibility Mode.”
In addition, each server should be equipped with at least three NICs running at 1Gbps: one for external network connections, one for iSCSI storage connectivity, and one for node management. Ideally, you'd have another NIC dedicated to live migration, but live migration can also occur over the external network connection; it will just be a little slower. It's important to note that if you're implementing a server consolidation environment, you'll want additional NICs to carry the VMs' network traffic.
On the software side, all the nodes that take part in live migration must run Windows Server 2008 R2 x64; this can be the Standard, Enterprise, or Datacenter edition. Live migration is also supported by the standalone Microsoft Hyper-V Server 2008 R2 product. In addition, the Hyper-V role and the Failover Clustering feature must be installed on all servers participating in live migration.
You also need shared storage, which can be either an iSCSI SAN or a Fibre Channel SAN. In this example, I used an iSCSI SAN. Be aware that the iSCSI SAN must support SCSI-3 persistent reservations, which live migration requires. Some open-source iSCSI targets, such as OpenFiler, don't have that support at this time. If you want to try this in a local test environment and don't want to buy an expensive SAN, you might want to check out the free StarWind Server product at www.starwind.com.
Failover Cluster Networking Configuration
Failover clustering is a requirement for live migration. You can live-migrate VMs only between the nodes in the failover cluster. The first step in creating a failover cluster is to configure the networking and storage. You can see an overview of the network configuration used to connect the Windows servers in the cluster to the external network and to the shared storage in Figure 2.
In Figure 2, the servers use the 192.168.100.x subnet for client connections. The iSCSI SAN runs on an entirely separate physical network, which was configured with 192.168.0.x IP addresses. You can use different values for either of these address ranges; I selected these values to more clearly differentiate the two networks. Ideally, you'd also have additional NICs for management and an optional dedicated live migration connection, but these aren't strictly required. Live migration can work with a minimum of two NICs in each server.
I used a LeftHand Networks iSCSI SAN for Hyper-V live migration as well as for a test SQL Server implementation. On the iSCSI SAN I created four LUNs: a 500MB LUN for the cluster quorum, a 1024GB LUN to hold 10 VMs, and two LUNs for the test SQL Server implementation, consisting of a 200MB LUN for the Distributed Transaction Coordinator and a 500GB LUN for SQL Server data files.
After creating the LUNs, I configured the iSCSI Initiator on both the Windows Server nodes. To add the iSCSI targets, I selected the Administrative Tools, iSCSI Initiator option, then on the Discovery tab I chose the Discover Portal option. This displayed the Discover Portal dialog box where I entered the IP address and iSCSI port of the SAN. In my case, this was 192.168.0.1 and 3260, respectively.
Next, in the Connect to Target dialog box, I supplied the target name of the iSCSI SAN. This name comes from the properties of the SAN and varies depending on the SAN vendor, the domain name, and the names of the LUNs you've created. I checked the Add this connection to the list of Favorite Targets option. After I completed the iSCSI configuration, the iSCSI Initiator's Targets tab was populated with the LUNs.
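If you prefer to script the initiator configuration, the same discovery and logon steps can be performed with the built-in iscsicli utility. This is only a sketch: the portal address comes from my test environment, and the target IQN shown is a placeholder for whatever name your SAN exposes.

```shell
REM Point the Microsoft iSCSI Initiator at the SAN's discovery portal
iscsicli QAddTargetPortal 192.168.0.1

REM List the iSCSI targets that discovery found
iscsicli ListTargets

REM Log on to a discovered target (the IQN below is a placeholder;
REM use a target name from the ListTargets output)
iscsicli QLoginTarget iqn.2003-10.com.lefthandnetworks:mgmt-group:25:quorum
```

Remember to repeat the discovery and logon on both nodes, just as with the GUI steps.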
Finally, I used Disk Management to assign drive letters to the LUNs: Q for the quorum, R for DTC, S for SQL Server, and V for the VMs. You need to make the assignments on one node, then bring the disks offline and make identical assignments on the second node. Figure 3 shows the completed Disk Management drive assignments for one of the nodes.
Adding the Hyper-V Role and Failover Clustering Feature
The next step is to add the Hyper-V role, then the Failover Clustering feature. You add both by using Server Manager. To add the Hyper-V role, select Administrative Tools, Server Manager, then click the Add Roles link. In the Select Server Roles dialog box, select Hyper-V, then click Next. You're then prompted to create virtual networks, which essentially creates a bridge between the Hyper-V VMs and your external network.
Select the NICs that you want to use for your VM traffic, being careful not to choose the NICs used for the iSCSI SAN connection. Click Next to complete the Add Roles Wizard; the system then reboots. You need to perform this process on all the nodes in the cluster.
Next, add the Failover Clustering feature by selecting Administrative Tools, Server Manager, Add Features. This starts the Add Features Wizard. Scroll through the list of features and select Failover Clustering. Click Next to complete the wizard. This process must also be completed on all nodes.
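If you're setting up several nodes, you can add the role and the feature from an elevated PowerShell prompt instead of clicking through the wizards. A sketch using the ServerManager module that ships with Server 2008 R2 (run it on each node; the Hyper-V role still requires a reboot):

```shell
# Load the ServerManager cmdlets (Windows Server 2008 R2)
Import-Module ServerManager

# Add the Hyper-V role and the Failover Clustering feature in one pass;
# -Restart reboots the server automatically if the change requires it
Add-WindowsFeature Hyper-V, Failover-Clustering -Restart
```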
Configuring Failover Clustering
Next, create a Failover Cluster. You can do this on any of the cluster nodes. Select the Administrative Tools, Failover Cluster Manager option to start the Failover Cluster Management console. Then select the Validate a Configuration link to start the wizard, which displays the Select Servers or Cluster dialog box.
Enter the fully qualified names of all the nodes that will belong to the cluster, then click Next. Click Next through the subsequent wizard screens to run the cluster validation tests, which check the OS level, network configuration, and storage of all cluster nodes. A summary of the test results is displayed. If the validation tests succeed, you can continue and create the cluster. If there are errors or warnings, you can display them in the report, correct them, and rerun the validation tests.
After the validation tests have run, you create the cluster using the Create a Cluster link from the Failover Cluster Management console. Like the validate option, the Create a Cluster option first starts by displaying a Select Servers dialog box where you enter the names of all cluster nodes. Clicking Next displays the Access Point for Administering the Cluster dialog box, which you can see in Figure 4.
You use the Access Point for Administering the Cluster dialog box to assign the cluster a name and an IP address, both of which must be unique on the network. In Figure 4, you can see that I named the cluster WS08R2-CL01 and gave it the IP address 192.168.100.200. With Windows Server 2008 R2 you can choose to have the IP address assigned by DHCP, but I prefer manually assigned IP addresses for my server systems so that each server always has the same address, which is handy when troubleshooting problems.
Clicking Next displays the Confirmation screen where you review your cluster creation selections. You can page back and make changes. Click Next again to create the cluster. A summary screen then displays the configuration of the new cluster. This action configures the cluster on all of the selected clustered nodes.
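The validation and creation steps can also be scripted with the FailoverClusters PowerShell module that's new in Server 2008 R2. A sketch, assuming the node names WS08R2-S1 and WS08R2-S2 from this example (substitute your own nodes, cluster name, and IP address):

```shell
# Load the failover clustering cmdlets
Import-Module FailoverClusters

# Run the same tests as the Validate a Configuration wizard
Test-Cluster -Node WS08R2-S1, WS08R2-S2

# Create the cluster with a unique name and static IP address
New-Cluster -Name WS08R2-CL01 -Node WS08R2-S1, WS08R2-S2 -StaticAddress 192.168.100.200
```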
The Create Cluster wizard automatically selects the storage for your quorum, but it doesn't always choose the quorum drive that you want. You can check and change the quorum configuration by right-clicking the name of the cluster in the Failover Cluster Management console, then selecting More Actions, Configure Cluster Quorum Settings from the context menu. This displays the Select Quorum Configuration dialog box, which recommends the best quorum type based mainly on the number of nodes in the cluster. In my two-node cluster, it selected the Node and Disk Majority quorum type.
Next, the Configure Storage Witness dialog box is displayed. Here I changed the original value to the Q drive that I wanted to use as the quorum by selecting a check box. Clicking Next saves the cluster quorum changes. If you would like to know more about configuring Windows Server 2008 R2 failover clustering, the Windows IT Pro website has articles and FAQs that can help; you can start with “4 Failover Clustering Hassles and How to Avoid Them,” InstantDoc ID 103534.
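You can make the same quorum change from PowerShell with Set-ClusterQuorum. A sketch; the disk resource name used here ("Cluster Disk 1") is an assumption, so first check which physical disk resource backs your Q drive:

```shell
Import-Module FailoverClusters

# List the cluster resources; note the name of the physical disk
# resource that backs the Q drive
Get-ClusterResource

# Switch to Node and Disk Majority using that disk as the witness
# ("Cluster Disk 1" is a placeholder for your quorum disk's resource name)
Set-ClusterQuorum -NodeAndDiskMajority "Cluster Disk 1"
```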
Enabling Cluster Shared Volumes
The next step in cluster configuration is to enable Cluster Shared Volumes. The Cluster Shared Volumes feature lets multiple cluster nodes simultaneously access the shared storage locations, but it’s not enabled by default. To do so, use the Failover Cluster Management console and right-click the name of the cluster at the top of the navigation pane, then select Enable Cluster Shared Volumes from the context menu. This displays the summary pane for Cluster Shared Volumes, which initially is blank.
To select a shared storage location to be used by Cluster Shared Volumes, click the Add Storage option in the Action pane. This displays the Add Storage dialog box. The storage for Cluster Shared Volumes has to be visible to the cluster and it can’t be used for other purposes. Select the box next to the storage location you want to use. I selected the V drive, which is actually a LUN on the LeftHand Networks SAN. Click OK to enable Cluster Shared Volumes for that drive. This also results in the creation of a mount point on all the cluster nodes. By default, the mount point is labeled C:\ClusterStorage\Volume1.
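Once Cluster Shared Volumes has been enabled on the cluster, the FailoverClusters module can add a disk to it for you. A sketch; again, the disk resource name is a placeholder for whatever resource backs your V drive:

```shell
Import-Module FailoverClusters

# Add an available cluster disk to Cluster Shared Volumes
# ("Cluster Disk 2" is a placeholder for the resource backing the V drive)
Add-ClusterSharedVolume -Name "Cluster Disk 2"

# Confirm the volume and its C:\ClusterStorage mount point
Get-ClusterSharedVolume
```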
Creating VMs on Cluster Shared Volumes
At this point, failover clustering is configured on all the nodes in the cluster and the Cluster Shared Volumes feature has been enabled, allowing all of the nodes to simultaneously access the storage. The next step is to create VMs that can take advantage of this infrastructure. You can create Hyper-V VMs using either Hyper-V Manager or System Center Virtual Machine Manager. To create a new VM using Hyper-V Manager, select Administrative Tools, Hyper-V Manager from the Start menu, then select New from the Action pane to start the New Virtual Machine wizard. Figure 5 shows the resulting Specify Name and Location dialog box.
In Figure 5, you can see that the VM will be named vWS08-SQL01. Also note that the value for the VM location has been set to the Cluster Shared Volumes mount point: C:\ClusterStorage\Volume1. This causes the VM configuration files to be created on the shared storage.
Click Next to assign RAM to the VM. Click Next again to select the network connection for the VM. Assigning a network for the VM is optional. However, if you do select an external network, be sure that the external network connection is named the same on all of your Hyper-V nodes. In my case, I used the external network name of External Virtual Network on all of my Hyper-V cluster nodes.
Click Next to display the Connect Virtual Hard Disk dialog box. Here, again, it's important to create the VHD files on the Cluster Shared Volumes storage. Initially, the dialog box displays the Hyper-V Manager default values for name and location. I used vWS08-SQL01.vhd for the VHD file name and changed the location to C:\ClusterStorage\Volume1. Click Next to specify the guest OS installation options. All guest OSs, including Linux, can take advantage of live migration. The rest of the process is exactly like creating any other VM.
When you complete the New Virtual Machine wizard, the VM will be created on the Cluster Shared Volumes storage. The next step is to start the VM and install the guest OS and the application that you want to run on the VM.
Enabling VMs for Live Migration
Open the Failover Cluster Management console, then navigate to the Services and Applications node under the cluster name and right-click to display the context menu. Select Configure Service or Application to start the High Availability wizard. On the Select Service or Application dialog box, select Virtual Machine from the list of services displayed, then click Next. This displays the Select Virtual Machine dialog box.
Scroll through the list of VMs until you find the one you want to enable for live migration. I selected the vWS08-SQL01 VM created earlier. The VM can't be running while you perform this operation; it must be in the Off or Saved state.
Select the check box in front of the VM name, then click Next until you complete the wizard. A confirmation screen is displayed and the summary dialog box reports the status. If you see “Success” in the description, then the VM has been successfully enabled for live migration. If not, you need to review the VM properties and make sure all of the VM assets can be accessed on all of the nodes in the cluster.
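The High Availability wizard has a one-line PowerShell equivalent. A sketch, assuming the VM name from this example and that the VM is in the Off or Saved state:

```shell
Import-Module FailoverClusters

# Make the VM a clustered resource group so it can be live-migrated
Add-ClusterVirtualMachineRole -VirtualMachine "vWS08-SQL01"
```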
Ready, Set, Migrate!
That’s all there is to configuring the Hyper-V live migration environment. At this point, you can initiate a live migration using the Failover Cluster Manager. To start a live migration, expand the Services and Applications node, then select the VM node displayed beneath it. This displays the summary pane, which shows the VMs that have been enabled for clustering, along with their current status, which Figure 6 shows.
In Figure 6, you can see that VM vWS08-SQL01 is currently running and that the current owner is node WS08R2-S1. To initiate a live migration, go to the Action pane and select the Live migrate virtual machine option shown in the upper third portion of the Action pane. A menu flyout prompts you for the name of the target node. In this example, the menu flyout shows 1 – Live migrate to node WS08R2-S2. Clicking this option starts the live migration. The summary window is updated with the status of the running live migration.
The running status is displayed until the live migration finishes. The length of time it takes to complete depends on the size and activity of the VM, as well as the speed and activity of the network connection between the Hyper-V host systems. My live migrations typically take between 10 seconds and a minute. When the live migration has completed, the summary pane is redisplayed and the Current Owner value is updated with the name of the target node.
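You can also start the migration from PowerShell: Move-ClusterVirtualMachineRole performs a live (not quick) migration of a clustered VM. A sketch using the names from this example; the clustered group name typically matches the VM name, but check Get-ClusterGroup if yours differs:

```shell
Import-Module FailoverClusters

# Live-migrate the clustered VM group to the other node
Move-ClusterVirtualMachineRole -Name "vWS08-SQL01" -Node WS08R2-S2

# Verify that the Current Owner is now the target node
Get-ClusterGroup "vWS08-SQL01"
```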
The Virtual Promised Land
Live migration addresses the issues of planned host downtime and lays the foundation for the dynamic datacenter. Although there are quite a few steps in the process, if you carefully navigate the critical points in the process, you will reach the promised land of Hyper-V live migration.