Microsoft's virtualization solution comes into its own
In this article, we'll look at virtualizing Microsoft SQL Server in a Windows Server environment. Although Microsoft has been offering virtualization products for the past several years, it's only been in the last generation of the Windows Server OS, with Windows Server 2008, that Microsoft's virtualization solution has come into its own with Hyper-V. First we'll review the benefits of virtual servers, then we'll examine the unique challenges SQL Server presents when you attempt to run it on a virtual machine (VM). We'll explore how Windows Server 2008 Hyper-V is well suited to host SQL Server instances and the intelligent way upcoming SQL Server 2008 R2 takes advantage of a virtual environment.
Benefits of Virtualization in Production
Before we begin discussing the benefits of virtualization, it might be best to start at the beginning. First there was a big bang...just kidding, not that far back. But we should at least explain what a VM is for those of you just joining the virtualization movement. Don't fret, we'll go deeper later in the article for you hard-core veterans.
In the simplest terms, virtualization is the practice of emulating a fully functioning server (known as a guest OS) via an application executing on the host OS of a physical server. The VM (VM) running the guest OS is delivered courtesy of a VM software application and can either be isolated, such as in a test or development environment, or be made available to the rest of the network as an independent server. Either way, fewer physical machines will be in the server room than the number of logical servers available to clients (see Figure 1).
Because a VM's guest OS can have applications installed and can provide network services to the rest of the network, a major benefit of virtualization in production is that a single physical server can serve multiple purposes on the network. Maintaining fewer robust physical servers, each providing multiple services, is more economical than supporting several physical servers, each dedicated to a single purpose. By consolidating services and applications, you can reduce ownership costs and power consumption while supporting a mix of physical and virtual network services within your environment. Centralized administration with the help of VM management utilities, such as Microsoft System Center Virtual Machine Manager (SCVMM), can reduce administration overhead while portable VMs lower the cost and downtime of disaster recovery, hardware migrations, and upgrades.
Optimal Hardware Utilization
With virtualization, multiple independent OSs can be running simultaneously, each with its own access to the physical hardware. Imagine a VM guest OS and host OS each taxing the CPU up to 25 percent concurrently, utilizing 50 percent of the CPU for a better rate of return on your chip investment. You can achieve higher yields on hardware investments with VMs, which can result in a lower total cost of ownership (TCO) for the network.
RAM is also heavily utilized in a virtualization solution when multiple OSs need memory space simultaneously. Server 2008 R2 Hyper-V employs an extra layer of memory address translation, which we'll discuss later in this article, to get the most out of the RAM chips. The I/O sub-system will also be well taxed by VMs that each require hard drive space to load their guest OS and applications. Windows Server 2008 R2 Hyper-V offers a choice of virtual hard disk sizing to get the most use of local hard drives without endangering disk space.
You can leverage VMs to enhance availability of mission-critical applications and services. Each VM exists within a virtual disk file (which contains all the aspects of a physical disk, including sectors, file systems, files, boot records, and so on) so the entire guest OS can be ported from one host server to another. This portability makes it easy to recreate the server in both high-availability and disaster-recovery scenarios. We'll talk more about these concepts later in this article.
Because virtualization provides OS isolation across common hardware resources, consolidating applications and services onto a specific server is possible. It has long been a Microsoft best practice to install resource-intensive server applications, such as SQL Server and Exchange, onto dedicated equipment. This best practice came into being because two beefy applications competing for the same resources (usually CPU and RAM) under the same OS roof would eventually cause a performance bottleneck.
But unlike loading both SQL Server and Exchange into the same OS, virtualization allows each application to be installed onto its own virtual server so that each application is unaware of the other application being on the same hardware. Obviously, concerns about making sure the host server has enough hardware resources to go around are valid, but the reduction in physical asset portfolio and cost—along with the opportunity to standardize best practices for all server applications—makes consolidation well worth the effort.
Server 2008 Hyper-V's native support for Microsoft applications eliminates additional hypervisor costs when setting up SQL Server in a virtual environment. Hyper-V is included with several editions of the OS. You can preconfigure each VM to a limited amount of RAM and hard disk drive space to prevent one particularly hungry guest OS from usurping more than its fair share of the hardware. You also can set standards for applications, and administrators can plan and govern SQL Server instance configuration settings appropriately to operate within those boundaries.
Consolidated Management and VM Portability
If virtualization creates more logical VMs than physical servers, doesn't that mean you have more to manage and not less? The IT requirements to manage multiple virtual servers consolidated under a physical system are less than the IT requirements to manage multiple physical servers. You have fewer hardware, space, and power considerations. Furthermore, VMs are portable and can be captured in various versions without the extraneous labor of an imaging solution often required to deploy or create a snapshot of a physical server's host OS and applications.
The Hyper-V console, which ships with Server 2008 R2, connects to individual VMs using RDP over TCP port 2179. But running multiple console windows into separate VMs is tedious when performing the same administration across multiple guest OS's. A better idea is SCVMM 2008 R2 (see Figure 2), which consolidates all VMs into a single management utility and console. A single monitoring or configuration operation can be executed only once yet targeted to multiple VMs simultaneously. You can tailor the SCVMM UI to a user's preferences, facilitating smarter and faster server management. And here's the best news: SCVMM 2008 R2 connects to physical, virtual, and heterogeneous OSs to manage your entire network from one place.
Additionally, you can use System Center Operations Manager to monitor individual services across Microsoft, Linux, and UNIX platforms on both physical servers and VMs in a customizable single interface. And slated for release in early 2010, the new System Center Service Manager 2010 will help enterprises enforce best practices, change control, and lifecycle management across all platforms in the network. You can employ these and more System Center solutions to centralize administration of large enterprises and reduce support costs.
Sometimes it might be necessary to move or copy a VM from one host OS to another; for example, during disaster recovery, new hardware migration, or high-availability initialization. And other times it might be beneficial to be able to create a point-in-time copy of a server to revert back to in case of detrimental software loads or configuration changes. Microsoft Hyper-V accommodates such endeavors with:
- Clones: Duplicate VMs generated via SCVMM (requires that the reference VM first have security identifiers removed by running the SYSPREP.exe utility)
- Snapshots: Point-in-time copies of a VM (disk, VM configuration, or both)
- Failover cluster support: Hyper-V is cluster aware and can run VMs on an active node of a Microsoft cluster
- Live Migration support: Uninterrupted rollover of a VM from one node of a Microsoft failover cluster to another node transparent to the client
For more information about combining failover clustering and Live Migration in Hyper-V, read the Microsoft TechNet article at technet.microsoft.com/en-us/library/cc732181(WS.10).aspx.
With all of these advantages, it's no surprise that virtualization of mission-critical resource intensive network applications, such as SQL Server, has become so popular. And although some of these advantages beg further discussion, this article isn't a tutorial on the implementation of a VM but rather a look at optimizing SQL Server on a VM. So let's look at what makes SQL Server a good, or bad, candidate for virtualization.
Unique Challenges to Virtualizing SQL Server
Like many of Microsoft's enterprise server applications, SQL Server demands deep hardware resources. But what makes SQL Server challenging is both its architecture and its importance. Far more applications today are using SQL Server as their data repository, making SQL Server one of the most mission critical applications on a network and often the subject of aggressive SLA objectives. Multiple SQL Server instances and SQL Server business intelligence (BI) solutions can influence VM design. And the critical nature of SQL Server databases often require high availability solutions such as database mirroring and failover clustering to assure constant data availability to users.
Multiple instances: SQL Server has long had the capability to install multiple instances of the software onto a single OS (see Figure 3). Each instance employs its own SQL Server service, which means that processor overhead increases with each additional instance installed. Some registry information and even a service or two, such as the Distributed Transaction Coordinator service, are shared among all instances of SQL Server, but the two most industrious services of the relational database engine, MSSQLSERVER and SQLSERVERAGENT, are unique per instance. Multiple instances give administrators the opportunity to segregate sysadmin authority and programmers the chance to isolate their application databases apart from other data.
A downside of multiple instances is that they are subject to a single point of failure on a host OS. Corruption or a problem in the host OS could affect all instances running on the OS. An outage in host OS could cause outage in all the SQL Server instances.
Should you immediately embark on a project to consolidate all of your SQL Server databases into a single instance? No. Multiple instances maintain an administrative boundary and in large enterprises with segregated IT departments, there may be different teams of SQL Server system administrators (sysadmin role members) who should have control over certain databases but not others. Separation of duties is a viable reason to continue supporting multiple SQL instances. Another justification is the need for differing SQL instance level objects and settings, such as throttle settings and native endpoint objects. But these arguments for multiple instances could also be satisfied by installing each instance on a separate VM (see Table 1).
Guest clustering vs. host clustering: Another long-time challenge to supporting mission-critical SQL Server databases has been maintaining SLA commitments regarding data availability. Many software manufacturers, Microsoft included, rely on SQL Server databases as a data repository in their applications thanks, in part, to the product's value on the dollar (Express Edition is free). Furthermore, business information workers often rely on the family of Business Intelligence (BI) products that ship with the Standard, Enterprise, and Datacenter editions.
Keeping the data in a SQL Server database up and online all the time requires sophisticated high-availability solutions. Within the SQL Server and Windows OS products, Microsoft offers several options for data redundancy including:
- Database snapshot: Read-only point-in-time virtual copy of a database
- Log shipping: Latent transaction log copy process between two copies of a database
- Database mirroring: Synchronous or asynchronous transaction commitment to two copies of a database
- Replication: Snapshot, transactional, merge, or peer-to-peer data replication between two databases (granular to row or column level depending on article partitioning)
- Failover clustering: Failover clustering via Server 2008 R2 Failover Cluster to protect against OS or SQL Server service failures
Of these tools, clustering has become the front running solution for providing immediate failover of a corrupt or incapacitated SQL instance (and thus all databases in it). Traditional host clustering, which is the art of implementing Microsoft cluster services on a Windows Server OS, is a popular topology. For more information about known issues with SQL Server 2005 installed on a Server 2008 failover cluster, see the Microsoft article "List of known issues when you install SQL Server 2005 on Windows Server 2008" at support.microsoft.com/kb/936302.
With the introduction of Hyper-V clustering technologies, the schematic of clustering has changed. The legacy idea of clustering multiple physical servers is now known as host clustering. Host clustering grants high availability to the host OS supporting a VM that has SQL Server installed, as well as the VM itself housing the SQL Server services.
If something goes awry with the primary physical node (whether due to the host OS or VM), the services are brought online on the secondary node. Unfortunately, client connections will be interrupted during the migration and automatic reconnection. Shared SAN storage such as iSCSI, Fibre Channel, or Serial Attached SCSI (SAS) are ideal for host clustering topologies and can provide service-level failover, if each VM is configured to use a virtual disk that maps to its own logical storage unit on the SAN.
Alternatively, clustering VMs inside the virtualization layer is called guest clustering. In this design, the VMs themselves are clustered together as if they were independent physical servers. However, the state of both the guest OS and the applications running in it will be monitored and stored on the shared SAN storage of the guest cluster. Guest clustering allows failover of a single VM or single application within a VM in the event of VM failure through no fault of the host OS.
Key differences between host clusters and guest clusters are outlined in Table 2. In Figure 4, host clustering between two physical nodes is providing fault tolerance of all three VMs, whereas guest clustering between VM1B-act and VM2B-pas is providing fault tolerance of the VM1B-act guest OS within the same physical server. Should the host clustering fail over, VMs VM2B-act and VM2B-pas would take on the guest clustering design.
Although guest failover clustering for SQL Server VMs is supported by Microsoft, you need to be running Server 2008. See the Microsoft article "Support policy for Microsoft SQL Server products that are running in a hardware virtualization environment" at support.microsoft.com/kb/956893.
Database mirroring: SQL Server 2008 offers a data duplication strategy called database mirroring. In essence, a database is backed up and restored to a separate SQL Server instance from the original. The two instances are connected through an instance-level TCP Endpoint restricted to forwarding traffic for the purpose of database mirror activity only. As transactions are committed to the original database, they are also committed to the mirror through the endpoint (synchronously or asynchronously, depending on the mirror topology choice). The mirror database is offline to users until invoked during a failover. Database mirroring is a fault-tolerance solution, not a load-balancing solution.
Mirroring can be supported between two VMs each running an instance of SQL Server. Whether the two VM's should reside on the same physical server or not depends on the level of protection being sought. If you are employing database mirroring strictly to protect from database and/or instance failure, the same physical server and host OS should be sufficient. If you also want to protect the database from physical device or storage failure, then separate physical servers would be necessary. When deciding, keep in mind that TCP/IP traffic between the SQL Server instances involved in the mirror design (and optional witness SQL Server instance) can be significant on a highly volatile database. Traffic between VMs on the same physical server never truly sees the light of day on the Ethernet network thanks to simulated network devices and Hyper-V translated drivers. Traffic between VMs on opposite physical servers will affect Ethernet network performance.
BI and virtualization: In recent versions, Microsoft has done an admirable job of adding functionality and purpose to the SQL Server product line. So much so, in fact, that many companies have begun to place a great deal of critical business data in the SQL Server Database Engine server application. But it doesn't stop there. The SQL Server family of products also boasts an impressive line-up of Business Intelligence (BI) applications that can integrate heterogeneous data platforms, automate routine data management tasks, construct data warehouses, and report on a myriad of data sources. These products include:
- SQL Server Integration Services (SSIS): Data extraction, transformation, and loading (ETL) solution
- SQL Server Analysis Services (SSAS): Data warehouse and mining solution
- SQL Server Reporting Services (SSRS): Data reporting solution
Each of these BI subsystems is supplied with SQL Server Standard, Enterprise, and Data center editions, and Microsoft supports all of them to run in a virtualized environment.
Running Hyper-V with SQL Server: If your past experience in SQL Server virtualization is with Microsoft Virtual Server or Microsoft Virtual PC products, then you're in store for a treat. Microsoft Hyper-V is a step above these predecessor applications and has a completely different architecture primed for enterprise server applications. Better yet, it ships at no additional cost for Server 2008 R2 and is freely included in Server 2008 Hyper-V. Its native hypervisor eliminates the need for costly additional virtualization products when supporting Microsoft server applications and its Integration Services can monitor both the host and guest OSs for health, time synchronization, registry key sharing, and graceful shutdowns.
Processor loads: Hyper-V on Server 2008 R2 supports up to 4 CPUs per VM and up to 384 VMs per host. Hyper-V multi-processor support for SQL Server 2008 lets SQL Server take advantage of multiple CPUs in the VM. Performance gains tend to diminish slightly when increasing from two to three processors or from three to four. Of course, each instance of SQL Server is unique and only testing and monitoring will reveal for certain the point of diminishing return when employing multiple processors. Variables such as the product involved (relational database engine versus BI), the number of concurrent sessions, and the nature of frequent actions can all influence CPU utilization. But Hyper-V's ability to simulate multiple processors in a VM is crucial to optimizing SQL Server performance.
Perhaps the biggest advantage Hyper-V has in supplying processor power to its VMs lies in its architecture. Unlike other virtualization applications, Hyper-V doesn't install on top of Server 2008 R2 in the User Mode layer. Rather, it lies beneath the OS as an abstraction layer, or micro kernel, to the hardware. This placement gives Hyper-V full management control over all hardware for both the host OS calls and each VM guest OS. Although each VM is configured with a specific number of virtual processors, it's actually Hyper-V that determines which physical CPU core handles each request and how much time the request gets (including both host and guest OS requests).
Thanks to Hyper-V's configuration settings (see Figure 5), administrators can dictate CPU allocations per VM by configuring virtual processors. This is both a blessing and a curse. Although it's great to have such granular control over who handles each thread, without extensive knowledge and skills an administrator could inadvertently set the allocations too sparingly and underutilize physical processors while incapacitating CPU-hungry server applications running in the VMs. Hyper-V also supports virtual processors that don't map back to a single physical CPU or core to best accommodate server applications in a VM. And CPU Core Parking in Server 2008 R2 Hyper-V consolidates processes onto a minimal number of CPU cores, allowing idle cores to be suspended, or parked, to consume less energy.
For more information about monitoring performance of a Hyper-V VM, read the Microsoft white paper "Running SQL Server 2008 in a Hyper-V Environment - Best Practices and Performance Recommendations" at sqlcat.com/whitepapers/archive/2008/10/03/running-sql-server-2008-in-a-hyper-v-environment-best-practices-and-performance-recommendations.aspx.
Memory addressing: In addition to managing processor cores, Hyper-V also manages physical RAM allocation. At creation time, a VM is configured to use a specific amount of memory space. Upon boot-up of the VM, the hypervisor reserves the configured space for that particular child partition (each VM is a child partition in Hyper-V) in physical RAM. Each guest OS can then use the issued memory space as if it is physical memory and offer virtual addresses to each application installed on the OS. These address spaces are known as:
- System Physical Address (SPA): Physical RAM address space of physical server hosting Hyper-V; uses paging file on physical disk to store overages
- Guest Physical Address (GPA): Allocated RAM address space allotted to a VM child partition
- Guest Virtual Address (GVA): Memory space greater than GPA allocated by guest OS to installed applications; uses paging file on virtual disk to store overages
In this design, Hyper-V must map memory twice—once from guest OS virtual memory addresses to the GPA address allocated to the VM child partition, then again from the GPA address to the SPA. Thus there are two opportunities for data in memory to become paged to the hard disk. Excessive paging becomes a performance bottleneck and can endanger the physical server's ability to support multiple VMs simultaneously. Because Hyper-V has knowledge of all allocated memory space it can monitor physical RAM usage and prevent insufficient memory errors in the host OS.
Server 2008 R2 Hyper-V introduces a new technology called Second Level Address Translation (SLAT), wherein the hypervisor adds a second layer of paging to the architectural paging table of compliant CPU hardware. Because SLAT stores address translation information for both layers of virtual memory, the hypervisor doesn't need to retain information about the SPA-to-GPA mappings for multiple VMs on the server. This reduces overhead of Hyper-V while improving performance by maintaining translations at the hardware layer instead of with software. Memory-intensive applications running in VMs, such as SQL Server, benefit greatly from SLAT.
Live Migration: Being able to move an entire VM from one host to another is paramount to high availability, disaster recovery, and scalability. From updates to unexpected maintenance, occasions arise in which the VM must be relocated in order to continue servicing users. In Server 2008 R2, Cluster Shared Volumes allow multiple clustered VMs to use the same virtual storage (SAN logical storage unit) yet still be candidates for migration individually.
For more information about using Cluster Shared Volumes, see the Microsoft article "Cluster Shared Volumes Support for Hyper-V" at technet.microsoft.com/library/dd630633(WS.10.aspx#BKMK_overview.
Live Migration leverages Server 2008 R2 failover clustering and the hypervisor to move a running VM from one node in the host cluster to another with no interruption to client sessions. The VM's files are stored in the shared storage of the Server 2008 R2 failover cluster (an iSCSI or Fibre Channel SAN), and each VM can simultaneously access those files. When a live migration is initiated, the VM memory from the first node is copied to the backup node. The VM is then started on the backup node, which can immediately access the VM files on the shared storage.
SQL Server 2008 R2
Slated for release in May 2010, SQL Server 2008 R2 will further enhance SQL Server's support for Hyper-V. In this vein, SQL Server 2008 R2 is even friendlier toward clustering and virtualization than the current version of SQL Server.
Clustering support: SQL Server 2008 R2 introduces new cluster support for taking advantage of Server 2008 R2 failover clusters and Hyper-V guest clusters. Some examples of enhancements anticipated in R2 are:
- Clustered installs: Choice A = Integrated (1 node; additional nodes are added separately as needed) versus Choice B = Advanced/Enterprise (all nodes are named during install and SQL Server binaries installed on each)
- Service accounts: R2 will support using service identifiers as opposed to domain user accounts for service accounts, thereby breaking dependency on authentication architecture
- Online node management: Nodes can be added/deleted from a cluster without interrupting services on the active node
Virtualization: A few enhancements in SQL Server 2008 R2 might influence your virtualization design. First, database mirroring will offer better transaction log compression during write-ahead synchronization. Also, a new lock hint in T-SQL allows programmers to disable lock escalation during specific statements. Databases maintained on separate instances across multiple VMs due to lock performance issues might soon be candidates for consolidation.
Business intelligence: BI is perhaps the most enhanced area of SQL Server 2008 R2. Although most of the changes are in client support, such as SharePoint integrated SSRS and self-service BI in Excel 2010, these enhancements will spur the implementation of BI solutions. As BI implementations proliferate in the enterprise, additional scalability will be important to support increased workloads and meet business SLAs.
Installing SQL Server 2008 on Hyper-V
Installing SQL Server 2008 into a Hyper-V VM differs little from installing the application on a physical server's host OS. However, you should examine certain considerations before beginning the installation. Minimum OS requirements must be met by the guest OS chosen for the VM. In addition, the Hyper-V virtual disk type should be chosen carefully to best support SQL Server. Hyper-V offers three possible virtual disk structures:
- Fixed: Hard-coded with a set size, and that space is allocated on the physical disk for the .vhd file at VM creation. Despite the percentage used, the fixed .vhd file will always be the fixed size.
- Dynamic: Initial .vhd size grows as the data stored grows. This structure makes efficient use of the disk initially but is vulnerable to insufficient free space due to other demands on the hard drive.
- Pass-through: VMs write directly to storage. This is the optimal storage strategy for SQL Server because many SAN platforms offer LUN administration that can stretch a single LUN across multiple physical spindles and replicate data for fault tolerance and high availability.
The bottom line is that production SQL Server 2008 instances are prime candidates for running in a VM, and Server 2008 R2 Hyper-V is the best virtualization application Microsoft has produced to date for supporting mission-critical, resource-intensive server applications such as SQL Server. This article introduces the key concepts and will hopefully whet your appetite to learn more about these powerful tools.