The famous fictional Hitchhiker's Guide to the Galaxy tells readers, first and foremost: "Don't panic." I often think of this message when I try to talk about Quality of Service (QoS) with an organization: Nervous glances quickly fill the room and everyone remembers that they either have a doctor's appointment or a parent/teacher meeting and they need to leave the meeting immediately.

This nervousness isn't entirely without cause. Implementing QoS isn't something to be undertaken lightly. The task requires an understanding of the types of traffic on the network, the relative importance of that traffic, the network infrastructure itself, and how to actually implement QoS on the network and in the OSs. The good news is that QoS is more approachable than ever in Windows Server 2012, and organizations have several options for implementing it. But first, why do you even need it?

The Need for QoS

In today's highly connected data centers, the servers that make up the data center communicate with one another for many reasons:

  • Application data
  • Replication
  • Cluster traffic
  • Network storage traffic (e.g., Server Message Block [SMB], iSCSI)
  • Management traffic
  • Backup data

Add in virtualization and things get even more complicated. On a Hyper-V server, I might need five network connections, minimum! Also consider that redundant paths might be required for some types of traffic, so multiple connections might be teamed together. Separate connectivity might be needed for storage when technologies such as Fibre Channel are used. All this can make a potential mess. So how do organizations architect their fabric?

Think of your network as a highway. You hear a siren behind you. Cars around you begin trying to move out of the way, but doing so during rush-hour gridlock is difficult and the emergency vehicle can't get through, which is a disaster for whoever is relying on the aid it provides. Or perhaps the highway you're on has a special lane for emergency vehicles. Most times, that lane is empty. As you sit in rush-hour traffic, your anger builds as you realize how much quicker you could get home if only other cars could use that lane. Of course, the traffic department could just keep adding lanes to the highway to avoid congestion, but that isn't a practical solution.

Your network is like that highway. Maybe all the network traffic shares a network connection, in which case you risk crucial traffic being unable to traverse the network in a timely fashion during times of high load. Or perhaps you have dedicated network connections for each type of traffic to ensure that it always gets the bandwidth it needs when it needs it. Maybe you keep adding connections to accommodate all the traffic.

Although most organizations have traditionally used the second option, doing so has become far more challenging for several reasons:

  • Data centers are shifting to 10Gb networks instead of 1Gb networks. Having more than two 10Gb connections per server isn't cost-effective, so having dedicated connections for each type of traffic no longer makes sense.
  • As virtualization becomes more prevalent, blade-type servers become more common. But these servers typically have limitations on the number of supported adapters, limiting connections. There are some exceptions if the data center uses converged fabrics, which allow virtual adapters to be created for the host, giving almost unlimited flexibility in how the traffic is divided (although behind the scenes, this is really its own kind of QoS).
  • Traditionally, networks have been heavily over-built to ensure that bandwidth is available. Because of the increased importance and use of different types of network traffic, this over-building and complexity have become unmanageable for most organizations. Many of the dedicated network connections for a specific type of traffic are either not used or barely used the majority of the time.

Virtualization has also introduced its own challenges. Many OS instances now run on a single piece of hardware and share a set of network connections. Maintaining separate network adapters for every virtual machine (VM) is highly impractical. A single bad VM can consume all available network bandwidth, starving the other VMs. Therefore, you need a mechanism to ensure not only that different types of traffic have sufficient bandwidth when they need it but also that you can fairly divide network resources of the same type for the different VMs, or tenants, that use the virtual infrastructure.

Imagine that you're a hoster and have many tenants using your services. You need to ensure that each tenant gets a fair amount of resources. You also might need different levels of network speeds, such as Gold, Silver, and Bronze.

As you might expect, QoS is the solution to these problems. QoS fully embraces and enables the multi-tenant and converged 10Gb fabric data centers that we're seeing more frequently. Let's look at the types of QoS that are available through Windows Server 2012, including software-based QoS, hardware-enabled QoS, and QoS that is specific to virtualization.

You might think the problem can be solved just by moving your network from 1Gb to 10Gb. After all, surely a pipe that is 10 times larger than what was available previously will be big enough? That's the same as adding more lanes to the highway: For a while, it might cure the problem, but no matter how many resources you give to a workload, it will ultimately grow to use them all and want more. Even a 10Gb connection will become saturated over time, and the original problem (certain types of traffic not getting enough bandwidth when needed) will return.

Software QoS

Windows has had the ability to implement software QoS for a long time. Navigate within a Group Policy Object (GPO) to Computer Configuration, Policies, Windows Settings, Policy-based QoS. Policies can be created to throttle bandwidth to a specific speed for different applications, source-and-destination IP address combinations, and specific protocols. Before Windows Server 2012, however, the only way to manage traffic was to specify a maximum bandwidth per traffic type.

With a maximum-bandwidth configuration, the workload that is affected by the policy can never exceed the amount of allocated bandwidth, which guarantees and enforces a predictable network throughput. For example, suppose that I have a 10Gb network connection and I divide the traffic this way:

  • 1Gb for management
  • 1Gb for live migration
  • 1Gb for cluster or cluster shared volume (CSV)
  • 2Gb for iSCSI
  • 5Gb for VMs

This looks great. Certainly, in normal circumstances, the VM traffic would be the majority, and the other types of traffic are guaranteed their own amounts for the times they need it. But that's the problem with using maximum-bandwidth settings for bandwidth management: The majority of the time, 50 percent of the available network bandwidth is hardly used, but it's been reserved for the times that it is needed, just like having a separate lane for emergency traffic. Meanwhile, the servers have a heavy virtualization traffic load and could likely benefit from using that 50 percent of the bandwidth when it isn't in use. We're wasting a large amount of our resources and restricting our capabilities unnecessarily. (Furthermore, QoS in Windows Server 2008 R2 didn't work with Hyper-V.)

One area in which the maximum-bandwidth method is useful is when you must pay for bandwidth usage, such as a WAN connection between offices. In such a scenario, limiting bandwidth to a specific value is a good idea.
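As a rough illustration of the maximum-bandwidth approach on a metered link, the following sketch caps traffic bound for a remote office. The policy name, the 10.1.0.0/16 prefix, and the 100Mb rate are all hypothetical placeholders; the sketch assumes the NetQos cmdlets that ship with Windows Server 2012 and an elevated PowerShell session.

```powershell
# Hypothetical example: cap traffic to a branch-office subnet across a paid WAN link.
# The policy name, destination prefix, and rate are placeholders, not recommendations.
# The throttle rate is expressed in bits per second (100,000,000 = 100 megabits).
New-NetQosPolicy -Name "Branch WAN Cap" -IPDstPrefixMatchCondition "10.1.0.0/16" -ThrottleRateActionBitsPerSecond 100000000
```

Traffic that matches the policy should stay under the configured ceiling even when the link is otherwise idle, which is exactly the behavior you want when every bit is billed.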

Windows Server 2012 introduces the concept of a minimum-bandwidth policy, which allows the different types of traffic to have a relative weight assigned to them. Let's see how that would work for the same types of traffic that we used in the maximum-bandwidth example:

  • 10 for management
  • 20 for live migration
  • 20 for cluster or CSV
  • 10 for iSCSI
  • 40 for VMs

Note that these values don't represent any kind of unit; they're just relative weights.

With minimum bandwidth, any traffic type can by default use any available network bandwidth. Even though VM traffic has a weight of 40, that traffic could consume 100 percent of the network bandwidth as long as there is no contention on the network; workloads can use whatever bandwidth they want, up to the limit of the network fabric itself. Only in times of contention do the minimum-bandwidth relative weights come into play. The different types of traffic are then guaranteed bandwidth based on these weights: Live migration traffic would get at least 20 percent, while VM traffic would get at least 40 percent. In this example the weights add up to 100, so each weight reads directly as a percentage, but that is not required. In general, the minimum share for one type of traffic in contention scenarios is its relative weight divided by the sum of all the weights.
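To make the arithmetic concrete, here is a small sketch that turns the example weights above into guaranteed shares. It is plain PowerShell arithmetic rather than a QoS cmdlet, and the 10Gb link speed is an assumption carried over from the earlier maximum-bandwidth example.

```powershell
# Relative weights from the example above (unitless).
$weights = [ordered]@{
    'Management'     = 10
    'Live migration' = 20
    'Cluster/CSV'    = 20
    'iSCSI'          = 10
    'VMs'            = 40
}

$linkGb = 10   # assumed link speed in Gb, matching the earlier 10Gb example
$total  = ($weights.Values | Measure-Object -Sum).Sum

foreach ($name in $weights.Keys) {
    # Guaranteed share under contention = weight / sum of all weights.
    $percent = 100 * $weights[$name] / $total
    $gb      = $linkGb * $weights[$name] / $total
    "{0}: at least {1} percent of the link, or about {2}Gb, under contention" -f $name, $percent, $gb
}
```

On a 10Gb link, a weight of 20 out of a total of 100 works out to a floor of 2Gb for live migration, and a weight of 40 gives VM traffic a floor of 4Gb. Neither figure is a ceiling.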

It is also possible to use strict minimum-bandwidth configuration, in which the types of traffic are given absolute bandwidth values as the minimum. For example, management traffic is given 1Gb and VM traffic is given 4Gb. However, this approach can be difficult to administer, compared to relative weightings. You must be careful not to over-provision the minimum values that you assign to the workloads. Also, just because a type of traffic is set to a minimum of 1Gb does not guarantee that it will get 1Gb. Most networks have many switches and routes in them. Although a type of traffic might be guaranteed 1Gb of bandwidth within the local server, after that traffic leaves the server it is at the mercy of the other traffic on the network.

There is another problem with using strict minimum bandwidth. Suppose that you use NIC teaming with two 10Gb network adapters; 20Gb of bandwidth is available when the environment is healthy. If you use strict minimum bandwidth and configure up to 20Gb of minimum bandwidth, those bandwidth guarantees cannot be met if an adapter fails. That defeats the whole point of NIC teaming: to provide consistent service even when a failure occurs.
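The failure case can be checked with simple arithmetic. The following sketch uses hypothetical strict minimums (they are not values from this article) and compares the total reservation with the team's capacity before and after one adapter fails.

```powershell
# Hypothetical strict (absolute) minimums in Gb; illustrative values only.
$strictMinimumsGb = @{
    'Management'     = 1
    'Live migration' = 4
    'Cluster/CSV'    = 2
    'iSCSI'          = 4
    'VMs'            = 8
}

$adapterGb    = 10   # speed of each team member
$adapterCount = 2    # adapters in the team

$reservedGb = ($strictMinimumsGb.Values | Measure-Object -Sum).Sum
$healthyGb  = $adapterGb * $adapterCount
$degradedGb = $adapterGb * ($adapterCount - 1)   # capacity after one adapter fails

"Reserved: ${reservedGb}Gb; healthy team: ${healthyGb}Gb; one adapter down: ${degradedGb}Gb"
if ($reservedGb -gt $degradedGb) {
    "The strict minimums cannot all be honored if an adapter fails."
}
```

With relative weights, the same failure simply shrinks every share proportionally, so there is nothing to over-commit.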

For these reasons, the strict minimum-bandwidth approach is not recommended. Use relative weighting whenever possible.

The biggest difference between using a maximum- or minimum-bandwidth approach is that the minimum-bandwidth option allows the highest utilization of network resources when there is no contention, whereas the maximum-bandwidth option always limits traffic to its configured maximum, potentially wasting available bandwidth. There is generally no price for intra–data center bandwidth, so using as much as possible is ideal. If you are using QoS from before Windows Server 2012, then migrating to the minimum bandwidth policies allows you to maximize your resource utilization and avoid waste.

Hardware QoS

In addition to software QoS, modern networking equipment also enables its own QoS capabilities. This type of hardware-level QoS is known as Data Center Bridging (DCB) and is an IEEE standard. This article focuses more on software-level QoS, but it's important to understand that if you have DCB-enabled network hardware, then you can leverage those capabilities through Windows Server 2012. Traffic management is offloaded to the network adapter instead of being processed by the host OS. In addition, DCB can support flow control for certain types of network traffic and can request a slowdown of traffic from the source (typically a switch). The good news is that most 10Gb network equipment actually supports DCB. By default, DCB is not enabled in Windows Server 2012 and must be installed as a built-in feature, using the following command:

Install-WindowsFeature Data-Center-Bridging
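After the feature is installed, a DCB configuration might look something like the following sketch. The adapter name "Ethernet 1", the choice of priority 3, and the 30 percent share are placeholders, and the sketch assumes that both the network adapter and the attached switch support DCB and are configured consistently.

```powershell
# Assumptions: "Ethernet 1", priority 3, and the 30 percent share are placeholders.

# Tag SMB traffic with 802.1p priority 3, using the built-in SMB filter.
New-NetQosPolicy "SMB" -SMB -PriorityValue8021Action 3

# Reserve a share of the link for that priority, using Enhanced Transmission Selection (ETS).
New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 30 -Algorithm ETS

# Turn on priority-based flow control for the same priority.
Enable-NetQosFlowControl -Priority 3

# Enable QoS (DCB) on the physical network adapter.
Enable-NetAdapterQos -Name "Ethernet 1"
```

The policy only tags the traffic; the traffic class and flow-control settings are what the DCB-capable hardware then acts on.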

Virtualization QoS

Software QoS capabilities are also available to VMs. The ability to have QoS with VMs is crucial for many environments, especially hosters or organizations that have different business units sharing the infrastructure. Being able to ensure that different tenants get fair amounts of network resources and to enable different levels of service based on premium rates for a Gold network connection are huge benefits. Virtualization offers these capabilities for CPU, memory, and even storage, so being able to offer them for the network completes the resource-management picture.

Remember that minimum-bandwidth capabilities can use relative weighting or a strict bandwidth value. With VMs, using minimum relative weighting is even more important because VMs are mobile. A VM can be moved between hosts, so trying to set strict bandwidth minimums won't work because different hosts have different VMs with their own configurations. If you try to use live migration to move a VM to another host on which the strict minimum-bandwidth configuration cannot be satisfied, then the live migration will fail. Using relative weighting always works because the bandwidth is relative to whichever workloads are on a server.

It is important to understand that minimum-bandwidth QoS policies applied to VMs affect only traffic that is sent from the VM to the physical wire. If the traffic is between VMs on the same host, then minimum-bandwidth QoS policies do not apply. VM-to-VM traffic on the same host never touches the physical network adapter but is routed internally by the Hyper-V virtual switch and therefore does not use up any bandwidth on the network. Maximum-bandwidth QoS policies do apply for VM-to-VM traffic, in addition to VM-to-wire traffic. The reason for the difference is that some organizations charge based on network resources used. If a maximum amount of bandwidth is set, then tenants do not expect to be charged more than the maximum that they have configured.

Note that this maximum bandwidth applies only to outbound traffic from the VM (egress); inbound (ingress) is unrestricted. Why not limit inbound traffic? Inbound traffic is already at the host, so dropping it would not bring much benefit. Also, unless the traffic is TCP, there is no way to tell the sender to slow down or stop.

For VMs, only strict bandwidth can currently be configured using the Hyper-V Manager GUI, as shown in Figure 1. This is unfortunate, given that best practice is not to use strict minimum bandwidth but rather to use relative weight minimum bandwidth.

Figure 1: Hyper-V Manager GUI Restrictions

The good news is that like the rest of Windows Server 2012, everything that can be accomplished with the GUI can also be accomplished using Windows PowerShell, which actually exposes more functionality, including the relative weight configurations. In addition, through PowerShell you can configure QoS on the Hyper-V virtual switches—something that is impossible using the graphical tools.
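As a minimal sketch of what that looks like, the following commands create a virtual switch in weight mode and assign relative weights to three VMs. The switch name, the adapter name, the VM names, and the Gold/Silver/Bronze weights are all hypothetical; the sketch assumes the Hyper-V PowerShell module in Windows Server 2012.

```powershell
# Assumptions: "External Switch", "Ethernet 1", and the VM names are placeholders.

# The minimum-bandwidth mode is chosen when the virtual switch is created.
New-VMSwitch -Name "External Switch" -NetAdapterName "Ethernet 1" -MinimumBandwidthMode Weight

# Relative weights per VM; the tiers and values are illustrative only.
Set-VMNetworkAdapter -VMName "GoldVM"   -MinimumBandwidthWeight 50
Set-VMNetworkAdapter -VMName "SilverVM" -MinimumBandwidthWeight 30
Set-VMNetworkAdapter -VMName "BronzeVM" -MinimumBandwidthWeight 10

# Optional cap on a VM's outbound traffic, in bits per second (1,000,000,000 = 1Gb).
Set-VMNetworkAdapter -VMName "BronzeVM" -MaximumBandwidth 1000000000
```

Because the weights are relative, the same settings should remain meaningful if a VM moves to a host that carries a different mix of VMs.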

Managing QoS

Many QoS configurations are performed using PowerShell, and the management is consistent between software and hardware QoS or DCB for the majority of actions. The first task is to classify the types of network traffic so that policies can be applied.

When using hardware QoS, there is a limit of eight classifications of traffic; there is no such limitation when using software QoS. For example, you might create an iSCSI type classification and a Live Migration type classification. The good news is that the PowerShell modules that are used to create the classifications have a number of built-in classifications that include the most common types of traffic (i.e., iSCSI, NFS, SMB, Live Migration, SMB Direct, and Wild Card, which covers everything else). If you need other classifications, you can create your own filters.

After the data is classified, you can create and apply policies to control the allocated bandwidth. Rather than list the cmdlets here, I recommend that you review the Microsoft article "Network Quality of Service (QoS) Cmdlets in Windows PowerShell," which goes into detail about each cmdlet that you need to use.

The following is a simple example that creates a new policy for Live Migration type traffic, using the built-in Live Migration filter:

New-NetQosPolicy "Live Migration" -LiveMigration -MinBandwidthWeightAction 40
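When none of the built-in filters fits, a custom classification can be built from match conditions. The following sketch is illustrative only: the backup agent, its TCP port 5555, the executable name, and the weight of 10 are all hypothetical.

```powershell
# Hypothetical custom classification: a backup agent that sends to TCP port 5555.
New-NetQosPolicy -Name "Backup" -IPProtocolMatchCondition TCP -IPDstPortMatchCondition 5555 -MinBandwidthWeightAction 10

# Alternative: classify by the sending executable instead (the name is a placeholder).
New-NetQosPolicy -Name "Backup Agent" -AppPathNameMatchCondition "backupagent.exe" -MinBandwidthWeightAction 10

# Review the policies that are currently defined on this server.
Get-NetQosPolicy
```

You would normally pick one of the two classification styles for a given workload rather than define both.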

Hyper-V hosts offer another way to apply QoS, thanks to their virtual switches. Virtual switches are exposed to the management OS, not just to the VMs. Therefore, you can create multiple virtual adapters for use by the Hyper-V host itself. For example, you can create a live migration virtual network adapter, a cluster virtual adapter, and so on. You can then apply QoS policies directly to the virtual network adapters to control their bandwidth. Doing so might be easier than using separate filters to differentiate between the types of traffic. For example, the following two commands create a new virtual adapter on the Hyper-V host OS and then assign it a minimum-bandwidth relative weight:

Add-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -SwitchName "External Switch"
Set-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -MinimumBandwidthWeight 40
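Two follow-up steps are worth sketching. The first gives a weight to traffic that has no explicit setting, and the second reviews what has been assigned. The weight of 10 is a placeholder, and the sketch assumes that the switch named "External Switch" was created with its minimum-bandwidth mode set to Weight, which relative weights on its adapters depend on.

```powershell
# Assumption: "External Switch" was created with -MinimumBandwidthMode Weight.

# Weight for traffic through the switch that has no explicit setting (the default flow).
Set-VMSwitch -Name "External Switch" -DefaultFlowMinimumBandwidthWeight 10

# Review the relative weights assigned to the host's virtual network adapters.
Get-VMNetworkAdapter -ManagementOS |
    Select-Object Name, @{ Name = 'Weight'; Expression = { $_.BandwidthSetting.MinimumBandwidthWeight } }
```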

Get on the Road

The primary message surrounding Windows Server 2012 QoS is that it can change the way you think about networking in the data center. Instead of having separate network connections for each workload, use larger pipes and use QoS to ensure that different types of traffic receive the necessary bandwidth. Even if you have DCB-capable network hardware, you might want to use software QoS, especially in Hyper-V environments; software QoS is more scalable in terms of number of policies. Some work is involved in implementing QoS, but the new relative minimums make the process much easier while ensuring that you get maximum utilization of resources.