Peek under the hood for a better understanding of your hybrid environment
For those of us working in the cloud, the most exciting feature of Microsoft Exchange Server 2010 Service Pack 2 (SP2) is undoubtedly the Hybrid Configuration Wizard (HCW). Prior to SP2, the process for configuring the necessary connections to enable interoperability between an on-premises Exchange environment and Office 365 was a daunting task. Even with the help of the Microsoft Exchange Server Deployment Assistant (aka ExDeploy), you needed 65 pages of detailed instructions and guidance that included somewhere in the neighborhood of 60 steps (depending on your configuration) to complete an on-premises infrastructure for a successful hybrid deployment -- or just to make on-premises servers communicate properly with Exchange Online.
Enter Exchange 2010 SP2, which includes the masterfully designed HCW, among other changes and features. The HCW simplifies the configuration process to fewer than 10 clicks of a mouse. (The exact number of steps varies depending on whether you plan to use centralized or direct mail flow, whether you'll use full hybrid mode or online archiving, and the amount of free/busy sharing.) Many of us in the community were ecstatic when Microsoft released the wizard, and we eagerly downloaded and installed it on our systems. To date, the wizard has come through on its promise to simplify and automate hybrid deployments, with little input required from the systems administrator.
Although I applaud Microsoft's accomplishment, I worry about future administrators who will never have the chance to trudge through a manual hybrid deployment. For administrators who haven't lived in a pre-SP2 world, the complexities of what the wizard does in the background will be lost over time. Ultimately, problems could result for organizations if something within the Exchange federation or organization relationship (e.g., free/busy lookups, centralized mail flow) breaks. How will administrators know how to troubleshoot problems if they don't understand the multitude of configuration changes that the hybrid wizard is executing while they sit in blissful ignorance, happily clicking Next?
In this article, I'll explain what's under the hood of the HCW as well as which changes are made (from an architectural standpoint) to simplify the deployment scenario and increase the success rate of hybrid deployments. (This article assumes that you've previously deployed Active Directory Federation Services -- ADFS -- and the Microsoft Directory Synchronization Tool but doesn't go into detail about those configurations, which are beyond the scope of the article.) I'll also provide some guidance for administrators to overcome problems that they might encounter during the process. Let's dive in!
The first thing to look at is the use of DNS namespaces for mail routing and federated delegation. Prior to SP2, Microsoft required you to implement two additional namespaces to facilitate the relationship between your on-premises and Office 365 environments: an exchangedelegation namespace and a service namespace.
The exchangedelegation namespace was used to create an Exchange federation between the on-premises Exchange environment and Office 365. The term federation is widely used and can often confuse administrators. In this context, we're talking about Exchange federation, which is used to help Exchange and Office 365 share information. This type of federation uses the Microsoft Federation Gateway and federation trusts to share information, such as calendar free/busy data and MailTips. The service namespace was used as a secondary accepted domain, to route mail from the on-premises environment to Office 365.
One of the most common questions that Exchange administrators asked when this solution first came out was simply, Why are two additional namespaces required when Office 365 uses a domain.onmicrosoft.com namespace for each tenant domain? There really was no good answer. Microsoft developed this method, which seemed to work but proved to be inefficient and fragile in production. Microsoft addressed the issue in SP2 by no longer requiring these namespaces. Instead, SP2 uses the onmicrosoft.com namespace, which Office 365 assigns and manages, for mail routing. In addition, Exchange federation is now established at the root domain.com. You can forget about both of the aforementioned namespaces. Although this domain consolidation isn't exactly part of the HCW, it's worth noting before you launch the wizard.
Now that we've covered the DNS improvements, let's explore the HCW itself, examining each step.
Before running the HCW, you must add the Office 365 tenant to the on-premises Exchange Management Console (EMC). The HCW executes several Windows PowerShell cmdlets and requires an established remote PowerShell session, which the EMC generates when you add your cloud tenant. After the cloud tenant has been added, you can find the wizard under Organization Configuration in the On-Premises section of the EMC.
After you add the Office 365 tenant, the next step is to create an HCW object by clicking New Hybrid Configuration in the Actions pane, as Figure 1 shows. This first step might appear to simply create an unconfigured object, but it actually executes several PowerShell commands that generate a new self-signed certificate and a new federated trust between the on-premises Exchange environment and the Microsoft Federation Gateway.
The final step in creating the new HCW object creates the environment to allow the full HCW to run later. From a PowerShell perspective, the following commands are executed during this first phase:
After these commands execute successfully, you'll see the Hybrid Configuration object in the Organization Configuration section of the EMC.
As you step through the wizard, several pages gather administrator input, which is then used to execute the configuration commands in the background. The first page gathers on-premises and cloud credentials. These credentials must have the appropriate administrative rights on each side. The on-premises account must be a member of the Organization Management role group, and the Office 365 account must be a Global Administrator within your Office 365 tenant, as Figure 2 shows.
The next page compares the list of on-premises accepted domains to the domains that are listed in the Office 365 tenant. If a domain name matches, then it can be added as the namespace that's being configured. For organizations with more than one accepted domain, you can select all the domains you want to add that will be part of your hybrid configuration, as Figure 3 shows. Note that the domain must be validated in Office 365 before it shows up in this list.
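The comparison that this page performs can be pictured as a simple set intersection. The following Python sketch is only an illustration under stated assumptions: the function name and the sample domain lists are hypothetical, and the wizard itself gathers these lists through Exchange cmdlets rather than anything like this code.

```python
def hybrid_domain_candidates(on_prem_accepted, tenant_verified):
    """Return the domains eligible for the hybrid configuration, sorted.

    A domain qualifies only if it is an on-premises accepted domain AND
    has been verified in the Office 365 tenant. The comparison ignores
    case and a trailing dot.
    """
    def normalize(name):
        return name.strip().rstrip(".").lower()

    on_prem = {normalize(d) for d in on_prem_accepted}
    tenant = {normalize(d) for d in tenant_verified}
    return sorted(on_prem & tenant)


# Hypothetical sample data, for illustration only.
on_prem_domains = ["domain.com", "Sales.Domain.com", "legacy.local"]
tenant_domains = ["domain.com", "sales.domain.com", "domain.onmicrosoft.com"]

candidates = hybrid_domain_candidates(on_prem_domains, tenant_domains)
print(candidates)
```

In this sample, legacy.local drops out because it was never verified in the tenant, and domain.onmicrosoft.com drops out because it isn't an on-premises accepted domain.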
The next page provides a text file, which you'll use to prove ownership of the domain (i.e., domain proof), as Figure 4 shows. Copy this text to the clipboard and create a DNS TXT record within your public DNS zone file. Be sure to select the box to confirm that the TXT record has been created in the public DNS. Microsoft uses this TXT record to verify that you own the public DNS namespace. Only authorized administrators have the right to change public DNS records for your zone, so adding this record proves to Microsoft that you're authoritative for your domain namespace.
On the Manage Hybrid Configuration Servers page, which Figure 5 shows, you can select the Client Access and Hub Transport servers that will be used for mailbox moves, sharing, and hybrid mail flow. Depending on the existing on-premises environment, the selection of servers might vary. If your environment already has Exchange 2010 in place, select the Client Access and Hub Transport servers that are in your primary site and have the most available resources. As a matter of best practice, your servers should have the appropriate amount of memory, CPU, and disk space. In my experience, a dual-core 3GHz processor and 8GB of RAM are generally more than enough. Also, depending on the length of time you plan to be in coexistence, you might consider running multiple Client Access and Hub Transport servers in an array, for fault tolerance. Doing so will ensure business continuity if a server fails.
At a recent engagement, I had a customer that insisted fault tolerance wasn't necessary. Halfway through the migration, the hybrid server -- onto which we'd centralized mail flow -- failed, effectively shutting down all mail flow for the organization. So do yourself a favor and double up on those Client Access and Hub Transport servers. They don't need to be physical servers, so leverage your virtual infrastructure if you have one.
Your hybrid server or servers should be located in the same Active Directory (AD) site as your existing Exchange farm and can't reside in a demilitarized zone (DMZ). In addition, if you have Exchange Server 2003 in your environment, then you'll want to add the Mailbox role to one of your hybrid servers. This server will host the free/busy replica and allow for free/busy lookups between your Exchange 2003 users and Office 365 users. Exchange 2003 uses the legacy public folder hierarchy to provide users with free/busy lookup services. Exchange 2010 (and Exchange Server 2007) changed the way free/busy lookups are handled, introducing the Availability service, which provides a more robust method for providing these lookups. For free/busy lookups to work in a hybrid environment after the Mailbox role has been installed on the Exchange 2010 hybrid server, create a public folder database and move the free/busy calendar to the Exchange 2010 server. The hybrid server will then act as the intermediary lookup between the legacy Exchange 2003 environment and the Office 365 environment.
The Manage Hybrid Configuration Mail Flow Settings page, which Figure 6 shows, requires the public IP address and Fully Qualified Domain Name (FQDN) of the on-premises hybrid server. The assumption is that the FQDN is assigned to the IP address that's listed on this page. Later, you'll see how these settings are used to configure inbound and outbound Microsoft Forefront Online Protection for Exchange (FOPE) connectors.
The Manage Hybrid Configuration Mail Flow Security page, which Figure 7 shows, queries Exchange 2010 Hub Transport servers for any installed certificates and allows you to select which certificate you want to use for the hybrid configuration.
This certificate should be a publicly signed certificate from a source such as VeriSign or GoDaddy. The certificate is used for Transport Layer Security (TLS) authentication between FOPE and the on-premises server. This is also the page on which you can set the mail flow options:
The default selection is to use DNS rather than centralized mail flow, but be sure to update the setting according to your strategy. By centralizing mail flow, you allow your administrators to track messages more easily, because all messages come through the hybrid server.
An important and often overlooked item when selecting the mail flow option is the Sender Policy Framework (SPF) record. SPF is an email authentication protocol that helps protect against spoofing and phishing by validating the origin of a message: the receiving server checks the IP address of the sending server against the list of senders that the owner of the sending domain has published in DNS. If your organization uses SPF records, it's important to note a few points. First, if centralized mail flow is configured, all mail appears to come from the on-premises server; no updates to SPF records need to be made. However, if you intend for your Office 365 users to send messages directly to the Internet, then you must add the Office 365 SPF values. Microsoft has published a Sender ID Framework SPF Record wizard to help you craft your record. As an example, this is what an SPF record might look like if you choose to route mail from both your on-premises environment and Office 365, assuming that your mail server's public DNS name is mail.domain.com:
v=spf1 a mx:mail.domain.com include:outlook.com ~all
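To make the parts of that record easier to see, here is a minimal Python sketch that splits an SPF record into its mechanisms and checks whether a given include is present. It's an illustration only: the function names are hypothetical, and the parser handles mechanisms such as a, mx, include, and all but doesn't treat modifiers such as redirect= specially.

```python
def parse_spf(record):
    """Split an SPF TXT record into (qualifier, mechanism, value) tuples."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms[1:]:
        qualifier = "+"  # a mechanism with no prefix defaults to "pass"
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        # Split on the first colon only, so values that contain colons survive.
        mechanism, _, value = term.partition(":")
        parsed.append((qualifier, mechanism.lower(), value))
    return parsed


def has_include(record, domain):
    """Return True if the record contains include:<domain>."""
    return any(
        mechanism == "include" and value.lower() == domain.lower()
        for _, mechanism, value in parse_spf(record)
    )


record = "v=spf1 a mx:mail.domain.com include:outlook.com ~all"
for qualifier, mechanism, value in parse_spf(record):
    print(qualifier, mechanism, value)
print(has_include(record, "outlook.com"))
```

Read left to right, the record authorizes the domain's own A record, the named mail server, and whatever senders outlook.com publishes, and then soft-fails (~all) everything else.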
The final page, which Figure 8 shows, is a summary that details the selected features, based on your input in the wizard. Click Manage to execute the wizard, and wait for the magic to happen!
As you can see, the wizard makes creating a hybrid deployment very simple. It requires minimal input from the administrator and converts that input into a series of PowerShell commands that builds your configuration without further administrative input. You can imagine how administrators might take the wizard for granted. So what really happens after you click Manage? Let's take a look. This example uses my demo tenant office365support.com. The specific values in your environment will vary, depending on your domain and DNS records. (For those of you who've deployed the wizard, you can follow the process by reviewing the log in %ExchangeInstallPath%\Logging\Update-HybridConfiguration.)
First, the wizard establishes remote PowerShell sessions by authenticating with both the on-premises and cloud tenants. After the sessions have been established, a series of Get commands grab information from the on-premises server and Office 365. (Note that most of these commands run twice: once in the local environment and once on the Office 365 tenant.) These commands run against each of the servers that you selected in the wizard, gathering all the necessary information to begin creating the hybrid relationship.
The next set of commands attempts to configure legacy Exchange support by digging into the servers to determine where the free/busy calendar folder resides. If you look through the logs in this section, you might see an error like this one: ERROR:System.Management.Automation.RemoteException: 'Server2' does not have the right Exchange Server version or role required to support this operation. You don't need to worry about this error; the hybrid wizard is simply trying to determine which server is the appropriate one to host the free/busy replica.
In environments that include Exchange 2003, the wizard continues to iteratively review each server to determine whether it hosts the free/busy replica. This step is important because of the inherent difference in free/busy lookups between Exchange 2003 and Exchange 2010. The free/busy replica must reside on the Exchange 2010 server that acts as the intermediary between the legacy Exchange 2003 mailboxes and the cloud-based mailboxes. Without the free/busy public folder, users on Exchange 2003 can't see the free/busy information for cloud users, and vice versa.
After the wizard fetches the email address policy and version, it upgrades that information and adds the necessary email aliases for mail flow -- specifically, adding @domain.mail.onmicrosoft.com as an alias to all users. This step is extremely important for mail routing. During the migration process, after a user has been migrated to the cloud, @domain.mail.onmicrosoft.com is set as the routing address. From that point forward, when an attempt is made to deliver a message to the mail user on premises, Exchange uses the Route To address to forward the message to the cloud-based account by using a scoped routing connector that's created in a subsequent step.
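The effect of that alias on a single recipient can be sketched in a few lines of Python. This is a simplified illustration, not what the wizard runs: the function name is hypothetical, and it assumes that the new alias reuses the local part of the user's primary SMTP address, whereas the real email address policy applies its own template. It does follow the Exchange convention that an uppercase SMTP: prefix marks the primary address and a lowercase smtp: prefix marks a secondary one.

```python
def add_routing_alias(proxy_addresses, routing_domain):
    """Return a copy of proxy_addresses that includes the routing alias.

    Assumption: the alias reuses the local part of the primary address.
    """
    primary = next((a for a in proxy_addresses if a.startswith("SMTP:")), None)
    if primary is None:
        raise ValueError("no primary SMTP address found")
    local_part = primary[len("SMTP:"):].rpartition("@")[0]
    alias = f"smtp:{local_part}@{routing_domain}"
    # Leave the list untouched if the alias is already present.
    if alias.lower() in {a.lower() for a in proxy_addresses}:
        return list(proxy_addresses)
    return list(proxy_addresses) + [alias]


# Hypothetical user, for illustration only.
addresses = ["SMTP:kim@domain.com", "smtp:kim@sales.domain.com"]
updated = add_routing_alias(addresses, "domain.mail.onmicrosoft.com")
print(updated)
```

Running the function a second time on its own output changes nothing, which mirrors the fact that re-applying an address policy shouldn't stack duplicate aliases on a recipient.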
Next, the wizard checks prerequisites within your organization and inside your Office 365 tenant, before creating organizational relationships. This step enables organization customization, which allows the wizard to build the organizational relationships. As you walk through the wizard log, you'll see multiple lines for this process. This extremely important step creates a relationship object on both sides of the federated trust, allowing these important hybrid functions:
The final phases of the process include the configuration of the Send and Receive connectors in Exchange. This phase also configures the inbound and outbound server settings within the FOPE service. Before configuring the connectors, the HCW gets the set of datacenter IP addresses that are used for the tenant and the Exchange certificate that's being used:
The next step, which Listing 1 shows, grabs the existing connector information and creates new connectors to allow mail flow to and from Office 365. As you can see in this relatively simple example, the wizard executes 53 commands to implement the hybrid configuration that you, as an administrator, defined in fewer than 10 pages. Although the wizard is an amazing step forward for hybrid environments, it's inherently dangerous for inexperienced administrators who don't understand the changes being made or their purpose.
In my experience, the HCW runs flawlessly -- most of the time. However, I've encountered an issue in which the Get-FederationInformation command times out. As a workaround, open a separate PowerShell instance and connect to your Office 365 tenant, using the same Global Admin credentials that you specified in the wizard. When the new session is established, the wizard resumes.
The problem might also be related to the Windows Communication Foundation (WCF) and Windows Workflow Foundation (WF); repairing the WCF and WF components by running the following command in the C:\Windows\Microsoft.NET\Framework\v3.0\Windows Communication Foundation directory might solve the problem:
Other common reasons for the HCW to fail are generally related to:
o invalid Send and Receive connectors
o incorrect certificates
o missing security patches
Firewall restrictions are often the root cause of mail-flow issues. As I mentioned in the beginning of the article, preplanning and testing are the keys to success. You should make a solid plan before you connect to your first server. (Speaking of plans, check out the Exchange Server Deployment Assistant. This tool helps you plan out your deployment, based on your specific environment and requirements.) A solid plan should include planning out all the IP addresses that are to be used in the hybrid deployment. After you've identified those IP addresses, you can configure the firewall to allow the appropriate traffic flow to and from those addresses. One common problem that I hear from customers and other IT professionals is that the list of IP addresses can be difficult to find, as well as confusing. I've found the most comprehensive list of IP addresses in the Microsoft article "Office 365 URLs and IP Address Ranges."
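Once you have a list of address ranges, a quick script can confirm that the addresses you actually see in connector or firewall logs fall inside it. The following Python sketch uses only the standard ipaddress module. The ranges and addresses shown are reserved documentation placeholders, not Microsoft's published ranges, and the function name is hypothetical.

```python
import ipaddress


def find_unlisted(addresses, allowed_ranges):
    """Return the addresses that fall outside every allowed CIDR range."""
    networks = [ipaddress.ip_network(r) for r in allowed_ranges]
    unlisted = []
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        if not any(ip in net for net in networks):
            unlisted.append(addr)
    return unlisted


# Placeholder values from the reserved documentation ranges; substitute
# the ranges from your own plan and the addresses from your own logs.
allowed = ["192.0.2.0/24", "198.51.100.0/25"]
observed = ["192.0.2.10", "198.51.100.200", "203.0.113.5"]

blocked = find_unlisted(observed, allowed)
print(blocked)
```

Any address that the script reports is one that your firewall rules, as planned, wouldn't allow, which makes it a good starting point when mail flow stalls.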
DNS records are also a source of great pain for many engineers who are trying to deploy the hybrid solution. One of my fellow MVPs, Loryan Strant, developed an excellent Office 365 testing site to determine whether all DNS records are configured properly. Note that this site assumes that you want mail to flow from Office 365. If you use the site to test your IP addresses and want mail to flow from your on-premises mail system, you can ignore the DNS record for the cloud-based Autodiscover.
Many other common problems are related to misconfigurations within the existing Exchange environment. Several tools can help identify such issues. The Exchange Best Practices Analyzer is an excellent tool to start with. This tool reviews all your crucial components to ensure that they're configured according to best practices and provides links for items that are out of compliance.
In addition to some of the tips that I've provided here, the following resources can provide invaluable information as you plan and execute your hybrid deployment:
Hopefully, this article sheds some light on the hybrid configuration process and how crucial it is to understand what's going on under the hood in the HCW. As I stated at the beginning of the article, I applaud Microsoft for making this process easy, but I hope that all administrators will go beyond the shiny GUI and dig deep into what happens behind the scenes.
Listing 1: Creating New Connectors