Microsoft Exchange Server 2007 marked a dramatic departure for Microsoft in several ways: It consigned the x86 architecture to the dustbin of computing history; it made dramatic changes to the way that Exchange high availability works; and it added integration with voicemail and faxes. This last item, which Microsoft called Exchange Unified Messaging (UM), was seen as the least revolutionary of all the Exchange 2007 changes because it largely duplicated features that third parties had already been offering in various ways.
However, Exchange 2007 UM was better-integrated, faster, and more robust than its third-party competitors. Because it was integrated directly with the Exchange transport and store architecture, it was able to deliver messages directly to user mailboxes without the need for IMAP-based polling, and it tied into Active Directory (AD) to provide dial-by-name capability based on the contents of the Global Address List (GAL). It was lacking a few features, such as the ability to trigger the message waiting indicator (MWI) on PBX-connected phones, but overall it was a significant upgrade to the state of UM for Exchange. The fact that most naysayers missed was that the architecture of Exchange 2007 UM laid the groundwork for some compelling features that couldn't feasibly be delivered by third parties. Now, with the release of Exchange 2010, we have the proof of that fact.
When the Phone Rings: A Recap
Exchange 2010 answers incoming calls in the same basic way that Exchange 2007 does. An incoming call is routed to the Exchange UM server by a PBX, Microsoft Office Communications Server (OCS), or a VoIP gateway. (Hereafter I'll refer to whatever device operates the phone system as a PBX, even if it's really OCS.) The PBX has a set of instructions that specify what should happen when a call for a given extension isn't answered. These instructions can specify different actions for calls to busy extensions and those that aren't answered within a certain time. In this case, let's say that the called extension goes unanswered. Following its instructions, the PBX transfers the call to the UM server, which accepts the call request—provided it comes from a known UM IP gateway—plays the appropriate greeting, and records the response.
An alternative scenario occurs when you're using Exchange to provide automated attendant functionality. In this case, you set the PBX to route all incoming calls to the auto attendant pilot number on the Exchange server. The UM server answers all calls to the pilot number with a standardized greeting, then lets callers perform dial by name, transfer to other auto attendants, and so on.
Some aspects of these behaviors have changed in Exchange 2010, but the overall call-answering flow is unchanged. In this article, I'll discuss changes to the flow only where they're pertinent.
Fax: Gone But Not Forgotten
Let's start with the easiest change to describe: Exchange 2010 UM no longer includes support for fax reception through a local PBX. Integrating fax with Exchange was a good idea with poor timing: The Exchange 2007 fax feature set arrived when overall usage of fax was declining and when companies such as OpenText were shipping full-featured fax integration products. Getting it to work depended on having the right PBX properly configured, and it didn't offer features that competing third-party solutions provided. More importantly, because Exchange 2007 included only fax reception, customers who wanted full fax functionality still had to deploy another solution to provide faxing from the desktop. By the time the expense and difficulty of doing so were factored in, it was often more cost-effective to deploy a third-party fax system than to deploy Exchange UM just for the fax feature.
However, Exchange 2010 still receives faxes after a fashion. Exchange 2007 listens for the CNG (calling) tone, an 1100 Hz sound that fax machines use to signal to each other. When it receives the CNG tone, it treats the call as a fax and tries to receive it as a T.38 stream. Exchange 2010 does things a bit differently; there are actually three ways an incoming fax call can be detected. Before we can discuss them, though, I need to point out that UM call answering is typically a two-stage process. The PBX or IP gateway sends a SIP INVITE message to Exchange, which responds and answers the call. The two sides agree on a set of parameters for the conversation, then exchange audio data using the Real-time Transport Protocol (RTP).
The first, and simplest, method for Exchange 2010 to detect a fax call is that Exchange itself can detect the CNG tones. It can't do this until after it's answered the call, however. The second method is that the IP gateway or PBX can notice the CNG tones in the incoming audio stream, at which point the gateway sends a notification to Exchange in the RTP audio stream that essentially says, "Hey, this is a fax call!" Exchange then sends a SIP REFER message to the gateway or PBX, which transfers the call to the fax over IP (FoIP) provider. The most complex process occurs only with gateways that don't support dynamic notification of the T.38 protocol in RTP. In this case, the gateway sends a new SIP INVITE to Exchange with a different RTP profile—the one for fax—specified. Exchange puts the gateway on hold, then sends the SIP REFER message to have the call transferred.
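The three detection paths described above can be summarized in a small Python sketch. This is only an illustration of the decision flow, not Exchange code: the enum members and the function name are hypothetical, and the sketch assumes that a fax detected by Exchange itself is also handed off with a SIP REFER, which the description above implies but doesn't state outright.

```python
from enum import Enum, auto
from typing import List


class FaxSignal(Enum):
    """How the UM server learns that a call is a fax (hypothetical names)."""
    UM_HEARD_CNG = auto()        # Exchange detects the CNG tone after answering
    GATEWAY_RTP_NOTICE = auto()  # gateway flags the fax inside the RTP stream
    GATEWAY_REINVITE = auto()    # gateway sends a new SIP INVITE with a fax profile


def fax_handoff_steps(signal: FaxSignal) -> List[str]:
    """Return the ordered actions Exchange takes to hand the call to the FoIP provider."""
    if signal is FaxSignal.GATEWAY_REINVITE:
        # Gateways without in-RTP notification: hold first, then refer.
        return ["put gateway on hold", "send SIP REFER"]
    # Assumption: both remaining paths go straight to the transfer request.
    return ["send SIP REFER"]
```

Whichever path applies, the end result is the same: the gateway or PBX transfers the call to the fax provider rather than Exchange receiving the fax itself.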
How does Exchange know where to send the fax? You have to specify the URL of the fax service you want to use as part of the UM dial plan. After that's been done, you can control whether users can receive faxes by adjusting the Allow inbound faxes option on either the dial plan or on a UM mailbox policy. Exchange redirects the call, with its original caller and callee data intact, to the FoIP provider. The fax provider accepts the fax, then returns it to Exchange as an email message, usually with a TIFF (.tif) attachment. That might seem like an odd choice of attachment format, but the TIFF standard specifies a way to produce multipage attachments in a single file, making it a decent choice for fax transmission.
This process provides a more efficient means of distributing inbound faxes to recipients; the Exchange 2007 method of delivering all faxes to one place was a hassle for recipients. Several fax services support Exchange 2010 fax reception, giving you a choice of service terms and prices. Microsoft maintains a list of FoIP partners on its Exchange Independent Software Vendors web page. Customers moving to Exchange 2010 who have already deployed Exchange 2007 fax reception have no choice but to move to a fax service. Note also that this integration doesn't work, and thus faxes cannot be received, if you're using OCS 2007 R2 as your PBX solution.
Exchange 2007 includes two types of speech capability: automatic speech recognition (ASR) and Text-to-Speech (TTS). In addition, Exchange UM uses prerecorded audio prompts that give callers information and instructions. These three features are all language-specific, and they're bundled together into language packs. Installing, say, the French language pack on a UM server enables the server to provide ASR, TTS, and prerecorded prompts in French.
However, not all Exchange 2007 language packs include ASR. For example, the Mandarin Chinese language pack provides TTS support, but no ASR. In fact, only the English language packs (for British, Australian, and American dialects) include full ASR. This limitation has been a major blocker for UM deployment in large multinational enterprises. The solution, though, wasn't entirely up to the Exchange team.
Both ASR and TTS capabilities are provided by the Speech Server core. Speech Server was a separate product until about 2005, at which point it was removed from Microsoft's product catalog and rolled into what became OCS. Its core was built into OCS 2007 and Exchange 2007. In fact, the Exchange 2007 UM role is in large measure a custom Speech Server application. Exchange 2010 uses a much newer version of Speech Server (which is also present in OCS 2007 R2). Exchange 2010 currently supports ASR, TTS, and prompts for Simplified and Traditional Chinese; Dutch; British, Australian, and American dialects of English; Canadian and traditional French; German; Italian; Japanese; Korean; Brazilian Portuguese; Spanish (in both Castilian and Latin American dialects); and Swedish. Language packs have been promised for several additional languages and dialects, including Indian English, Russian, and Hong Kong Chinese. These additions make Exchange 2010 much more useful for global deployments, as well as for organizations that depend on being able to handle callers with languages other than English.
Changes to Caller ID Resolution
One of the most useful features of Exchange UM is its ability to resolve callers' phone numbers to give you information about who called as well as a set of links to return the call. Every call that comes to the UM server should contain calling line ID (CLID) data that indicates the source of the call. Note that the PBX or gateway could mangle or even omit CLID data, in which case Exchange won't be able to resolve the number; in some locales, it's possible that the CLID data wasn't delivered to the PBX or gateway by the phone system, although this is becoming more and more rare.
The easiest case to resolve is one where the caller's name is known because they're using Outlook Voice Access to place a call or they're calling from Communicator or a Communicator Phone Edition phone. These methods require the user to be authenticated, so Exchange can identify the caller.
Failing that scenario, Exchange uses a special type of proxy address known as the Exchange UM (EUM) proxy address. The EUM proxy for a given user is essentially that user's extension number coupled with the Fully Qualified Domain Name (FQDN) of the dial plan that hosts the user. For example, the EUM proxy address for the user who owns extension 1006 in the "pa-hq.corp.contoso.local" dial plan pairs the extension 1006 with that dial plan name.
Exchange tries to match the CLID data against a number of sources. It begins by constructing an EUM address with the CLID information, then checking it against the called party's dial plan. This step catches the case where a user in one dial plan calls another user in the same dial plan. If that match fails, the constructed proxy is tested against other available dial plans in the organization, covering the case where a user in one dial plan makes an internal call to a user on another dial plan.
If that method doesn't find a match, the caller's EUM proxy is checked to see if it looks like a valid Session Initiation Protocol (SIP) Uniform Resource Identifier (URI), such as sip:email@example.com. If it contains an @ character, it's checked against a list of known SIP proxy addresses. If it's not found, or if the EUM proxy contains a + character, it's normalized to E.164 format and checked again, this time using the msExchUMCallingLineIDs attribute in AD. You have to manually populate this attribute by using the Set-User cmdlet with the -UMCallingLineIds parameter; using this attribute is a useful way to store a phone number that can be matched against callers without being visible to users. If there's no match, the msRTCSIP-Line attribute is checked; this attribute is present only if OCS 2007 (or a later version) is installed and the caller has an extension registered with OCS.
If the number still isn't matched, the next step is to look it up in AD. Every user object in AD has several potential phone numbers: There are attributes named telephoneNumber, otherTelephone, homePhone, mobile, and more, but they're not indexed, meaning that searching against them is inefficient. So, you can control whether AD matching is possible by setting the AllowHeuristicADCallingLineIdResolution attribute on the dial plan. This attribute is enabled by default in Exchange 2010 SP1.
Although searching AD sounds like a great solution, there's one problem with it: If you haven't used a consistent and correct format to put the numbers in, you won't get predictable, or even necessarily correct, results. Microsoft suggests you use E.164 format for these numbers even if you're not using OCS or another SIP system that requires it. You can use the -NumberingPlanFormats parameter on the dial plan to specify a mask that Exchange will use in converting whatever format you're using into something else.
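To see why a consistent format matters, consider a minimal Python sketch that reduces differently punctuated numbers to a single E.164 string. The function name and the default country code of 1 are assumptions made for illustration; this isn't how Exchange or the numbering plan masks work internally, just a demonstration that matching only succeeds once both sides share one canonical form.

```python
import re


def to_e164(raw: str, default_country_code: str = "1") -> str:
    """Normalize a loosely formatted phone number to E.164 (illustrative only)."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dots, dashes, and parentheses
    if raw.strip().startswith("+"):
        return "+" + digits          # number already carries its country code
    # Assumption: numbers without a leading + are national numbers.
    return "+" + default_country_code + digits


# Three ways the same number might have been typed into a directory.
variants = ["(425) 555-0100", "425.555.0100", "+1 425 555 0100"]
normalized = {to_e164(v) for v in variants}
```

Here all three variants collapse to one value, so a lookup keyed on the normalized form finds the user regardless of how the number was originally entered.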
Lastly, the caller EUM proxy is checked against the user's personal Contacts folder if Contacts resolution is turned on for the dial plan. Exchange 2010, sadly, doesn't provide a way to match caller ID data against public folders with contact data, though.
At this point, one of two things is true. If Exchange found a matching user, the user's name will be displayed, and whatever phone numbers could be located will be included in the voicemail or missed call notification message. If any number matches, all of the user's numbers will be displayed; for example, if the user's work number matches, the resulting message (with its click-to-call links) will include all numbers defined for that user. If the number couldn't be matched, only the number will be displayed.
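The resolution order described in this section amounts to a chain of lookups in which the first match wins. The following Python sketch shows that pattern under stated assumptions: the dictionaries stand in for the real directories, the names and numbers are invented sample data, and the chain is abbreviated (it omits the SIP proxy, msRTCSIP-Line, and heuristic AD steps).

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each resolver takes the caller's number and returns a display name or None.
Resolver = Callable[[str], Optional[str]]


def resolve_caller(
    clid: str, resolvers: List[Tuple[str, Resolver]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each source in order; return (name, source) for the first match."""
    for source, lookup in resolvers:
        name = lookup(clid)
        if name is not None:
            return name, source
    return None, None  # unresolved: only the number itself can be shown


# Hypothetical stand-ins for the real data sources.
same_dial_plan: Dict[str, str] = {"1006": "Pat Lee"}
other_dial_plans: Dict[str, str] = {}
um_calling_line_ids: Dict[str, str] = {"+14255550100": "Kim Roy"}
personal_contacts: Dict[str, str] = {"+14255550199": "Dentist"}

chain: List[Tuple[str, Resolver]] = [
    ("called party's dial plan", same_dial_plan.get),
    ("other dial plans", other_dial_plans.get),
    ("msExchUMCallingLineIDs", um_calling_line_ids.get),
    ("personal Contacts", personal_contacts.get),
]
```

Calling `resolve_caller("1006", chain)` stops at the first source, whereas an unknown number falls through every source and comes back unresolved.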
Text Preview of Voicemail
Voice Mail Preview is a cool new Exchange 2010 feature that attempts to take some of the mystery out of voicemail. Think about it: When you get a voicemail, you might know who it's from (or you might not, depending on whether caller ID matching worked), but you have no way to know whether the message is important or not without listening to it. Depending on where you are, listening to the message could be difficult or impossible; in many situations, you can peek at the contents of an email message more easily than you can listen to a voicemail. Voice Mail Preview bridges this gap by performing a speech-to-text transcription of the message.
The first question most people have is whether the text preview is accurate. The answer: sometimes yes, sometimes no. Transcription accuracy depends on many factors, including sound quality, how fast the speaker talks, and whether they have a strong accent.
One complaint often leveled at Exchange 2007 UM is that it was mostly English-only. Therefore, the second question often asked about Voice Mail Preview is whether it is limited to US English. The answer here is a definite no; each UM language pack includes support for Voice Mail Preview. However, the language used for transcription is based on the language set for the dial plan of the called party's extension. Therefore, if a French-speaking user calls a UM user whose extension is in a dial plan that's set to Latin American Spanish, Exchange attempts to transcribe the French message as Spanish, with hilarious and inaccurate results. This restriction aside, having multilingual capability for this feature is a welcome addition.
Legacy voicemail systems have long offered the ability to mark a message as private. Private messages can't be forwarded to other users. Exchange 2010 offers a similar capability: When you leave a voicemail message, you can choose to mark it as private, and Exchange applies an Active Directory Rights Management Services (AD RMS) template to it that prevents clients from forwarding it. Of course, this feature requires you to have AD RMS installed and running, which is no small task.
You can enable protected voicemail separately for authenticated and unauthenticated callers, and you can even force all voice messages to be protected. You can't, however, control which AD RMS template is applied; you always get the Do Not Forward template.
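The combination of settings described above can be modeled as a simple decision function. The Python sketch below is an assumption-laden illustration: the class, field, and mode names are hypothetical and aren't the actual policy parameters, but the logic mirrors the three options the text describes (no protection, protection of messages marked private, or protection of everything), set independently for the two caller types.

```python
from dataclasses import dataclass
from enum import Enum


class Protect(Enum):
    NONE = "none"        # never protect messages
    PRIVATE = "private"  # protect only messages the caller marks private
    ALL = "all"          # force protection on every message


@dataclass
class VoicemailPolicy:
    """Hypothetical policy holding one mode per caller type."""
    authenticated: Protect = Protect.NONE
    unauthenticated: Protect = Protect.NONE


def should_protect(
    policy: VoicemailPolicy, caller_authenticated: bool, marked_private: bool
) -> bool:
    """Decide whether the Do Not Forward template applies to a new message."""
    mode = policy.authenticated if caller_authenticated else policy.unauthenticated
    if mode is Protect.ALL:
        return True
    return mode is Protect.PRIVATE and marked_private


policy = VoicemailPolicy(authenticated=Protect.PRIVATE, unauthenticated=Protect.ALL)
```

With this sample policy, an authenticated caller's message is protected only when marked private, whereas every message from an unauthenticated caller is protected.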
Protected voice messages can be played only by compatible clients. By default, Outlook Web App (OWA) 2010 and Outlook 2010 can play them by using their built-in inline media player component. However, if you prefer, you can force users to listen to protected messages with Outlook Voice Access or Play on Phone; this choice is both the most secure route and the most useful, given that Mac OS X and Linux clients as well as mobile devices won't be able to play these messages in unaltered form.
Message Waiting Indicator
Legacy voicemail systems offer another feature that Exchange 2007 didn't include: a method of signaling users that a message is waiting. The MWI feature varies from PBX to PBX. Some systems light a lamp on the phone; others turn on a stutter dial tone.
Exchange 2010 supports MWI notification; when you enable it, the Exchange UM server sends a notification to the PBX whenever a covered mailbox gets a new voicemail. The notification itself is generated when the contents of the Voice Mail search folder (which is automatically created on user mailboxes when they're UM-enabled) change. The UM server subscribes to that folder, so any time a voicemail arrives or is removed, the UM server receives a notification and can send the appropriate notification to the PBX.
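The subscription mechanism just described is essentially an observer pattern, which the following Python sketch illustrates. All class and function names are hypothetical, and the rule that the indicator is on whenever the folder holds at least one message is a simplifying assumption rather than documented Exchange behavior.

```python
from typing import Callable, List, Set


class VoiceMailSearchFolder:
    """Toy stand-in for the per-mailbox Voice Mail search folder."""

    def __init__(self) -> None:
        self._messages: Set[str] = set()
        self._subscribers: List[Callable[[int], None]] = []

    def subscribe(self, callback: Callable[[int], None]) -> None:
        self._subscribers.append(callback)

    def add(self, message_id: str) -> None:
        self._messages.add(message_id)
        self._notify()

    def remove(self, message_id: str) -> None:
        self._messages.discard(message_id)
        self._notify()

    def _notify(self) -> None:
        # Every content change is pushed to each subscriber.
        for callback in self._subscribers:
            callback(len(self._messages))


sent_to_pbx: List[str] = []


def um_server_on_change(message_count: int) -> None:
    """Translate a folder change into an MWI notification for the PBX."""
    sent_to_pbx.append("MWI on" if message_count > 0 else "MWI off")


folder = VoiceMailSearchFolder()
folder.subscribe(um_server_on_change)
folder.add("vm-1")     # a voicemail arrives
folder.remove("vm-1")  # the voicemail is removed
```

After the arrival and removal above, `sent_to_pbx` holds one "on" notification followed by one "off" notification, which is all the UM server is responsible for.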
What happens to it after that is up to the PBX—and whatever devices are connected to it. The convenience of having voicemail messages in the same Inbox as everything else seems to remove the need for this feature, but it's critical to some users; I've seen more than one Exchange 2007 UM deployment put on hold until a third-party MWI solution could be integrated.
Building on the Foundation
Exchange 2010 UM builds on the technical foundation of Exchange 2007 UM and adds some impressive new features. The greatly expanded language support, the ability to keep voice messages private, and Voice Mail Preview add convenience and flexibility to an already useful product.