Virtualization can simplify your identity and access management environment
Most Identity and Access Management (IAM) systems contain a lot of pieces, and though Active Directory (AD) is usually one of those pieces, it isn't the only one. Virtual directory servers have been around for years, but they're experiencing a dramatic increase in popularity thanks to their strengths and the new requirements of cloud computing. What exactly is a virtual directory server? How is it different from AD or a metadirectory server? Why would you want to use one?
Before I talk about virtual directory servers as a solution, let's outline the problem. Not long ago, identity data came solely from within IT. The best-known example is of course AD, which serves as both a network OS and an identity store. The core AD instance is almost always owned by enterprise IT, and its contents are populated by other IT-owned systems such as HR databases.
And who were the consumers of this identity data? IT-owned enterprise applications that typically had some degree of AD integration. They were members of an AD domain, and they used AD security groups to control access to various parts of the application.
All this was great—if you were an application in an AD forest with all your users, or in a forest with trusts set up to give you access to another forest's users. But what about an application in a forest that would never trust the corporate forest, perhaps due to its location (e.g., in a demilitarized zone—DMZ—or entirely outside the corporate firewall)? What if you weren't a Windows application at all? The challenge for companies is how to take a wide range of identity-related data sources and make the consolidated data available to whatever applications need it.
The first solution for these scenarios was the metadirectory server. To understand how a metadirectory server works, start with the "meta" part: to define a meta "anything," repeat the "anything" in its definition. For example, file metadata is data about the data (such as the last modified date), and a metadirectory is a directory of directories. A metadirectory server such as Microsoft's Identity Lifecycle Manager (ILM) collects data from a variety of data sources, via connectors that you configure for each source, into a central repository. (In ILM, this repository is referred to as the metaverse.) Once you have this consolidated identity data, indexed off a unique attribute such as an employee number, you can push some or all of the data associated with an employee to many different repositories and applications simultaneously. These applications therefore all have a consistent core set of identity data (though they might use different sets of attributes), regardless of whether they talk directly to each other.
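The collect-join-push cycle described above can be sketched in a few lines of Python. This is an illustration only, not how ILM is implemented: the connector functions, the sample records, and the attribute names are all hypothetical, and a real connector would query AD, an HR database, or another live source.

```python
from collections import defaultdict

# Hypothetical connectors: each one returns the records held by one source.
def hr_connector():
    return [
        {"employeeNumber": "1001", "givenName": "Ana", "sn": "Ruiz", "department": "Finance"},
        {"employeeNumber": "1002", "givenName": "Ben", "sn": "Cole", "department": "IT"},
    ]

def ad_connector():
    return [
        {"employeeNumber": "1001", "sAMAccountName": "aruiz", "mail": "aruiz@example.com"},
        {"employeeNumber": "1002", "sAMAccountName": "bcole", "mail": "bcole@example.com"},
    ]

def build_metaverse(connectors, join_attribute="employeeNumber"):
    """Import every source and merge records that share the join attribute."""
    metaverse = defaultdict(dict)
    for connector in connectors:
        for record in connector():
            metaverse[record[join_attribute]].update(record)
    return dict(metaverse)

def export(metaverse, wanted_attributes):
    """Copy out only the attributes a given target needs (one-way push)."""
    return {
        key: {attr: person[attr] for attr in wanted_attributes if attr in person}
        for key, person in metaverse.items()
    }

metaverse = build_metaverse([hr_connector, ad_connector])

# Two targets receive different attribute sets from the same consolidated data.
dmz_directory = export(metaverse, ["sAMAccountName", "mail"])
payroll_app = export(metaverse, ["givenName", "sn", "department"])
```

Note that every record from every source is copied into the central store before any target asks for it; that design choice is what the rest of this discussion turns on.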
For example, a popular use of metadirectory services is to populate an AD forest in a DMZ (supporting an Internet-facing app) with all the user IDs in a particular OU in the corporate forest. Security requirements dictate that there can't be a forest trust between the corporate AD forest and the outward-facing DMZ forest. So how do you enable company employees who must administer or use the DMZ forest to use the same credentials they use in the internal corporate forest (i.e., single sign-on, or SSO)? A metadirectory server solves this problem by pushing the appropriate objects and attributes out to the DMZ forest from its metaverse, so the credentials in the corporate and DMZ forests stay in sync, as you see in Figure 1. Note that this is a one-way synchronization; to prevent any possible security compromise in the DMZ forest from making its way to the metaverse, updates to the DMZ forest don’t flow back to the metaverse.
Figure 1 - Metadirectory server solution
Metadirectory servers are powerful, but they’re expensive to purchase, they’re expensive to design and deploy, and they have lots of moving parts to maintain once deployed. A key concept to remember with metadirectory servers is that you're moving identity data into and out of potentially many applications on a scheduled basis (e.g., once a day). Therefore, there’s some latency between when an authoritative data source such as HR makes an update and when all downstream systems have that update. Another challenge that metadirectory servers are ill-equipped to deal with is identity sources and destinations that may no longer be within the enterprise (e.g., cloud identity providers); IT can neither change those sources to fit the metadirectory nor make administrative or technical changes to the endpoints.
Finally, there might be a question of scalability; a metadirectory server moves a lot of identity data around to its various endpoints—even if the endpoints don't need it immediately. Take, for example, an application that has hundreds of thousands of authorized users in its local identity store, which is populated by a metadirectory server. Suppose the application requires the lastLogon attribute from the corporate AD—an attribute that changes regularly. Even if only five users per day log on to this application, all the identity data for all the app's users must be initially pushed to the app. And because its value is updated regularly, lastLogon for many users must also be pushed to the app at each sync cycle. That's a lot of processing cycles that essentially go to waste.
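To put rough numbers on that waste, here is a back-of-the-envelope calculation. Only the five logons per day comes from the example above; the user count and the share of users whose lastLogon value changes between syncs are assumptions chosen purely for illustration.

```python
# Assumed figures, for illustration only.
authorized_users = 200_000        # "hundreds of thousands" of users
changed_fraction_per_day = 0.25   # assumed share whose lastLogon changed since the last sync

# From the example in the text.
logons_per_day = 5

initial_push = authorized_users
daily_sync_updates = int(authorized_users * changed_fraction_per_day)

# At most one pushed value per logon is ever consulted by the application.
unused_updates_per_day = daily_sync_updates - logons_per_day

print(f"Initial load: {initial_push:,} user records")
print(f"Pushed at each daily sync: {daily_sync_updates:,} lastLogon values")
print(f"Consulted by the app per day: at most {logons_per_day}")
print(f"Unused updates per day: at least {unused_updates_per_day:,}")
```

Under these assumptions, the sync pushes 50,000 values a day so that the application can read no more than five of them.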
The best way to understand a virtual directory server is to follow a process that’s similar to the way I dissected the metadirectory server: Take a good definition of virtualization (I like Edwin Yuen's definition, which says that “virtualization is simply isolating one computer resource from the other resources”) and apply it to a directory server. At its simplest, a virtual directory server isolates the multiple identity sources at its back end so that they appear as a single virtual directory to the applications that access the server at its front end.
In contrast to a metadirectory server (which presents a unified view from multiple identity sources by collecting all the sources' data into one big metadirectory), a virtual directory server has no equivalent database; it collects only the data that's required, and only when it's needed. (A cache can be used to improve performance, but nothing on the metaverse scale.) It does this by issuing real-time queries to the appropriate data sources and consolidating the data returned into a single view before presenting it to the applications that use the server. The apps don't know this identity data view is collected from different data sources; the virtual directory server isolates, or abstracts, the heavy lifting of querying each source for the particular bits of identity data that a given application needs.
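The difference between a performance cache and a metaverse is easy to see in a sketch. The Python below is a minimal illustration, not any vendor's implementation: the class name, the 60-second lifetime, and the query_sources stand-in are all hypothetical. The cache holds only entries that some application has actually asked for, and only for a limited time.

```python
import time

class CachedLookup:
    """Wrap an on-demand lookup function with a small time-limited cache."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch        # function that queries the real sources
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # user_id -> (expiry_time, result)
        self.backend_calls = 0     # counts how often the sources were queried

    def get(self, user_id):
        now = self._clock()
        cached = self._entries.get(user_id)
        if cached is not None and cached[0] > now:
            return cached[1]       # still fresh: no back-end query needed
        self.backend_calls += 1
        result = self._fetch(user_id)
        self._entries[user_id] = (now + self._ttl, result)
        return result

# Hypothetical stand-in for the real-time queries against the back-end sources.
def query_sources(user_id):
    return {"uid": user_id, "mail": f"{user_id}@example.com"}

# A controllable clock makes the expiry behavior easy to demonstrate.
fake_now = [0.0]
lookup = CachedLookup(query_sources, ttl_seconds=60.0, clock=lambda: fake_now[0])

first = lookup.get("aruiz")    # goes to the sources
second = lookup.get("aruiz")   # served from the cache
fake_now[0] = 61.0
third = lookup.get("aruiz")    # entry expired, goes to the sources again
```

A short lifetime keeps the view close to real time; a longer one trades freshness for fewer back-end queries.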
Figure 2 shows an example of the most common virtual directory server scenario: simplifying access to a messy identity environment. In this example, an enterprise has a web service that employees can use inside or outside the company. This enterprise has acquired another company; that company's AD forest (still containing its employee accounts) doesn’t yet have any trust established with the corporate forest, but the newly acquired company's employees must be able to access the web service. The web service also uses attributes from a custom database. This would be a messy problem to solve with a metadirectory server, involving attribute synchronization with two potentially large forests.
Figure 2 - Common virtual directory server scenario
Using a virtual directory server, the solution is pretty straightforward. The virtual directory server is configured to provide the web service with a view that contains the service's needed attributes. The service issues a standard (e.g., LDAP) query to the virtual directory server, which executes a real-time query to its various sources, consolidates the replies into one response, and returns the response to the service, which can use it for authentication or authorization. The virtual directory server provides a layer of abstraction between the web service and its data sources, and the only data that goes over the wire is what's required for a given transaction.
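The query flow for this scenario can be sketched as follows. Everything here is a hypothetical stand-in: the three dictionaries take the place of live LDAP and SQL queries against the corporate forest, the acquired forest, and the custom database, and the user names and attribute names are invented for the example.

```python
# Stand-ins for the three back-end sources in the scenario.
CORPORATE_FOREST = {
    "aruiz": {"mail": "aruiz@corp.example.com", "memberOf": ["WebServiceUsers"]},
}
ACQUIRED_FOREST = {
    "bcole": {"mail": "bcole@acquired.example.com", "memberOf": ["WebServiceUsers"]},
}
CUSTOM_DATABASE = {
    "aruiz": {"costCenter": "4100"},
    "bcole": {"costCenter": "7300"},
}

def search_forest(forest, user_id):
    """Return a copy of the user's entry, or None if this forest has no such user."""
    entry = forest.get(user_id)
    return dict(entry) if entry is not None else None

def virtual_search(user_id, requested_attributes):
    """Query the sources on demand and return one consolidated entry."""
    entry = None
    for name, forest in (("corporate", CORPORATE_FOREST), ("acquired", ACQUIRED_FOREST)):
        entry = search_forest(forest, user_id)
        if entry is not None:
            entry["sourceForest"] = name   # record which forest answered
            break
    if entry is None:
        return None                        # user unknown to both forests
    # Add the attributes that live only in the custom database.
    entry.update(CUSTOM_DATABASE.get(user_id, {}))
    # Return only what the calling service asked for.
    return {attr: entry[attr] for attr in requested_attributes if attr in entry}

result = virtual_search("bcole", ["mail", "costCenter", "sourceForest"])
missing = virtual_search("nobody", ["mail"])
```

The web service sees a single entry for bcole and never needs to know that the mail address came from the acquired forest and the cost center from the database.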
An emerging scenario that plays to virtual directory server technology's strengths is cloud computing and the enterprise. To securely provide access to a cloud service such as Salesforce.com, a federated trust must be established between the identity provider (your enterprise) and the service provider (Salesforce). You accomplish this by deploying an on-premises federation server such as Active Directory Federation Services (AD FS) or PingFederate, which performs LDAP queries against your identity environment and constructs tokens to provide to Salesforce. You could also use an Identity as a Service (IDaaS) provider, such as Symplified or Okta, to handle the federation for you.
Regardless of how you provide federation, the challenge for many businesses is that their identity environment isn't cohesive. If you plan to use one federation service across the company instead of one per isolated identity store, you must provide one endpoint for the federation service to use. Figure 3 shows how a virtual directory server can work to simplify your identity federation architecture.
Figure 3 - Simplifying your identity federation architecture
Speaking of federation services, they're becoming more common as a component of IAM systems; Radiant Logic now offers a RadiantOne Cloud Federation Service product that combines a virtual directory server with a federation service. In principle, combining the two means one fewer separate component to deploy and maintain in your IAM infrastructure.
Several years ago, in "Virtual Directories Enhance Identity and Access Management Solutions," Gartner analyst (and former Windows IT Pro Senior Technical Editor) John Enck stated, "By year-end 2009, 80 percent of organizations deploying IAM solutions will use virtual directory technology as part of the IAM infrastructure." Has this happened for you? Have you deployed or are you considering a virtual directory server solution? I've created a short survey to check up on virtual directory technology's adoption; please take a few seconds of your time to let me know what you're doing!