Cloud computing is a very popular topic, but when I ask most IT professionals to explain it, I always encounter varying degrees of confusion. This confusion is even prevalent regarding Azure, Microsoft’s cloud computing platform. Because Azure fits into the middle tier of the cloud computing service model—Platform as a Service (PaaS)—it’s very developer focused, rather than IT pro focused. This doesn’t mean, however, that cloud computing won’t be vitally important for IT pros to understand for their future. In an effort to help explain what Microsoft is doing in cloud computing, I sat down at Microsoft’s 2011 MVP Global Summit with Windows IT Pro contributing editor, Microsoft technical fellow, and old friend Mark Russinovich to have him explain what Windows Azure is and how it’s important to Microsoft’s future. We also discussed the latest updates to Mark’s famous Sysinternals tools, as well as the completion of a personal project he’s been working on for a long time. Karen Forster, director of technical communications at Microsoft and former Windows IT Pro editorial director, joined us for the conversation.
Well-known among IT pros as the OS researcher who developed unique utilities for Windows by reverse-engineering the Windows OS, Mark joined Microsoft in 2006 when the company purchased his Winternals software company. As one of only 20 technical fellows throughout Microsoft, Mark occupies one of the highest individual contributor positions in the company—the technical track equivalent of the management track’s corporate vice president. An interesting aspect of Mark’s role as a technical fellow is that because he has no direct reports, he must accomplish his goals through his considerable influence alone. After leaving the Windows division, where he was involved in the planning of Windows 7 and its successor, Mark moved to the Azure team because he recognized the growing importance of both cloud computing and mobile computing trends. On the Azure team, he works with team leaders, as well as developers in various Azure divisions. He focuses on the design of the fabric controller, which Mark describes as “the [Azure] kernel, if you think of Azure as an OS—the kernel, which knows how to manage the server hardware and deploys services and defines what an Azure application is.”
Let’s see what Mark had to say about the importance of Azure to cloud computing and Microsoft, where he sees the cloud heading in the future, and Microsoft’s role in moving IT services into the cloud.
Sean Deuby: From the IT pro’s point of view, what exactly is Azure? How does it fit in with Microsoft’s other online properties? Is it truly different, or is it just another “Live” service?
Mark Russinovich: Cloud computing service models include Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). What you’ve seen IT pros focus on is Infrastructure as a Service. So in their own data centers, provisioning servers, provisioning applications on those servers, managing them, monitoring them—Infrastructure as a Service is on-demand multi-tenant access to infrastructure resources.
Canonical examples include Amazon EC2, VMware, and Hyper-V. Those kinds of infrastructure clouds or virtualization platforms let you basically rent someone else’s server and deploy your OS and applications to that server—but other than that, those platforms try to look as closely like your data center as possible, to make it easy to just take your apps, lift them up, put them there, and as much as possible use the same management that you use to manage your on-premises data center applications.
There’s fidelity loss when you’re going to a public cloud like Amazon’s EC2. There’s higher fidelity when you go to a private cloud like VMware vCloud that providers such as Rackspace would provide. Or if someone else is hosting a Hyper-V cloud, you also get high fidelity. What PaaS tries to do is raise the level of abstraction up one level. The benefit of taking it up a level from an IT pro perspective and from a business perspective is that you’re no longer in the business of worrying about provisioning the OS, provisioning the runtimes, and provisioning the database and the other infrastructure services that these traditional server applications use or are built on. The benefit from a developer’s perspective is that you don’t have to worry about any of that stuff.
On top of that, the platform makes it really easy to write a cloud or a 24x7, highly available, highly elastic application. That’s what Azure is about—and PaaS from an Azure perspective for the compute part of it makes it almost brain-dead simple to write an app that’s multi-tier, multi-instance, and has this ability to scale up and down very quickly and be able to stay up 24x7 even in the face of hardware failures or configuration updates, or updates of the service to new versions.
Sean: Doesn’t Azure also support coexistence, meaning the ability to have a hybrid application that’s partly on premises and partly in the cloud?
Mark: Yes. Just to finish discussing PaaS, there’s compute PaaS, which is what Azure has, and then there’s the building-block Platform as a Service, which is all the other services that cloud applications will use to implement functionality for it. If you look at on-premises server applications, a lot of times they have a database back end—which is stood up with a SQL Server instance or pair of instances if you want high availability in the data center. With a cloud, looking at a cloud application, it’s using the same kind of PaaS building blocks to provide that functionality; in this case it would be SQL Azure. And the characteristics I talk about for compute cloud applications—24x7, highly available, highly elastic—apply to those as well. And you pay only for what you use, rather than overprovision, which is another big problem that on-premises has.
This is the problem where if you’re building your own data center and you’re deploying your apps to it, to determine how much hardware you need, you look at the app and ask, “What’s the maximum load this app is going to have?” Around Christmas, it’s like 100 times what it is normally, so we need 100 times the hardware that we do for everyday operations. And in the cloud, because of this pay-as-you-go, highly elastic nature, you pay for the 1 percent you use on a daily basis, and around the holidays you scale it up to 100 times and you pay for that for the time that you need it. Then you just go back down afterwards, instead of having all this wasted capacity.
From an IT pro perspective, this composite type of application—the hybrid application—becomes interesting in the case where, for the most part, I want to run my applications on premises, but around the holidays, when I’ve been paying for 1 percent and I’m monitoring things closely, I want to burst into somebody else’s cloud so that I can take advantage of that elasticity and pay for it only right when I need it, and then come back down into my own—we see customers that are really interested in that type of scenario.
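To put rough numbers on the overprovisioning argument Mark makes above, here is a minimal sketch in Python. Every figure in it is an assumption chosen for illustration (the two-week peak, the flat unit-hour rate, the capacity units); none of it is Azure pricing, and it deliberately ignores the likelihood that rented capacity costs more per hour than owned hardware.

```python
# Hypothetical numbers for illustration only; this is not Azure pricing.
HOURS_PER_YEAR = 365 * 24


def fixed_capacity_cost(peak_units, unit_hour_cost):
    """Cost of running enough capacity for the peak load all year."""
    return peak_units * unit_hour_cost * HOURS_PER_YEAR


def elastic_cost(baseline_units, peak_units, peak_hours, unit_hour_cost):
    """Cost of paying for the baseline most of the year and the peak only when needed."""
    normal_hours = HOURS_PER_YEAR - peak_hours
    unit_hours = baseline_units * normal_hours + peak_units * peak_hours
    return unit_hours * unit_hour_cost


baseline = 1            # everyday load, in arbitrary capacity units
peak = 100              # holiday load: 100 times the everyday load
peak_hours = 14 * 24    # assumption: the peak lasts two weeks
rate = 1.0              # assumption: same cost per unit-hour in both models

fixed = fixed_capacity_cost(peak, rate)
elastic = elastic_cost(baseline, peak, peak_hours, rate)

print(f"provisioned for peak all year: {fixed:,.0f} unit-hours")
print(f"pay as you go:                 {elastic:,.0f} unit-hours")
print(f"ratio:                         {fixed / elastic:.1f}x")
```

Under these assumptions the peak-provisioned data center pays for roughly twenty times as many unit-hours as the elastic deployment. The gap narrows as the peak period gets longer or as the per-hour premium for rented capacity rises.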
Sean: So it’s a conservative way to get your feet wet on public cloud services. You design an application for likely usage—you don’t have to design for maximum capacity, you design for average capacity, and you have a fail-safe off to the cloud for maximum capacity.
Mark: Yes. What we’re talking about is future scenarios that will be enabled by the Windows Azure appliance. The other scenario is hot standby for disaster recovery, where you’ve got the standby of the application up in somebody else’s data center in the cloud, and you’ve got the active one on premises, but you fail over if there’s a problem with the on-premises one. The app is running in the cloud but then talking to resources back on premises, which is something that’s enabled today (with Windows Azure Connect, which lets you actually domain-join the machines in the cloud and also have access to on-premises network resources, basically making them appear as if they’re on your intranet). So Windows Azure public cloud applications can access your on-premises SQL Server database, for example. If you still have lots of data on premises, or data that you don’t want to leave your premises, that connectivity allows this kind of hybrid connection between on-premises stuff and cloud stuff.
Sean: You mentioned Infrastructure as a Service and PaaS. Do you want to say anything about SaaS?
Mark: Azure started out as a platform aimed at internal services only, but once people saw it, they realized, “Hey, we could actually deliver this to the outside world, and they’d probably find it useful and interesting as well.” Windows Azure has become an extremely important platform for Microsoft, because Microsoft is planning to build lots of SaaS offerings (such as)—but what are they going to build them on? They’re going to build them on Windows Azure, so Azure is important for that aspect of the business as well. But as a Platform as a Service, it’s really a platform for SaaS—for building Software as a Service. If you look at an IT pro ISV scenario—a guy writing line-of-business applications—that’s really SaaS, to some extent. It’s a cloud application. But it eventually enables the ability of other people that create these multi-tenant cloud applications that they’re then selling to IT pros.
Sean: So in other words, Microsoft is itself using Azure to build SaaS applications that are the service-enabled versions of the enterprise software you’re selling today.
Mark: Exactly. Actually, that explanation highlights the kind of transition Microsoft is going through right now.
Sean: How important is Azure to Microsoft?
Mark: Microsoft’s new philosophy is “cloud first,” and then ship what we deliver to the cloud in the on-premises box solution, because there’s something drastically different between the way the cloud works and the way that on-premises server software works. On-premises server software is the traditional box model—I get an update every 2 years. And I might hop on that update bandwagon every release, or I might skip a release or two because the old one was good enough for a while.
But the cloud is shipped every month. What’s going on with Windows Azure is that we’re shipping every month. There’s a new version of the fabric controller rolled out across all our data centers once a month. All of the other cloud properties are in the same kind of cadence. New features, major features, might only surface once every 6 months or at some longer cadence. Every one of these incremental updates is fixing bugs and introducing the pieces required to create the functionality that’s going to end up surfacing as a feature that we sell to customers or make available to customers. So it makes total sense for us to say that the cloud will be so important, and we’ll be delivering updates so frequently to cloud-based applications, that updates to the box product will be snapped off the cloud version at regular intervals.
You can see that happening with the SQL Server team. SQL Azure is off in a cloud—it shares the same core with SQL Server, so as the cloud continues to evolve, the result of that evolution is going to make it back into SQL Server for the on-premises version.
You’re probably going to see all ISVs follow this same model.
Sean: Is the fabric controller that you’re working on going to be used as a platform to connect Microsoft SaaS products?
Mark: Yes. For example, System Center will eventually have multiple components that run in the cloud. Windows Intune, which is the client-focused System Center management solution, runs in the cloud. And upcoming System Center cloud-based components will be built on the Windows Azure fabric. So these components will take advantage of PaaS, but they’re really SaaS.
Sean: So you’re building SaaS based on PaaS, which means there’s probably some Infrastructure as a Service going on too.
Mark: Today, Infrastructure as a Service means compatibility with server applications. We’ve got a new programming model in Windows Azure. Typically, most server applications don’t fit that programming model, so you need to have a developer tweak them.
Sean: Part of what makes working on the fabric controller interesting is the scaling of it, where you’ve described it as being analogous to the Windows kernel and controlling resources. So whereas the Windows kernel is the microscopic version, the fabric controller is the macroscopic version. Do you have anything to relate, as far as the scale at which you’re working?
Mark: The data centers have literally on the order of tens of thousands of machines, and we’re operating on hundreds of petabytes of storage.
Sean: And the fabric controller has to be able to seamlessly and efficiently deal with all of that?
Mark: Yes it does. But the fact is that the fabric controller and the system as a whole today don’t have the ability to control these resources on a worldwide scale yet. The scale they’re operating under is within the data center. But we’re enabling those kinds of cross–data center worldwide capabilities, which is part of the challenge of growing it. Today when you deploy an application to Windows Azure, you say, “I want it to be in this region and that region,” but you really mean “this or that data center,” although we don’t really talk publicly about where our data centers are. So you say northeast, and it goes to some data center in the northeast. You can’t say, “I want this application to be in the northeast and the southwest”—but one day you will be able to. Or you’ll be able to say, “Run a hot one in the northeast and have a standby in the southwest—or if it’s going to be in the northeast but if it grows, it can grow into other regions as well.” We still have to have people specify which regions they can run in, because there are all sorts of legal, compliance, and government issues.
Sean: You’re also working with technical fellows John Shewchuk and Vittorio Bertocci on identity issues in Azure. Can you talk about that? What’s going on in terms of identity?
Mark: When you look at Windows Azure, you wonder, “Where’s the identity?” The reason identity would be useful for services, like when an application goes and talks to other services, is visible in what you have to do today when you want your application to talk to Windows Azure storage or SQL Azure or any of these other services that the application uses as building blocks, because each one of those services has its own credentials.
For storage, your access is controlled by your key, so you take the key that you got when you created your Windows Azure storage account and then deploy that key with your application. And then that key is something you have to protect; it becomes something that you have to manage, keep track of, and roll over periodically for security reasons—and that means updating the application, which is a big pain in everyone’s rear. So it would be really nice if you could go to Windows Azure storage and say, “This application can create Windows Azure storage BLOBs, and they can read these BLOBs over here, and they can read this table but not modify it, and they get this amount of storage.” You could deploy your application and it would automatically have access to your storage account. And nobody else would. So that’s one place where identity would be useful.
The other area in which identity is important is granularity of administration. Today with a Windows Azure subscription you get your Live ID, and your Live ID manages your subscriptions—all aspects, including creating your Azure storage accounts, creating your services, staging, deploying, and production—which works great for an organization or an individual that’s solely responsible for the whole end-to-end management of that subscription and those services. But when you take a look at how enterprises are going to want to use Windows Azure—they’re probably going to want one billing relationship with Microsoft, which means one subscription—but then they’re going to have lots of different departments deploying services. And even within a service, different people will be associated with that service: You’re going to have someone who wrote the service, you’re going to have someone who tests the service, and you’re going to have someone who manages and monitors the service in production. So you’re going to want access control over each of those operations. And today, your Live ID is your admin for everything.
Sean: So there’s no granularity to it.
Mark: Right; so that’s what we’re going to do.
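The granular administration Mark describes, as opposed to a single Live ID that can do everything, can be pictured as a mapping from roles to permitted operations. The following is a minimal sketch only: the role names and operation names are hypothetical, loosely taken from the tasks mentioned in the conversation, and do not correspond to any actual Windows Azure or Access Control Service API.

```python
# Hypothetical roles and operation names, chosen for illustration;
# they do not correspond to any actual Windows Azure API.
ROLE_PERMISSIONS = {
    # Today's model: one identity that may do everything.
    "subscription_admin": {
        "create_storage_account",
        "create_service",
        "deploy_staging",
        "deploy_production",
        "monitor",
    },
    # A finer-grained model: each role gets only what its job needs.
    "developer": {"create_service", "deploy_staging"},
    "tester": {"deploy_staging", "monitor"},
    "operator": {"deploy_production", "monitor"},
}


def is_allowed(role, operation):
    """Return True if the given role may perform the operation."""
    return operation in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("developer", "deploy_staging"))     # True
print(is_allowed("developer", "deploy_production"))  # False
```

An unknown role falls through to an empty permission set, so the check denies by default.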
Sean: What about Azure’s Access Control Service? Can you give us a thumbnail of what it’s about?
Mark: Access Control Service is an identity service that can also federate with other identity services. This is one way for your application to get its own identity. You can have an application’s identity provided by Access Control Service. So this is one of those client building blocks. What I’ve just talked about, the need to bring identity into the Windows Azure platform, is going to show up as features inside Access Control Service that are available to IT pros to be able to manage identities, roles, and user accounts.
Sean: Is there anything else you want to make sure the IT pro community understands?
Mark: I think that as far as IT pros being affected by this cloud revolution, I don’t expect that IT pros are going to be the ones running to management and saying, “We need to go into the cloud.” They’re going to be the ones focused on what they’re charged with managing, which is their infrastructure and their applications and the way that things operate today.
But what you’re going to see is business decision makers—the people who are watching the money—saying, “This buying and maintaining 100 times our normal capacity just so we can cope with Christmas, just doesn’t make any sense. It’s such a waste of money. Let’s see if we can optimize. Why don’t you look at this cloud thing, where we basically rent resources?” So you’re going to see that request coming to the IT pros; that’s what they’re going to get.
And IT pros will also see their own company’s developers starting to deploy applications to the cloud, and they’ll find out about it second-hand. In other words, they’re circumventing IT policies, because IT pros—in addition to managing everything and making sure that certain policies are being followed and security is the way it should be and compliance is the way it should be—they’re going to end up discovering after the fact that cloud adoption is already happening in their company. And that’s going to force them into figuring out, “How are we going to deal with the cloud?”
Karen Forster: Do you think IT pros will be asked in the next 5 to 7 years to evaluate what should go to the cloud versus what shouldn’t?
Mark: Absolutely—and how to get there from here. That’s exactly what they’re going to be asked to do. If you just look at the state of the world today in Windows Azure, there are apps you have on premises that IT pros might be asked to move into the cloud. Microsoft IT is going through this exercise. Of course they’re saying, “Let’s see if we can take advantage of Windows Azure.” So Microsoft IT looks across all their applications, and they see which ones have compliance requirements that aren’t satisfied by Windows Azure—those apps are out for now. Then they ask, “Which ones do we not have the source code for and that don’t fit the Windows Azure application model?” Those apps are out because we can’t get them to work in the cloud. Of the remaining ones, what benefit, or return, can we expect on the work it takes to migrate these apps to the cloud? If they’re running fine on premises, and we’re not having to over-capacity them, or provision them for massive swings in capacity, and they’re just humming along, it probably doesn’t make sense. But if, for example, our performance review application—twice a year, when it’s review time—this thing explodes, maybe we should take a look at moving to the cloud even if it takes a lot of work to do so. It’ll probably pay off. IT pros are going to be asked to do that sort of thing.
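The triage Mark walks through above amounts to three filters applied in order: compliance, technical feasibility, and expected return. The sketch below shows one way to write that down. The field names, the example applications, and the load-swing threshold of 10 are all assumptions made for illustration; they are not Microsoft IT's actual criteria.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    compliance_met_in_cloud: bool   # are its compliance requirements satisfied?
    has_source_code: bool           # can a developer adapt it?
    fits_cloud_app_model: bool      # does it already fit the cloud programming model?
    peak_to_normal_load: float      # how far load swings above everyday levels


def triage(app, min_load_swing=10.0):
    """Apply the three filters in order and return a short verdict.

    min_load_swing is an assumed threshold standing in for "is the
    migration work likely to pay off?"
    """
    if not app.compliance_met_in_cloud:
        return "stay on premises: compliance"
    if not app.has_source_code and not app.fits_cloud_app_model:
        return "stay on premises: cannot be adapted"
    if app.peak_to_normal_load < min_load_swing:
        return "stay on premises: little benefit"
    return "candidate for migration"


# Hypothetical portfolio for illustration.
portfolio = [
    App("payroll", False, True, True, 2.0),
    App("legacy-vendor-tool", True, False, False, 1.5),
    App("intranet-wiki", True, True, True, 1.2),
    App("performance-review", True, True, False, 50.0),
]

for app in portfolio:
    print(f"{app.name}: {triage(app)}")
```

The order matters: an application that fails compliance is excluded before anyone spends time estimating its return.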
Sean: So IT pros are sort of evolving into something like a cloud service advisor or a broker.
Karen: I’d also add that identity and security will have to be evaluated.
Mark: Yes, that’s all part of it. For example, what identity requirements does the app have? And is there data that the app uses that needs to stay within our data center? Maybe it’s not even for compliance reasons, just that it’s the core of the business. And we don’t want to risk having it be compromised at all. They want full control and responsibility. They want to be able to point at someone in their org and say, “You’re responsible.”
Sean: Right, because they don’t want to manage it themselves, and IT’s role is to manage information technology—it just doesn’t necessarily have to be on premises. But what we’re talking about is a brand new role. So some jobs will go away, and other new roles will be created.
Mark: And a whole bunch of roles will just morph a little bit. Take a SQL Server database administrator, for example. A database administrator is focused on on-premises databases today, but in the future it will be on-premises and cloud databases—but they’re still databases. The part about the cloud database aspect is that the DBA no longer has to tell the IT department to buy a certain number of servers and install the OS and certain versions of the database and keep it patched. That part is gone from the cloud side of things, but the rest of it is still there.
Karen: How does client-side administration fit into the whole cloud model?
Mark: I think Intune shows where things are going with that. Today IT pros have to go in and install on-premises applications, just like the ones we’ve been talking about, to manage their clients. What Intune does is say it’s no longer an on-premises application—it’s SaaS (System Center SaaS) that’s monitoring and managing my clients and telling me when something’s wrong and trying to fix it automatically when something breaks.
Sean: Intune is a local agent, and it sends data off to the service in the cloud rather than on premises.
Mark: Yes. Like I said at the beginning, there will always be client computing. The question is, How are the clients being managed? How are they being connected? Is it being done through my intranet or through the Internet?
Sean: Can you talk about the importance of on-premises virtualization versus putting everything out in the cloud? Some people say to put it all out in the cloud; some people say to virtualize locally.
Mark: That’s something I haven’t really touched on, which is the Windows Azure platform appliance—a key part of the Azure strategy. This is the ability to take Windows Azure—the hardware that it runs on and the software—and put it in your own data center. Or to have a hosting service provider take it and put it in their data center and then sell it to customers. I think that’s a huge differentiator for us—the fact that we’re going to have this ability because it extends the reach of the platform to anywhere people want it, whereas today, the public cloud is only acceptable to people when it just happens to meet their requirements, because their requirements are very general. The public cloud is addressing kind of the generalist case, like the most generally sought-after certification, or the geographic regions where we have a huge market that justifies us putting a data center in that region—whereas the platform appliance will let hosting service providers create very customized environments for specific niche markets, like following certain certifications and government requirements.
Sean: Is this private cloud in a box?
Mark: It’s PaaS in a box, because it’s different from the VMware cloud, which you can put on premises, which is an infrastructure (IaaS) cloud. Or Hyper-V, which you can put on premises and run with System Center and it’s kind of like your private cloud. There’s actually a difference between buying the appliance and putting it on premises, versus sharing an appliance on a hosting service provider with other customers—the former makes it kind of non-cloudish because you’re paying for the whole appliance. You have a block of capacity rather than having elastic capacity. You lose that when you go away from a multi-tenant model. The hosting service provider can provide a multi-tenant model even in the kind of situation in which they’re addressing the niche markets that have these unique requirements—maybe it’s just for the UK government—but different departments in the UK government can now buy elastic capacity out of this sort-of-semi-public cloud.
Sean: So it’s public PaaS compared with private PaaS—whereas VMware’s would be more private Infrastructure as a Service.
Mark: Yes, if VMware made an appliance or someone sold the VMware appliance as private Infrastructure as a Service. Until now you’ve heard public cloud and private cloud. What people mean when they say public cloud is Windows Azure or Amazon; when they say private cloud it means System Center Virtual Machine Manager cloud or VMware cloud—that distinction, that way of drawing those lines, is going to go away, and it’s just going to become Windows Azure wherever you want it, and whatever anybody else does.
Sean: Do you mean that the distinction between private and public cloud is more differentiated by the vendor in this case because it’s designed to seamlessly interact between public and private—meaning that it’s a hybrid cloud?
Mark: Yes. That’s one of the design principles of the Windows Azure appliance—the same software that runs in the appliance can also run in the public cloud. The other way might not be the case because whoever manages the private appliance might not update it at the same time we update the public cloud. They might be at N-1; so if they want to take advantage of the features in N, they have to upgrade to N (the same version). But the cloud will always be able to run whatever runs in the appliance, to enable the bursting scenario.
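The one-directional compatibility Mark describes can be stated as a single comparison. This is a minimal sketch that assumes releases are numbered sequentially and that a newer release can run anything built for an older one, which is the guarantee Mark attributes to the public cloud; the release numbers are made up.

```python
def can_run(app_requires_release, platform_release):
    """An app built against release N runs on a platform at release N or later."""
    return platform_release >= app_requires_release


public_cloud = 12   # assumption: the public cloud is at release N
appliance = 11      # assumption: the appliance is one release behind (N-1)

# Bursting from the appliance up to the public cloud works.
print(can_run(appliance, public_cloud))   # True

# An app that depends on release-N features cannot move down to the
# N-1 appliance until the appliance is upgraded.
print(can_run(public_cloud, appliance))   # False
```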
Sean: Tell me about Sysinternals. I heard that Bryce retired.
Mark: Yes, Bryce retired. He’s travelling around the world—if you see him, tell him I’m looking for him! He retired in October, so Sysinternals is just me now. It’s sad because with Bryce, I think we got more than twice as much done—so now it’s less than half capacity.
Sean: But you’re working on an Azure tool?
Mark: Yes, I’m working on a storage tool. There’s a good opportunity for a new tool and for me to learn. Actually this is the first WPF .NET tool that I’ve written of any scale—but unfortunately my time has just been completely killed, so it’s about 80 percent done and it’s sitting there. I’m actually looking for someone to help with it.
Sean: Another interesting thing that you’re doing, that I don’t think is well-known yet, is that you’ve written a novel—a cyber thriller called Zero Day. Can you tell us about it?
Mark: After 9/11, computer security malware started to explode. This was back in 2001-2002, when it exploded so badly that Windows XP SP2 became “stop everything and work on security releases for Windows XP.”
When I was looking at the kind of malware that was being distributed then, and the things it was doing, it was very amateur—uncoordinated and unfocused—and yet it was causing all this trouble. And I thought, “If I was a bad guy, what could I do?” I could do a lot of really bad things, and probably get away with it. It would require almost no research on my part. I’d be able to sit in my office at home and unleash this thing, and wreak havoc on the world. What kind of person might love a weapon like that? Well, the same kind of people that did 9/11. So that’s where the idea for the book came from.
I started writing the book in 2005 and finished it right before I joined Microsoft. Then I spent the last 5 or 6 years getting it published, and it’s finally out. It was published by Thomas Dunne on March 15.
Sean: You were kind enough to share some drafts of the book with me, so I look forward to reading the finished version!