Cloud computing services from both Microsoft and Google experienced outages in the past several days, leading to predictable and tired Chicken Little commentaries from the tech blogosphere. But these outages don't mean that the cloud computing model is broken—just that there's still some maturing that needs to happen.
Microsoft's online services infrastructure experienced an issue last Wednesday that affected some customers in North America, causing interruptions in multiple cloud-based services, including various Windows Live services, for a few hours. That same day, Google's cloud productivity service, Google Docs, went offline for about an hour. Both companies explained the respective outages over the next few days and pledged to prevent these issues from happening again.
Microsoft said it experienced network connectivity issues at one of its data centers that serves some North American customers. "Microsoft became aware of a Domain Name Service (DNS) problem causing service degradation for multiple cloud-based services," a Microsoft representative explained. "A tool that helps balance network traffic was being updated, and for a currently unknown reason, the update did not work correctly. As a result, the configuration was corrupted, which caused service disruption. We are continuing to review the incident."
"The [Google Docs] outage was caused by a change designed to improve real-time collaboration within the document list," Google explained via its Google Enterprise Blog. "Unfortunately, this change exposed a memory-management bug which was only evident under heavy usage ... We have assembled a list of steps which will reduce the chance of a future event, decrease the time required to notice and resolve a problem, and limit the scope which any single problem can affect."
So, what can we learn from these separate, high-profile cloud computing outages?
Aside from the usual hand-wringing, I think it's important to note that we're still in the early days of cloud computing, and that individuals and companies that rely on such services need some sort of hybrid strategy in place so that they can work offline when outages do occur. More to the point, however, cloud computing puts outages in public view for all to see, a situation that should result in quicker resolutions and better-quality updates over time than was the case with traditional, locally installed software.
Put another way, when a cloud service goes down, everyone can hear you scream. But that's an advantage of this type of platform, not a disadvantage. And it means that problems get fixed quickly.