Trading time for quality to improve Exchange 2013 updates

Microsoft's "mea maxima culpa" posted to the EHLO blog following yesterday's withdrawal of the MS13-061 security patch must have been embarrassing for a proud engineering group. But the really interesting information is buried in the Q&A section at the end of the post, including an indication that Exchange 2013's quarterly update cadence might just be threatened.

Some interesting information was included in the Q&A portion of the Exchange team’s post following the MS13-061 fiasco. As you might recall, this security bulletin was the first test of the new servicing model for Exchange 2013. All went well until the patch knocked out the Microsoft Exchange Search Host Controller service and brought content indexing to a crashing halt. Microsoft recalled and reissued MS13-061 after fixing the problem.

This wasn’t the first problem with an update for Exchange. It probably won’t be the last either. What’s upsetting many customers is that no improvement seems to have occurred over the last 18 months as we have had a succession of updates and patches issued and withdrawn. Microsoft promised that the new servicing model would be much better because “the same code is already deployed in the Exchange Online service and has been validated against millions of mailboxes.” In other words, Office 365 would have discovered any lingering bugs that lurked inside an update before customers even had a chance to install the code.

Given MS13-061, you would be forgiven for doubting this statement. However, it still holds true because MS13-061 is not a cumulative update. It’s a patch and, as Microsoft explains, “Exchange Online does not deploy .msp patches into the environment; instead, Exchange Online deploys new full builds of the product (cumulative updates, if you will) on a regular release cadence.” What this means is that the servers running Exchange Online are regularly taken offline, reduced to bare metal, and completely reinstalled with the latest and greatest software, so there’s no necessity to apply security patches. Reimaging servers makes a lot of sense when you have to manage tens of thousands of servers. There’s no way that letting Windows Update do its stuff would work inside such a massive automated datacenter environment.

So Microsoft couldn’t detect the problem within Exchange Online. But they run some on-premises Exchange too, don’t they? The answer is that they do, in the famous “dogfood” environment that is specifically designed to allow Microsoft engineers to enjoy the fruit of their labors by using their own code in production. Microsoft regularly updates dogfood with new builds of Exchange. Alas, dogfood didn’t come to the rescue here either because Microsoft did not deploy MS13-061 into the dogfood environment. To be fair, Microsoft admits this oversight and says “Unfortunately, this security update did not get deployed into our dogfood environment prior to release.” Given the previous history of problematic updates, such an omission is curious, to say the least.

At the end of the post, Microsoft poses a question that many customers would have asked:

You have told us time and time again that you were going to improve your testing procedures, and yet each time you have to tell us that you missed something. When will it end?

It’s a horrible situation for an engineering organization to have to answer a question like this because of the implicit admission that problems exist in their testing procedures “time and time again”.

That being said, Microsoft did as well as they could in answering the question. The most interesting comment is “we have recently made the decision to delay the release of Exchange 2013 RTM CU3 by several weeks to ensure that we have enough run time testing within our dogfood environment.”

Microsoft admits that additional testing means that the quarterly update cadence to which they aspired for Exchange 2013 might have to change. I think everyone who uses Exchange 2013 will breathe a deep sigh of relief at the prospect of higher quality updates. I would certainly trade time for quality any day. Achieving quality in releases and updates is just about the only way the Exchange development team can now rebuild its reputation.

Follow Tony @12Knocksinna

Discuss this Blog Entry 7

on Aug 15, 2013

Tony

As posted once CU strategy was released "I cannot wait for fun to begin" ... what can I say ?
Somehow I cannot blame the poor engineering crew; whoever came up with the CU cycle idea has imposed unnecessary pressure to release "something". This actually may have positive side effects for Microsoft ... look, Office365 was not affected, right? This is the conspiracy fun part ...
The whole Exchange 2013 saga:
Product released = no coexistence.
CU strategy = see above.
Testing ... seriously nobody has time to test such a small patch on test system ? Automated testing will only get you so far.
This shows that cooperation within Exchange team is broken, or they do not understand their product.
I am not familiar with the inner workings of MS, but the last successful release and product maintenance was the Windows 7 and 2008 line. Maybe Steve S could fix all this. On-prem customers imho should stick to what works and forget about CUs unless absolutely necessary, and test, test, test.
Or move to Office365 and relax.

on Aug 15, 2013

@ Keruzam
I can say this: Exchange on-premises customers should always stay ONE SP behind, i.e., if today we have Exchange 2010 SP3, on-premises customers should run Exchange 2010 SP2 in production and test SP3 in the lab before production release.

on Aug 15, 2013

@Dart_Vader

I am on 2010 SP2 with RU6 ;)

on Aug 15, 2013

@Dart_Vader

Nahh... going to Office365 or hosted.
http://m.infoworld.com/t/microsoft-windows/microsoft-botches-six-windows-patches-in-latest-automatic-update-224988

I don't even understand why we complain ?
what other options are available ?

on Aug 15, 2013

Option is stay behind One SP :-) Exchange 2010 SP2 & SP3 On-Premises is very solid today.

on Aug 16, 2013

I have to withdraw my conspiracy theory about Office365 ... looks like ADFS updates could have also impacted Cloud ... so the only other reason for last Black Tuesday results must be a mole planted by hpphzr.

on Aug 19, 2013

Dart_Vader

Exchange 2013 is the last one on prem, mark my words. Cloud is upon us ... I am diving in, but first have to SP3 my 2010 ... wish me luck.
To give credit Office Mobile for Android is working great.


What's Tony Redmond's Exchange Unwashed Blog?

On-premises and cloud-based Microsoft Exchange Server and all the associated technology that runs alongside Microsoft's enterprise messaging server.

Contributors

Tony Redmond

Tony Redmond is a senior contributing editor for Windows IT Pro and the author of Microsoft Exchange Server 2010 Inside Out (Microsoft Press) and Microsoft Exchange Server 2013 Inside Out: Mailbox...