VMware makes a big mistake

It often happens that the top of the class makes a big mistake. Invariably, those at the bottom of the class giggle and enjoy the misfortune. The problem with VMware's recent blunder is that no one is laughing. The world’s virtualization leader, whose hypervisor represents nearly 95% of the market, effectively “forgot” to remove a piece of code from VMware ESX Server 3.5 Update 2 that disables the product once August 12 has passed. This time bomb, apparently inserted during the product's beta phase, was never taken out. As a result, thousands of servers that had been upgraded to the latest version of the software began to malfunction at midnight. The main symptom: new VMs could not be started on the affected servers. In the physical world, this would be the equivalent of a huge number of servers stubbornly refusing to boot. Nothing really serious, just a catastrophe on the scale of a datacenter… The only consolation is that VMs that were already running continue to operate normally until they are restarted.

Despite all this damage, we must recognize that VMware responded quickly and seriously. The vendor officially acknowledged its blunder on August 12 and sent a first urgent message to its customers at 11:30 p.m., explaining the problem and promising a quick solution. The next morning, around 7:20 a.m., the first express patches arrived. All of them required stopping the VMs or moving them to another server via VMotion, pending a permanent fix for ESX Server. None of this will raise a smile among system administrators, who are already under enough stress, and what is still needed is a real solution that eliminates the problem completely.

Moving towards “double-sourcing” strategies

A mistake of this magnitude could have repercussions. In particular, it could make people think about the new vulnerabilities that virtualization introduces. Whatever safeguards users put in place, nothing can protect them against a critical error in the hypervisor code. When such an error occurs, the entire production capacity disappears until a miracle patch is released.

No vendor, whether Microsoft, Sun, Oracle or Citrix, is safe from a similar mishap. It is probably time for customers to think seriously about double-sourcing. VMware will no doubt lose some market share over this, but it may well be double-sourcing that allows virtualization to become truly widespread (after all, companies usually have two suppliers for their equipment).

Note: The VMware mishap invites a final remark on the beauty of activation and deactivation codes, licensing restrictions and other software DRM mechanisms. When a vendor puts them in place, it always seems like a good idea: a more effective way to control the use of its products, to manage licenses better or to fight piracy. The problem is that one day this kind of watchdog always ends up turning against its master. In its obsession with piracy, the commercial software industry has multiplied these mechanisms, openly casting doubt on the honesty and integrity of its customers. Today VMware is paying the price, just as Microsoft did a few years ago with its Windows activation mechanisms. How can these giants be encouraged to reflect? One of the advantages of open source is that it has no such limitations…

Thanks to Christophe Bardy for inspiring this article.
