Cloud outages: Just like yours and mine

The outage in Amazon's Elastic Compute Cloud service late last month, which knocked a slew of businesses offline and apparently destroyed some of their data for good, was triggered by a configuration error during a routine network upgrade, the company said.

A network configuration error? For real?

While this incident rightly exacerbates the fears of those already wary of cloud services, cloud champions are countering that the skepticism is overblown and misdirected. Customers bear part of the blame, they argue, if sites go dark when their cloud services fail; customers have to shoulder more responsibility for staying online when they entrust their business to a cloud provider. After all, they say, computing is a complicated business and things happen.

Things do happen, of course. It would be tough to find an IT operation that hasn't experienced a network configuration error. Computing errors can be difficult and expensive to avoid, and difficult and expensive to fix. And this, ironically, is one of the reasons companies decide to move services to the cloud. Doesn't it stand to reason that those whose business it is to provide these services are more capable of handling them, and therefore less likely to make configuration errors during routine network upgrades?

Apparently not. If customers want guaranteed uptime, cloud proponents are saying, they need to take matters into their own hands. They need to keep sophisticated expertise in-house to work with the provider, deploy their own third-party load-balancing technologies, and have their own failover strategies in place.

I guess if you're running a small business or a start-up, this is the best you can ask for. But for any enterprise capable of handling these services in-house, it's looking more and more sensible to keep them there. Why pay someone else to make the same mistakes you can make on your own? Especially if you have to invest in and manage much of the expertise and technology yourself anyway?

If cloud service providers hope to make greater inroads in the enterprise market, they need to do more to prevent outages. Amazon said it "will audit [its] change process and increase the automation to prevent this mistake from happening in the future." Sounds like a good place to start. - Caron