Monday, September 28, 2009

The Complexities of Simple

Back in the early, halcyon days of Cloud Computing there was really only one game in town; Amazon Web Services. Whether by luck or cunning, Amazon got a big, hairy head start on everyone else and so if you wanted Cloud-based storage, queues or computation you used AWS and life was, well, simple.

But now that we clearly have a full-blown trend on our hands there are more choices. The good folks from Rackspace picked up on the whole Cloud-thing early on and have leveraged their expertise in more traditional colo and managed servers to bring some very compelling offerings to market. Google, after their initial knee-jerk reaction of trying to give everything away has decided that what they have might be worth paying for and is actually charging people. And Microsoft, always a late riser, has finally rubbed the sleep dirt out of their eyes and finished their second cup of coffee and is getting serious about this cloud stuff. It's clear that this is going to be a big market and there will be lots of competitors.

So, we have choices. Which is good. But it also makes things more complicated. Several efforts are now under way to bring simplicity back in the form of unifying API's or REST interfaces that promise a Rosetta stone-like ability to let your applications speak to all of the different services out there without having to learn all of those different dialects. Sounds good, right?

Well, it turns out that making things simple is far more complicated than most people realize. For one thing, the sheer number of things that need to be unified is still growing rapidly. Just over the past year or so, Amazon alone has introduced:
  • Static IP addresses (via EIP)
  • Persistent block storage (EBS)
  • Load balancing
  • Auto scaling
  • Monitoring
  • Virtual Private Clouds
And that's just one offering from one company. It's clear that we have not yet fully identified the complete taxonomy of Cloud Computing. Trying to identify a unifying abstraction layer on top of this rapidly shifting sand is an exercise in futility.

But even if we look at an area within this world that seems simpler and more mature, e.g. storage, the task of simplifying is actually still quite complex. As an exercise, let's compare two quite similar services; S3 from AWS and Cloud Files from Rackspace.

S3 has buckets and keys. Cloud Files has containers and objects. Both services support objects up to 5GB in size. So far, so good. S3, however, has a fairly robust ACL mechanism that allows you to grant certain permissions to certain users or groups. At the moment, Cloud Files does not support ACL's.

Even more interesting is that when you perform a GET on a container in Cloud Files, the response includes the content-type for each object within the container. However, when you perform a GET on a bucket in S3, the response does not contain the content-type of each key. You need to do a GET on the key itself to get this type of meta data.

So, if you are designing an API to unify these two similar services you will face some challenges and will probably end up with a least common denominator approach. As a user of the unifying API, you will also face challenges. Should I rely on the least common denominator capabilities or should I actually leverage the full capabilities of the underlying service? Should the API hide differences in implementations (e.g. the content-type mentioned above) even if it creates inefficiencies? Or should it expose those differences and let the developer decide? But if it does that, how is it really helping?

I understand the motivation behind most of the unification efforts. People are worried about lock-in. And there are precedents within the technology world where unifying API's have been genuinely useful, e.g. JDBC, LDAP, etc. The difference is, I think, timing. The underlying technologies were mature and lots of sorting out had already occurred in the industry. We are not yet at that point in this technology cycle and I think these unification efforts are premature and will prove largely ineffective.

No comments:

Post a Comment