Tuesday, April 28, 2009

Cloud Lock-In. Not your father's lock-in.

There seems to be a lot of angst about the risks of lock-in with cloud computing. I think there are some real issues to be concerned about, but most of the discussion seems to be centered on APIs, and I think that's the wrong focus.

First, let's define what we mean by lock-in. This description from Wikipedia provides a good, workable definition:

In economics, vendor lock-in, also known as proprietary lock-in, or customer lock-in, makes a customer dependent on a vendor for products and services, unable to use another vendor without substantial switching costs. Lock-in costs which create barriers to market entry may result in antitrust action against a monopoly.

In the world of software development and IT, examples of things that have caused lock-in headaches would be:
  • Using proprietary operating system features
  • Using proprietary database features
  • Using hardware which can only run a proprietary OS
So, a typical scenario might be that you have a requirement to develop and deploy some software for internal use within your company. You do the due diligence and make your choices in terms of the hardware you will buy, the OS you will use, the database, etc. and you build and deploy your application.

But in the process of doing that, you have used features in the operating system, database, or other components that are unique to that vendor's product. There may have been good reasons to do so at the time (e.g. better performance, better integration with development tools, better manageability), but because of those decisions the cost of changing any significant component of that software becomes too high to be practical. You are locked in.

In that scenario, the root cause of the lock-in problem seems to be the use of proprietary APIs in the development of the application, so it kind of makes sense that the focus of concern in Cloud Computing Lock-In would also be the APIs. Here's why I think the Cloud Service case is different:
  • Sunk Cost - While the use of proprietary APIs in the above example represents one barrier to change, a more significant barrier is actually the sunk cost of the current solution. In order to deploy that solution internally, a lot of money had to be spent upfront (e.g. hardware, OS server and client licenses, database licenses). Moving the solution off the locked-in platform involves not only a considerable rewrite of the software (OpEx) but also new capital spending (CapEx) and a potential write-down of current capital. In the case of a Cloud Service, these sunk costs aren't a factor. The hardware and even the licensing costs for software can be paid by the hour. When you turn that server off, your costs go to zero.
  • Tight Coupling vs. Loose Coupling - Even if you focus only on the APIs and the rework necessary to move the solution to a different platform, the fact that Cloud Computing services focus on REST and other HTTP-based APIs dramatically changes the scope of the rework when compared to moving from one low-level, tightly-coupled API to another. Code that interacts with Cloud Services over HTTP tends to be more abstracted and loosely coupled, which makes it much easier to get it working with another vendor's Cloud Service.
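
To make the loose-coupling point a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: BlobStore, InMemoryStore and migrate are hypothetical names, not any vendor's API, and the in-memory class simply stands in for a real adapter that would wrap a vendor's HTTP calls. The idea is that if the application only ever talks to a small interface like this, switching vendors means writing one new adapter rather than reworking the application.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class BlobStore(ABC):
    """Hypothetical minimal storage interface the application codes against."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None:
        """Store data under key, overwriting any existing value."""

    @abstractmethod
    def get(self, key: str) -> bytes:
        """Return the data stored under key."""

    @abstractmethod
    def keys(self) -> List[str]:
        """Return all keys currently in the store."""


class InMemoryStore(BlobStore):
    """Stand-in adapter. A real one would wrap a vendor's HTTP API."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def keys(self) -> List[str]:
        return sorted(self._blobs)


def migrate(source: BlobStore, target: BlobStore) -> int:
    """Copy every object from source to target; return the number copied."""
    copied = 0
    for key in source.keys():
        target.put(key, source.get(key))
        copied += 1
    return copied


if __name__ == "__main__":
    old_vendor = InMemoryStore()
    new_vendor = InMemoryStore()
    old_vendor.put("reports/april.csv", b"a,b,c")
    old_vendor.put("reports/may.csv", b"d,e,f")
    print(migrate(old_vendor, new_vendor), "objects copied")
```

Note that the code change here is small, but migrate still has to move every byte from one store to the other, and that is where the real cost shows up.
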
To see what the real lock-in concern is with Cloud Services, think about where the real pain would be in migrating a large application or service from one vendor to another. For most people, that pain will be around the data associated with that application or service. Rather than sitting in your data center, it now sits in that vendor's cloud service, and moving it, especially in large quantities, will present a real barrier.

So, how do you mitigate that concern? Well, you could try to keep local backups of all data stored in a service like S3 but for large quantities of data that becomes impractical and diminishes the value proposition for a data storage service in the first place. The best approach is to demand that your Cloud Service vendors provide mechanisms to get large quantities of data in and out of their services via some sort of bulk load service.
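
A quick back-of-the-envelope calculation shows why pulling large data sets out over the network gets impractical. This is only a sketch: the function name transfer_days is made up for this post, and the 10 TB, 100 Mbps and 80% utilization figures are illustrative assumptions, not measurements from any provider.

```python
def transfer_days(data_terabytes: float,
                  link_megabits_per_sec: float,
                  utilization: float = 0.8) -> float:
    """Estimate the days needed to move data over a network link.

    Uses decimal units (1 TB = 10**12 bytes, 1 Mbps = 10**6 bits/s).
    utilization is the assumed fraction of the link actually achieved.
    """
    total_bits = data_terabytes * 1e12 * 8
    effective_bits_per_sec = link_megabits_per_sec * 1e6 * utilization
    seconds = total_bits / effective_bits_per_sec
    return seconds / 86400  # seconds per day


if __name__ == "__main__":
    # Assumed scenario: 10 TB over a 100 Mbps link at 80% utilization.
    print(f"{transfer_days(10, 100):.1f} days")
```

Under those assumptions the transfer takes roughly 11.6 days of continuous copying, which is why some kind of bulk load and unload mechanism matters so much more than the shape of the API.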

Amazon doesn't yet offer such a service, but I was encouraged by this thread on their S3 forum, which suggests that AWS is at least thinking about the possibility of such a service. I encourage them and other Cloud Services vendors like Rackspace/Mosso to make it as easy as possible to get data in AND out of their services. That's the best way to minimize concerns about vendor lock-in.


  1. You are right about how different lock-in is in the case of the cloud. How about cloud technology lock-in? E.g. developing a nifty app for the Google App Engine datastore with the assumption that it's consistent, and then moving that to SimpleDB, which is only eventually consistent? I am more worried about these kinds of scenarios. Or are these gonna be rare?

  2. Good point. I think that's another example of how the things that might create issues are not the same kinds of things you probably worried about 5 or 10 years ago when you thought about lock-in. The APIs really shouldn't be the focus.