As some of you may know, I've spent the past three years or so focused on AWS-related consulting through my own little company, CloudRight. It's been fun and exciting and I feel that I've really had a front row seat for the amazing growth and excitement around cloud computing. But consulting has its downsides, too. After a while the pace of new projects started to lose its lustre and I found myself pining for the fjords, or at least for a bit more focus in my professional life.
So, I'm excited to say that I have joined the development team at Eucalyptus. I like their technology, I like their positioning in the marketplace, I like their commitment to open source but mainly I just really like the team. Everyone there is not only great at what they do, they are also great people and in my experience that's the recipe for a great company. I'm absolutely thrilled to be a part of it.
My main focus at Eucalyptus will be in the area of tools: basically, trying to make sure that all of the capabilities of the core system are easily and consistently accessible to users and administrators. The current Euca2ools command line utilities are a great start but we all feel there is an opportunity to do a lot more.
This is also great news for boto. Euca2ools are built on top of boto so, for the first time, boto will actually be a part of my day job rather than something I try to squeeze in between gigs and after hours. That should mean more frequent and consistent releases and better quality overall.
And now, it's time for the traditional "new job" fish slapping dance...
Elastician
The springy new world of computing.
Friday, July 2, 2010
Sunday, June 13, 2010
Using Reduced Redundancy Storage (RRS) in S3
This is just a quick blog post to provide a few examples of using the new Reduced Redundancy Storage (RRS) feature of S3 in boto. This new storage class in S3 gives you the option to trade off redundancy for cost. The normal S3 service (and corresponding pricing) is based on an 11-nines level of durability (yes, that's 99.999999999%; thanks to Jeff Barr for the correction in the comments below). In order to achieve this extremely high level of reliability, the S3 service must incorporate a high level of redundancy. In other words, it keeps many copies of your data in many different locations so that even if multiple locations encounter failures, your data will still be safe.
That's a great feature but not everyone needs that level of redundancy. If you already have copies of your data locally and are just using S3 as a convenient place to store data that is actively being accessed by services within the AWS infrastructure, RRS may be for you. It provides a much lower level of durability (99.99%) at a significantly lower cost. If that fits the bill for you, the next three code snippets will provide you with the basics you need to start using RRS in boto.
Create a New S3 Key Using the RRS Storage Class
Convert An Existing S3 Key from Standard Storage Class to RRS
Create a Copy of an Existing S3 Key Using RRS
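As a rough sketch (not the original snippets from this post), the three operations named in the headings above might look something like this in boto. The helper function names and the bucket, key and file names are made up for illustration; the boto calls used are new_key, get_key, set_contents_from_filename, change_storage_class and copy, and each helper expects a boto Bucket object.

```python
# A minimal sketch of the three RRS operations. Each helper takes a boto
# Bucket object; the helper names themselves are hypothetical.

def upload_with_rrs(bucket, key_name, filename):
    """Create a new key whose contents are stored using the RRS storage class."""
    key = bucket.new_key(key_name)
    key.set_contents_from_filename(filename, reduced_redundancy=True)
    return key


def convert_to_rrs(bucket, key_name):
    """Switch an existing key from the standard storage class to RRS."""
    key = bucket.get_key(key_name)
    key.change_storage_class('REDUCED_REDUNDANCY')
    return key


def copy_as_rrs(bucket, src_key_name, dst_bucket_name, dst_key_name):
    """Copy an existing key, storing the new copy using RRS."""
    key = bucket.get_key(src_key_name)
    return key.copy(dst_bucket_name, dst_key_name, reduced_redundancy=True)


# Usage, assuming boto is installed and AWS credentials are configured
# (all names below are placeholders):
#
#   import boto
#   bucket = boto.connect_s3().get_bucket('mybucketname')
#   upload_with_rrs(bucket, 'mynewkeyname', 'mynewfile.txt')
#   convert_to_rrs(bucket, 'myexistingkey')
#   copy_as_rrs(bucket, 'myexistingkey', 'mybucketname', 'mycopiedkey')
```

Passing the Bucket object in, rather than connecting inside each helper, keeps the connection setup in one place.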
Friday, June 4, 2010
AWS By The Numbers
I recently gave a short talk about Amazon Web Services at GlueCon 2010. It was part of a panel discussion called "Major Platform Providers" and included similar short talks from others about Azure, Force.com and vCloud. It's very hard (i.e. impossible) to give a meaningful technical overview of AWS in 10 minutes so I struggled a bit trying to decide what to talk about. In the end, I decided to try to come up with some quantitative data to describe Amazon Web Services. My goal was to try to show that AWS is:
- A first mover - AWS introduced their first web services in 2005
- A broad offering - 13 services currently available
- Popular - details of how I measure that described below
- Prolific - the pace of innovation from AWS is impressive
After the conference, I was going to post my slides but I realized they didn't really work that well on their own so I decided instead to turn the slides into a blog post. That gives me the opportunity to explain the data and resulting graphs in more detail and also allows me to provide the graphs in a more interactive form.
Data? What data?
The first challenge in trying to do a data-heavy talk about AWS is actually finding some data. Most of the data that I would really like to have (e.g. # users, # requests, etc.) is not available. So, I needed to find some publicly available data that could provide some useful insight. Here's what I came up with:
- Forum data - I scraped the AWS developer forums and grabbed lots of useful info. I use things like forum views, number of messages and threads, etc. to act as a proxy for service popularity. It's not perfect by any means, but it's the best I could come up with.
- AWS press releases - I analyzed press releases from 2005 to the present day and used them to populate a spreadsheet of significant service and feature releases.
- API WSDLs - I parsed the WSDL for each of the services to gather data about API complexity.
Service Introduction and Popularity
This first graph uses data scraped from the forums. Each line in the graph represents one service and the Y axis is the total number of messages in that service's forum for the given month. The idea is that the volume of messages on a forum should have some relationship to the number of people using the service and, therefore, the popularity of the service. Following the timeline across also shows the date of introduction for each of the services.
Note: If you have trouble loading the following graph, try going directly to the Google Docs spreadsheet which I have shared.
The following graph shows another, simpler view of the forum data. This view plots the average number of views on the forum for each service normalized.
API Complexity
Another piece of publicly available data for AWS is the WSDL for each service. The WSDL is an XML document that describes the operations supported by the service and the data types used by the operations. The following graph shows the API Complexity (measured as the number of operations) for each of the services.
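To give a feel for the measurement, here is a minimal sketch of counting operations in a WSDL using only the Python standard library. This is not the script I used for the talk; the function name and the tiny sample WSDL are made up for illustration. It counts the distinct operation names under portType elements so that the same operations repeated in the binding section are not counted twice.

```python
import xml.etree.ElementTree as ET

# Namespace used by WSDL 1.1 documents.
WSDL_NS = 'http://schemas.xmlsoap.org/wsdl/'

# A made-up, heavily trimmed WSDL used only to exercise the function.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="ExampleService">
  <portType name="ExamplePortType">
    <operation name="CreateThing"/>
    <operation name="DeleteThing"/>
    <operation name="ListThings"/>
  </portType>
  <binding name="ExampleBinding" type="ExamplePortType">
    <operation name="CreateThing"/>
    <operation name="DeleteThing"/>
    <operation name="ListThings"/>
  </binding>
</definitions>"""


def count_operations(wsdl_text):
    """Return the number of distinct operations declared in portType elements."""
    root = ET.fromstring(wsdl_text)
    names = set()
    for port_type in root.iter('{%s}portType' % WSDL_NS):
        for operation in port_type.findall('{%s}operation' % WSDL_NS):
            names.add(operation.get('name'))
    return len(names)


operation_count = count_operations(SAMPLE_WSDL)
```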
Velocity
Finally, I wanted to try to measure the pace of innovation by AWS. To do this, I used the spreadsheet I created that tracked all significant service and feature announcements by AWS. I then counted the number of events per quarter for AWS and used that to compute an agile-style velocity.
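The per-quarter counting step can be sketched in a few lines of Python. The function name is hypothetical and the dates below are placeholders rather than real AWS announcement dates; the sketch only shows how events are bucketed into (year, quarter) pairs, not how the final velocity figure is derived from them.

```python
from collections import Counter
from datetime import date


def events_per_quarter(event_dates):
    """Count events per (year, quarter), where quarter is 1 through 4."""
    counts = Counter()
    for d in event_dates:
        quarter = (d.month - 1) // 3 + 1
        counts[(d.year, quarter)] += 1
    return dict(counts)


# Placeholder dates, for illustration only.
sample_dates = [date(2010, 1, 15), date(2010, 3, 31), date(2010, 4, 1)]
quarterly_counts = events_per_quarter(sample_dates)
```

Note that quarters with no events simply do not appear in the result, so anything that averages across quarters needs to fill in those zeros first.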
Summary
Hopefully these graphs are interesting and help to prove the points that I outlined at the beginning of the talk. I actually have a lot more data available from the forum scraping and may try to mine that in different ways later.
While this data was all about AWS, I think the bigger point is that the level of interest and innovation in Amazon's services is really just an indicator of a trend across the cloud computing market.
Sunday, May 23, 2010
Boto and Google Storage
You probably noticed, in the blitz of announcements from the recent I/O conference, that Google now has a storage service very similar to Amazon's S3 service. The Google Storage (GS) service provides a REST API that is compatible with many existing tools and libraries.
In addition to the API, Google also announced some tools to make it easier for people to get started using the Google Storage service. The main tool is called gsutil and it provides a command line interface to both Google Storage and S3. It allows you to reference files in GS or S3 or even on your file system using URL-style identifiers. You can then use these identifiers to copy content to/from the storage services and your local file system, between locations within a storage service or even between the services. Cool!
What was even cooler to me personally was that gsutil leverages boto for API-level communication with S3 and GS. In addition, Google engineers have extended boto with a higher-level abstraction of storage services that implements the URL-style identifiers. The command line tools are then built on top of this layer.
As an open source developer, it is very satisfying when other developers use your code to do something interesting and this is certainly no exception. In addition, I want to thank Mike Schwartz from Google for reaching out to me prior to the Google Storage session and giving me a heads up on what they were going to announce. Since that time Mike and I have been collaborating to try to figure out the best way to support the use of boto in the Google Storage utilities. For example, the storage abstraction layer developed by Google to extend boto is generally useful and could be extended to other storage services.
In summary, I view this as a very positive step in the boto project. I look forward to working with Google to make boto more useful for them and for the community of boto users. And as always, feedback from the boto community is not only welcome but essential.
Tuesday, April 20, 2010
Failure as a Feature
One need only peruse the EC2 forums a bit to realize that EC2 instances fail. Shock. Horror. Servers failing? What kind of crappy service is this, anyway? The truth, of course, is that all servers can and eventually will fail. EC2 instances, Rackspace CloudServers, GoGrid servers, Terremark virtual machines, even that trusty Sun box sitting in your colo. They all can fail and therefore they all will fail eventually.
What's wonderful and transformative about running your applications in public clouds like EC2 and CloudServers, etc. is not that the servers never fail but that when they do fail you can actually do something about it. Quickly. And programmatically. From an operations point of view, the killer feature of the cloud is the API. Using the APIs, I can not only detect that there is a problem with a server but I can actually correct it. As easily as I can start a server, I can stop one and replace it with a new one.
Now, to do this effectively I really need to think about my application and my deployment differently. When you have physical servers in a colo, failure of a server is, well, failure. It's something to be dreaded. Something that you worry about. Something that usually requires money and trips to the data center to fix.
But for apps deployed on the cloud, failure is a feature. Seriously. Knowing that any server can fail at any time and knowing that I can detect that and correct that programmatically actually allows me to design better apps. More reliable apps. More resilient and robust apps. Apps that are designed to keep running with nary a blip when an individual server goes belly up.
Trust me. Failure is a feature. Embrace it. If you don't understand that, you don't understand the cloud.
Monday, April 19, 2010
Subscribing an SQS queue to an SNS topic
The new Simple Notification Service from AWS offers a very simple and scalable publish/subscribe service for notifications. The basic idea behind SNS is simple. You can create a topic. Then, you can subscribe any number of subscribers to this topic. Finally, you can publish data to the topic and each subscriber will be notified about the new data that has been published.
Currently, the notification mechanism supports email, http(s) and SQS. The SQS support is attractive because it means you can subscribe an existing SQS queue to a topic in SNS and every time information is published to that topic, a new message will be posted to SQS. That allows you to easily persist the notifications so that they could be logged or further processed at a later time.
Subscribing via the email protocol is very straightforward. You just provide an email address and SNS will send an email message to the address each time information is published to the topic (actually there is a confirmation step that happens first, also via email). Subscribing via HTTP(s) is also easy: you just provide the URL you want SNS to use and then each time information is published to the topic, SNS will POST a JSON payload containing the new information to your URL.
Subscribing an SQS queue, however, is a bit trickier. First, you have to be able to construct the ARN (Amazon Resource Name) of the SQS queue. Secondly, after subscribing the queue you have to set the ACL policy of the queue to allow SNS to send messages to the queue.
To make it easier, I added a new convenience method in the boto SNS module called subscribe_sqs_queue. You pass it the ARN of the SNS topic and the boto Queue object representing the queue and it does all of the hard work for you. You would call the method like this:
>>> import boto
>>> sns = boto.connect_sns()
>>> sqs = boto.connect_sqs()
>>> queue = sqs.lookup('TestSNSNotification')
>>> resp = sns.create_topic('TestSQSTopic')
>>> print resp
{u'CreateTopicResponse': {u'CreateTopicResult': {u'TopicArn': u'arn:aws:sns:us-east-1:963068290131:TestSQSTopic'},
u'ResponseMetadata': {u'RequestId': u'1b0462af-4c24-11df-85e6-1f98aa81cd11'}}}
>>> sns.subscribe_sqs_queue('arn:aws:sns:us-east-1:963068290131:TestSQSTopic', queue)
That should be all you have to do to subscribe your SQS queue to an SNS topic. The basic operations performed are:
- Construct the ARN for the SQS queue. In our example the URL for the queue is https://queue.amazonaws.com/963068290131/TestSNSNotification but the ARN would be "arn:aws:sqs:us-east-1:963068290131:TestSNSNotification"
- Subscribe the SQS queue to the SNS topic
- Construct a JSON policy that grants permission to SNS to perform a SendMessage operation on the queue. See below for an example of the JSON policy.
- Associate the new policy with the SQS queue by calling the set_attribute method of the Queue object with an attribute name of "Policy" and the attribute value being the JSON policy.
The actual policy looks like this:
{"Version": "2008-10-17", "Statement": [{"Resource": "arn:aws:sqs:us-east-1:963068290131:TestSNSNotification", "Effect": "Allow", "Sid": "ad279892-1597-46f8-922c-eb2b545a14a8", "Action": "SQS:SendMessage", "Condition": {"StringLike": {"aws:SourceArn": "arn:aws:sns:us-east-1:963068290131:TestSQSTopic"}}, "Principal": {"AWS": "*"}}]}
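If you want to see the ARN and policy steps spelled out, here is a minimal sketch using only the standard library. It is not the code inside subscribe_sqs_queue; the two function names are hypothetical, and the region is passed in explicitly because the queue URL in this example does not include it.

```python
import json
import uuid


def queue_url_to_arn(queue_url, region):
    """Build an SQS ARN from a queue URL of the form .../<account id>/<queue name>."""
    account_id, queue_name = queue_url.rstrip('/').split('/')[-2:]
    return 'arn:aws:sqs:%s:%s:%s' % (region, account_id, queue_name)


def build_send_message_policy(queue_arn, topic_arn):
    """Build a policy that lets the given SNS topic send messages to the queue."""
    return {
        'Version': '2008-10-17',
        'Statement': [{
            'Sid': str(uuid.uuid4()),  # any unique statement id will do
            'Effect': 'Allow',
            'Principal': {'AWS': '*'},
            'Action': 'SQS:SendMessage',
            'Resource': queue_arn,
            'Condition': {'StringLike': {'aws:SourceArn': topic_arn}},
        }],
    }


queue_arn = queue_url_to_arn(
    'https://queue.amazonaws.com/963068290131/TestSNSNotification', 'us-east-1')
policy_json = json.dumps(build_send_message_policy(
    queue_arn, 'arn:aws:sns:us-east-1:963068290131:TestSQSTopic'))
```

The resulting policy_json string is what would be passed as the value of the "Policy" attribute in the set_attribute call described above.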
The new subscribe_sqs_queue method is available in the current SVN trunk. Check it out and let me know if you run into any problems or have any questions.
Thursday, February 25, 2010
Stupid Boto Tricks #2 - Reliable Counters in SimpleDB
As a follow-up to yesterday's article about the new consistency features in SimpleDB, I came up with a handy little class in Python to implement a reliable integer counter in SimpleDB. The Counter class makes use of the consistent reads and conditional puts now available in SimpleDB to create a very Pythonic object that acts like an integer object in many ways but also manages the synchronization with the "true" counter object stored in SimpleDB.
The source code can be found in my bitbucket.org repo. I have copied the doc string from the class below to give an example of how the class can be used. Comments, questions and criticisms welcome. As with all Stupid Boto Tricks, remember the code is hot off the presses. Use with appropriate skepticism.
A consistent integer counter implemented in SimpleDB using new
consistent read and conditional put features.
Usage
-----
To create the counter initially, you need to instantiate a Counter
object, passing in the name of the SimpleDB domain in which you wish
to store the counter, the name of the counter within the
domain and the initial value of the counter.
>>> import counter
>>> c = counter.Counter('mydomain', 'counter1', 0)
>>> print c
0
>>>
You can now increment and decrement the counter object using
the standard Python operators:
>>> c += 1
>>> print c
1
>>> c -= 1
>>> print c
0
These operations automatically update the value in SimpleDB
and also check for consistency. You can also use the Counter
object as an int in normal Python comparisons:
>>> c == 0
True
>>> c < 1
True
>>> c != 0
False
If you have multiple processes accessing the same counter
object it will be possible for the value held by your Python
object to become out of sync with the value in SimpleDB. If this
happens, it will be automatically detected by the Counter object.
A ValueError exception will be raised and the current state of
your Counter object will be updated to reflect the most recent
value stored in SimpleDB.
>>> c += 1
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
...
ValueError: Counter was out of sync
>>> print c
2
>>>
In addition to storing the value of the counter in SimpleDB, the
Counter also stores a timestamp of the last update in the form of
an ISO8601 string. You can access the timestamp using the
timestamp attribute of the Counter object:
>>> c.timestamp
'2010-02-25T13:49:15.561674'
>>>
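The docstring above shows how the Counter behaves but not how a conditional put makes that behavior possible. As a rough illustration only (this is not the code from the repo), the sketch below uses an in-memory stand-in for a SimpleDB domain. FakeDomain and SketchCounter are hypothetical names: the first models a consistent read and a conditional put, the second applies the "write only if the stored value is what I last saw" pattern.

```python
class FakeDomain(object):
    """In-memory stand-in for a SimpleDB domain (illustration only)."""

    def __init__(self):
        self._items = {}

    def get(self, name):
        # Models a consistent read: always returns the latest stored value.
        return self._items.get(name)

    def conditional_put(self, name, new_value, expected_value):
        # Models a conditional put: the write succeeds only if the stored
        # value still equals expected_value.
        if self._items.get(name) != expected_value:
            return False
        self._items[name] = new_value
        return True


class SketchCounter(object):
    """Counter that detects concurrent updates via conditional puts."""

    def __init__(self, domain, name, initial=0):
        self.domain = domain
        self.name = name
        if domain.get(name) is None:
            # Create the counter only if nobody else has created it yet.
            domain.conditional_put(name, initial, None)
        self.value = domain.get(name)

    def __iadd__(self, amount):
        new_value = self.value + amount
        if not self.domain.conditional_put(self.name, new_value, self.value):
            # Someone else changed the counter: resync, then report it.
            self.value = self.domain.get(self.name)
            raise ValueError('Counter was out of sync')
        self.value = new_value
        return self

    def __isub__(self, amount):
        return self.__iadd__(-amount)

    def __int__(self):
        return self.value
```

Two SketchCounter objects sharing one FakeDomain behave like the two processes in the docstring: the second writer gets a ValueError, picks up the latest value, and can then retry.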