Friday, October 30, 2009

Using RDS in Boto

Initial support for RDS has just been added to boto.  The code currently lives in the subversion trunk but a new boto release will be out very soon that will also include the new RDS module.  To get things started, I'll give a short tutorial on using RDS.

The first thing we need to do is create a connection to the RDS service.  This is done in the same way all other service connections are created in boto:


>>> import boto
>>> rds = boto.connect_rds()

Ultimately, we want to create a new DBInstance, basically an EC2 instance that has been pre-configured to run MySQL.  Before we can do that, we need to create a couple of things that are required when creating a new DBInstance.  First, we will need a DBSecurityGroup.  This is very similar to the SecurityGroup used in EC2 but it's considerably simpler because it is focused on only one type of application, MySQL.  Within a DBSecurityGroup I can authorize access either by a CIDR block or by specifying an existing EC2 SecurityGroup.  Since I'm going to be accessing my DBInstance from an EC2 instance, I'm just going to authorize the EC2 SecurityGroup that my instance is running in.  Let's assume it's the group "default":


>>> sg = rds.create_dbsecurity_group('group1', 'My first DB Security group') 
>>> ec2 = boto.connect_ec2()
>>> my_ec2_group = ec2.get_all_security_groups(['default'])[0]
>>> sg.authorize(ec2_group=my_ec2_group)
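If you go the CIDR route instead, it helps to be clear about which client addresses a given block actually covers.  Here's a minimal sketch using Python's standard ipaddress module (Python 3, and not part of boto).  The helper name cidr_allows and the addresses are made up purely for illustration:

```python
import ipaddress


def cidr_allows(cidr_block, client_ip):
    """Return True if client_ip falls inside cidr_block."""
    network = ipaddress.ip_network(cidr_block)
    return ipaddress.ip_address(client_ip) in network


# Hypothetical office network: 203.0.113.0/24 covers 203.0.113.0 - 203.0.113.255
print(cidr_allows('203.0.113.0/24', '203.0.113.42'))   # True
print(cidr_allows('203.0.113.0/24', '198.51.100.7'))   # False
```

A narrow block like this is usually a better choice than something wide open such as 0.0.0.0/0, which matches every IPv4 address.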

Now that we have a DBSecurityGroup, we need a DBParameterGroup.  The DBParameterGroup is what's used to manage all of the configuration settings you would normally have in your MySQL config file.  Because you don't have direct access to your DBInstance (unlike a normal EC2 instance) you need to use the DBParameterGroup to retrieve and modify the configuration settings for your DBInstance.  Let's create a new one:


>>> pg = rds.create_parameter_group('paramgrp1', description='My first param group.')

The ParameterGroup object in boto subclasses dict, so it behaves just like a normal mapping type.  Each key in the ParameterGroup is the name of a config entry and its value is a Parameter object.  Let's explore one of the Parameters in the ParameterGroup.  Because the set of parameters is quite large, RDS doesn't send all of the default parameter settings to you when you create a new ParameterGroup.  To fetch them from RDS, we need to call get_params:


>>> pg.get_params()
>>> pg.keys()
[u'default_week_format',
 u'lc_time_names',
 u'innodb_autoinc_lock_mode',
 u'collation_server',
<...>
 u'key_buffer_size',
 u'key_cache_block_size',
 u'log-bin']
>>> param = pg['max_allowed_packet']
>>> param.name
u'max_allowed_packet'
>>> param.type
u'integer'
>>> param.allowed_values
u'1024-1073741824'
>>> param.value = -5
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

ValueError: range is 1024-1073741824
>>> param.value = 2048
>>> param.apply()

Because the Parameters have information about the type of the data and allowable ranges, we can do a pretty good job of validating values before sending them back to RDS with the apply method.
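To make the idea concrete, here's a rough sketch of that kind of client-side check.  This is not boto's actual Parameter class; RangeParameter is a hypothetical stand-in that only handles integer parameters whose allowed_values is a single non-negative "low-high" range like the one shown above:

```python
class RangeParameter(object):
    """Hypothetical stand-in for an integer parameter with an allowed range."""

    def __init__(self, name, allowed_values):
        self.name = name
        self.allowed_values = allowed_values
        # '1024-1073741824' -> (1024, 1073741824); assumes non-negative bounds
        low, high = allowed_values.split('-')
        self._low, self._high = int(low), int(high)
        self._value = None

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        # Reject bad values locally, before any request would be sent.
        in_range = isinstance(new_value, int) and self._low <= new_value <= self._high
        if not in_range:
            raise ValueError('range is %s' % self.allowed_values)
        self._value = new_value


param = RangeParameter('max_allowed_packet', '1024-1073741824')
param.value = 2048          # accepted
try:
    param.value = -5        # rejected, as in the session above
except ValueError as err:
    print(err)              # range is 1024-1073741824
```

The point of doing the check in the setter is that a bad value fails right where you assign it, rather than later as an error response from the service.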

Now that we have a DBSecurityGroup and DBParameterGroup created, we can create our DBInstance.


>>> inst = rds.create_dbinstance(id='dbinst1', allocated_storage=10,
...                              instance_class='db.m1.small', master_username='mitch',
...                              master_password='topsecret', param_group='paramgrp1',
...                              security_group='group1')

At this point, RDS will start the process of bringing up a new MySQL instance based on my specifications.  There are lots of other parameters available to tweak.  In addition, you can do things like set the preferred maintenance window and when you would prefer to have snapshots run.  To check on the status of our instance, we can do the following:


>>> rs = rds.get_all_dbinstances()
>>> rs
[DBInstance:dbinst1]
>>> inst = rs[0]
>>> inst.status
u'available'
>>> inst.endpoint
(u'dbinst1.c07mrl4pthxk.us-east-1.rds.amazonaws.com', 3306)
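The instance won't be available the instant you ask for it, so in a script you'll probably want to poll.  Here's a minimal sketch of a polling helper.  The name wait_until_available, the timeout and interval defaults, and the 'creating' status in the demo are my own choices for illustration, not anything boto provides:

```python
import time


def wait_until_available(get_status, timeout=600, interval=15, sleep=time.sleep):
    """Poll get_status() until it returns 'available' or the timeout elapses.

    get_status is any zero-argument callable returning the current status string.
    Returns True if the instance became available, False on timeout.
    """
    waited = 0
    while True:
        if get_status() == 'available':
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


# Demo with canned statuses and a no-op sleep, so it runs without AWS access.
statuses = iter(['creating', 'creating', 'available'])
print(wait_until_available(lambda: next(statuses), sleep=lambda seconds: None))  # True
```

Against a real account with a single instance, the callable could be something like lambda: rds.get_all_dbinstances()[0].status, using the call shown above.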

So, at this point our new DBInstance is up and running and we have the endpoint and port number we need to connect to it.  One of the nice things about RDS is that once the instance is running, I can use RDS to perform a lot of the management tasks associated with the server.  I can do snapshots of the server at any time, or I can automate that process.  I can change any of the parameters associated with the server and decide whether I want those changes to take place immediately or to wait until the next maintenance window.  I can also use the modify_dbinstance method to tell RDS to increase the allocated storage on my server or even move my instance up to a larger instance class.

The current RDS code is checked in.  It's still beta quality but we will be releasing a 1.9 version of boto early next week which will include this code as well as support for VPC and a ton of bug fixes.  So, if you get a chance, give the boto RDS module a try and let us know what you think.

Tuesday, October 27, 2009

RDS: The End of SimpleDB?

The recent announcement of Amazon's Relational Database Service is generating a lot of buzz.  And well it should.  For people who require a relational database for their applications and have been rolling their own with EC2 and EBS, it offers a really nice option.  Let AWS manage that database for you and focus more attention on your app.  It also represents another inevitable step up the ladder from IaaS to PaaS for AWS and gives pretty good triangulation data about where cloud computing will be in a few years.

But does RDS also mean the end of SimpleDB?  There have already been posts on the SimpleDB forum to that effect.  I think the answer is "no" but it does illustrate what I think has been a misstep in the evolution of SimpleDB.

Let me start by saying that I love SimpleDB.  I use it all the time.  I have built a number of real applications and services with it and in my experience it "just works".  I know there are some applications that just require a full-blown relational database but in my experience I've been able to do everything I need to do with SimpleDB.  And I absolutely love the fact that it's just there as a service, doing whatever it needs to do to scale along with my app.

But it seems like SimpleDB has always been a bit of a red-headed stepchild at AWS.  They haven't had a clear, consistent strategy for it.  When people compared it to a relational database, rather than following the NoSQL philosophy they tried to make SimpleDB look more like a relational database.  They deprecated its elegant set-based query language in favor of a SQL subset in hopes of attracting the relational crowd.  But I think mainly what happens is that people focus on the "subset" aspect and are always pining for yet more SQL compatibility.  I just don't think it won them many converts.

So, does RDS represent the end of SimpleDB?  I really don't think so.  The two offerings are very, very different.  AWS needs to embrace that difference and communicate it more clearly.  There are a lot of applications out there that can benefit from the lightweight, super-scalable, and easy-to-use qualities of SimpleDB.  MySQL simply can't compete on those dimensions.  I'm pretty sure AWS agrees but it would be nice to see some positive reinforcement from them soon, before their user base gets scared.  I don't think that building RDS on the back of SimpleDB is what AWS had in mind.

Thursday, October 1, 2009

Managing Your AWS Credentials (Part 3)

The first part of this series described the various AWS credentials and the second part focused on some of the challenges in keeping those credentials secret. In this short update, I want to talk about a new feature available in AWS that will help you keep your credentials more secure.

The new Multi-Factor Authentication (MFA) provides an additional layer of protection around access to your AWS Console and AWS Portal. As you recall from Part 1, controlling access to these areas is vitally important because they in turn allow access to all of your other AWS resources and credentials.

To use MFA, you need to sign up for the service and buy an inexpensive security device such as the one shown below:


Once you have the device and are registered for MFA, when you attempt to log in to your AWS Portal or AWS Console, you will be asked for the email address associated with the account, the (hopefully very strong) password associated with the account and then finally you will be asked to enter the 6-digit number that appears on your device when you press the little grey button.

That means that even if someone discovers your password, they still need the device before they can log in. And, even if they have the device, they still need your password. Hence the name, Multi-Factor. The devices are inexpensive and the additional security they provide for your AWS credentials is well worth the cost. I highly recommend MFA, at least for your production credentials.

Another useful new security capability is Key Rotation. This is automatically enabled for all AWS accounts and allows you to create a new AccessKeyID and SecretAccessKey while the old ones remain active. That way, you can create new keys, begin transitioning your servers to the new credentials and, when the transition is complete, disable the old credentials. That's handy anytime you think you may have been compromised but it's also a good idea to do it as part of a regular security routine.
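The order of the steps matters, so here's a toy model of the rotation sequence.  To be clear, this is an in-memory sketch and not an AWS or boto API: the KeyRing class, its methods and the key IDs are all hypothetical, and the real work of moving servers to the new key isn't modeled:

```python
class KeyRing(object):
    """Toy model of an account that may hold more than one active key."""

    def __init__(self):
        self._keys = {}      # access key id -> 'active' or 'inactive'
        self._counter = 0

    def create_key(self):
        # Made-up key IDs; real ones are issued by AWS.
        self._counter += 1
        key_id = 'EXAMPLEKEY%d' % self._counter
        self._keys[key_id] = 'active'
        return key_id

    def deactivate(self, key_id):
        self._keys[key_id] = 'inactive'

    def active_keys(self):
        return sorted(k for k, state in self._keys.items() if state == 'active')


ring = KeyRing()
old_key = ring.create_key()
new_key = ring.create_key()      # step 1: create the new key; the old one still works
print(ring.active_keys())        # both keys are active during the transition
# step 2: switch each server over to new_key (not modeled here)
ring.deactivate(old_key)         # step 3: disable the old key once nothing uses it
print(ring.active_keys())        # only the new key remains active
```

The overlap between steps 1 and 3 is what lets you rotate without downtime: nothing breaks while servers are being switched over.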